  • Spring Data MongoDB: Managing a large data set

    Hi,

    I am trying to stream data from a very large Oracle database to MongoDB for archiving. I have succeeded in implementing this with Spring Data JPA (Hibernate) and MongoDB. I am, however, hitting a

    Code:
    java.lang.OutOfMemoryError: GC overhead limit exceeded
    after about 300,000 records. I have tried various strategies to clear the archived objects from memory, but the error is not going away.

    This is what I am currently doing:
    1. I use Akka to create different actors for mining data from the Oracle database.
    2. Each Akka actor is wired with a singleton DAO instance for the Oracle database, backed by a stateless Hibernate session (so there is no caching of objects).
    3. I clear every list after it is saved to MongoDB (a simplified sketch of this setup follows the list).
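
    Roughly, each actor's DAO does something like this (a simplified sketch; OracleRecord, the "archive" collection name, and the page size are placeholders, not my actual code):
    Code:
    import java.util.List;

    import org.hibernate.SessionFactory;
    import org.hibernate.StatelessSession;
    import org.springframework.data.mongodb.core.MongoTemplate;

    public class ArchiveDao {

        private static final int PAGE_SIZE = 1000;

        private final SessionFactory sessionFactory;
        private final MongoTemplate mongoTemplate;

        public ArchiveDao(SessionFactory sessionFactory, MongoTemplate mongoTemplate) {
            this.sessionFactory = sessionFactory;
            this.mongoTemplate = mongoTemplate;
        }

        // Copies one page of records from Oracle into the Mongo archive collection.
        public void archivePage(int page) {
            // Stateless session: no first-level cache, no dirty checking.
            StatelessSession session = sessionFactory.openStatelessSession();
            try {
                List<?> records = session.createQuery("from OracleRecord")
                        .setFirstResult(page * PAGE_SIZE)
                        .setMaxResults(PAGE_SIZE)
                        .list();
                mongoTemplate.insert(records, "archive");
                records.clear(); // drop references so the page can be garbage collected
            } finally {
                session.close();
            }
        }
    }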


    I have tried to increase the memory allocated to the JVM as well as adding the JVM option
    Code:
    -XX:-UseGCOverheadLimit
    but that has not helped.

    Question: What is the best strategy for handling large data sets with Spring Data and MongoDB? How do I prevent the caching of saved objects, which is currently hogging my memory? I'd prefer not to just increase the memory.

    Thanks in advance.

    Cheers,

  • #2
    Profiling memory

    After profiling the memory usage of my application, I realized that
    Code:
    oracle.jdbc.driver.CachedRowElement
    was taking up a huge chunk of memory. Any suggestions as to how to clear this cache?

    I have attached a screenshot of the profile.



    • #3
      I solved it. I set
      Code:
      <prop key="hibernate.jdbc.use_scrollable_resultset">false</prop>
      to stop the driver from caching the records from the DB.
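
      For anyone configuring the entity manager factory in Java rather than XML, roughly the same thing can be done by passing the property programmatically (a sketch, not my actual configuration; data source and persistence unit setup omitted):
      Code:
      import java.util.Properties;

      import org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean;

      public class JpaConfig {

          public LocalContainerEntityManagerFactoryBean entityManagerFactory() {
              LocalContainerEntityManagerFactoryBean emf = new LocalContainerEntityManagerFactoryBean();
              Properties jpaProperties = new Properties();
              // Disable scrollable result sets so the Oracle driver does not cache rows.
              jpaProperties.setProperty("hibernate.jdbc.use_scrollable_resultset", "false");
              emf.setJpaProperties(jpaProperties);
              return emf;
          }
      }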
