Out of memory issue (Memory leak)

  • Out of memory issue (Memory leak)

    Hi all, our use case is a simple batch repeat. My job configuration:
    Code:
    <batch:job id="job" job-repository="jobRepository">
         <batch:step id="step1" next="step2">
              <batch:tasklet ref="beforeCustomerProcessingTasklet" />
         </batch:step>
         <batch:step id="step2">
              <batch:tasklet>
                   <batch:chunk reader="customerReader"
                                      processor="customerProcessor" 
                                      writer="customerWriter"
                                      skip-limit="1000000"
                                      commit-interval="10000">
                        <batch:skippable-exception-classes>
                             org.springframework.dao.DataAccessException
                        </batch:skippable-exception-classes>
                   </batch:chunk>
              </batch:tasklet>
         </batch:step>
    </batch:job>
    
    <beans:bean id="customerReader" scope="step"  class="org.springframework.batch.item.database.JdbcCursorItemReader" >
         <beans:property name="fetchSize" value="10000" />
         <beans:property name="dataSource" ref="dataSource" />
         <beans:property name="sql">
               <beans:value>query.customers.select</beans:value>
         </beans:property>
         <beans:property name="rowMapper">
              <beans:bean class="es.tid.tesco.ur.dataprocessing.data.access.daos.jdbcSpring.impl.CustomerRowMapper" />
         </beans:property>
    </beans:bean>
    We need the following when org.springframework.dao.DataAccessException occurs:
    1. Do not stop job execution (some other process will handle the faulty queries, so the faulty queries need to be logged; see the listener sketch after this list)
    2. Throw the exception up (so the caller script knows there was an exception and no other processes are executed)
    3. Do not roll back the chunks that were executed correctly.
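
    For point 1, the logging could be done with a SkipListener. A minimal sketch, assuming Customer is our item type (real code would use a proper logger instead of System.err; the class name is hypothetical, and it would be registered on the tasklet with <batch:listeners><batch:listener ref="skipLoggingListener" /></batch:listeners>):
    Code:
    import org.springframework.batch.core.SkipListener;

    public class SkipLoggingListener implements SkipListener<Customer, Customer> {

         // the item could not be read, so only the cause is available
         public void onSkipInRead(Throwable t) {
              System.err.println("Skipped during read: " + t);
         }

         // the item was read but the processor failed on it
         public void onSkipInProcess(Customer item, Throwable t) {
              System.err.println("Skipped during process: " + item + " caused by " + t);
         }

         // the item failed while being written
         public void onSkipInWrite(Customer item, Throwable t) {
              System.err.println("Skipped during write: " + item + " caused by " + t);
         }
    }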

    As you can see in the configuration, we are fetching 10000 customers from the DB (that means in memory), and we are committing in chunks of 10000 customers too. For each customer we execute 6 queries (so we have a batchUpdate of 60000 queries).
    Having configured the heap size to a maximum of 1024M, we do not run into memory issues (OutOfMemoryError). But we are observing that the heap memory the process allocates increases proportionally to the number of customers we process:
    - 200000 customers consume about 250M
    - 750000 customers consume about 750M
    To us it seems like, between chunk executions, the garbage collector cannot free the memory allocated by previous customers. We are wondering whether Spring Batch (under our configuration) is keeping references to the already processed customers.
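
    One way to check that is a ChunkListener that logs heap usage after every commit. A rough diagnostic sketch, assuming the Spring Batch 2.x no-arg ChunkListener callbacks (the class name is made up; it would also be registered via <batch:listeners>):
    Code:
    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryMXBean;

    import org.springframework.batch.core.ChunkListener;

    public class HeapLoggingChunkListener implements ChunkListener {

         private final MemoryMXBean memory = ManagementFactory.getMemoryMXBean();

         public void beforeChunk() {
              // nothing to do before the chunk starts
         }

         public void afterChunk() {
              // suggest a collection first so retained memory stands out (only a hint to the JVM)
              memory.gc();
              long usedMb = memory.getHeapMemoryUsage().getUsed() / (1024 * 1024);
              // steady growth across chunks suggests references are being retained
              System.out.println("Heap used after chunk: " + usedMb + "M");
         }
    }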

    We go live in two weeks, so we will really appreciate any help/suggestions.

    Cheers,

    Zoraida.-

  • #2
    Hi all again,

    after using another tool for profiling (jmap), we could see that the number of Customer instances stays constant the whole time (around 10000 customers/chunk). The jmap dump showed us that a large number of byte arrays were allocated (250M for 700,000 customers). We observed that these byte arrays were allocated from the beginning, so we tried running our application with 3,000,000 customers in the database to see what happens. Now the Java heap space exception occurs at the beginning of the execution, and the number of allocated byte arrays has increased (800M for 3,000,000 customers). The Customer instances are not allocated by the rowMapper (because the exception occurs while opening the cursor), so we strongly believe it is a problem with reading records from the database.

    In my previous post I attached our JdbcCursorItemReader configuration. I have been reading about the properties of JdbcCursorItemReader, but I do not see any property that suggests a solution for this.

    On the other hand, I found this post: http://forums.mysql.com/read.php?39,...658#msg-221658 which suggests it could be a problem with the connector version we are using, so we tried different Connector/J versions, without success.

    Has anyone had a similar problem?

    Thanks.



    • #3
      Hello,

      I think we solved our problem. maxRows:

           Sets the limit for the maximum number of rows the underlying ResultSet can hold at any one time.

      is a property of the JDBC Statement (implemented by Connector/J) and is also exposed by the JdbcCursorItemReader class. I hope it helps someone.
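
      Setting it programmatically looks roughly like this (just a sketch: the raw type is used because older Spring Batch readers were not generified, the SQL is a placeholder, and CustomerRowMapper stands in for our row mapper):
      Code:
      import javax.sql.DataSource;
      import org.springframework.batch.item.database.JdbcCursorItemReader;

      public class ReaderConfig {

           public JdbcCursorItemReader customerReader(DataSource dataSource) {
                JdbcCursorItemReader reader = new JdbcCursorItemReader();
                reader.setDataSource(dataSource);
                reader.setSql("SELECT ..."); // the customer select
                reader.setRowMapper(new CustomerRowMapper());
                // cap how many rows the underlying ResultSet may hold at any one time
                reader.setMaxRows(10000);
                return reader;
           }
      }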



      • #4
        Hi again,
        sorry, I just wanted to clarify that in the end maxRows did not resolve our issue. We wanted to stream our customers, and we are using MySQL, so the final configuration for our reader was:

        Code:
        <beans:bean id="customerReader" scope="step"  class="org.springframework.batch.item.database.JdbcCursorItemReader" >
                <beans:property name="fetchSize" value="-2147483648" /> // Integer.MIN_VALUE(see Mysql documentation)
                <beans:property name="verifyCursorPosition" value="false"/> // we do not want to call ResultSet.getRow function 
             <beans:property name="dataSource" ref="dataSource" />
             <beans:property name="sql">
                   <beans:value>query.customers.select</beans:value>
             </beans:property>
             <beans:property name="rowMapper">
                  <beans:bean class="some.package.impl.CustomerRowMapper" />
             </beans:property>
        </beans:bean>
        Tip:
        http://dev.mysql.com/doc/refman/5.0/...ion-notes.html
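
        For the record, what this configuration maps to at the JDBC level is the streaming idiom from the Connector/J notes: a forward-only, read-only statement with fetch size Integer.MIN_VALUE. Roughly:
        Code:
        import java.sql.Connection;
        import java.sql.ResultSet;
        import java.sql.SQLException;
        import java.sql.Statement;

        public class StreamingQuery {

             // returns a ResultSet that streams rows one at a time instead of
             // buffering the whole result in memory (MySQL Connector/J behaviour)
             public static ResultSet stream(Connection conn, String sql) throws SQLException {
                  Statement stmt = conn.createStatement(ResultSet.TYPE_FORWARD_ONLY,
                                                        ResultSet.CONCUR_READ_ONLY);
                  stmt.setFetchSize(Integer.MIN_VALUE);
                  return stmt.executeQuery(sql);
             }
        }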

