GC-Problem/OutOfMemoryError with nested jobs

  • GC-Problem/OutOfMemoryError with nested jobs

    Hi all,

    I have two relatively complex jobs which are working fine so far. Now I have the task of merging them into one job.

    My first try was to define a job with two steps: the first step invokes the first job, the second step invokes the second job. Sadly, by doing this I get an OutOfMemoryError. It seems that the objects of a step with an ItemReader/ItemWriter are not garbage-collected. The step runs with commit-interval=10000.

    As mentioned, the step runs properly as a dedicated job, but as a nested job it fails.

    I have found an unpleasant workaround: instead of nesting the jobs in a further job, I copy all the steps into the new job. Then it runs properly.

    But why? What do I have to consider when nesting Spring Batch jobs?

    I would be very grateful for any help...

    Regards,
    miwo

  • #2
    I want to add some code:

    When I run the following job on its own, it works fine without any heap problems:

    Code:
    <batch:job id="dwhTgeResponseJob">
        ...
        <batch:step id="importTargetGroupFromFile">
            <batch:tasklet>
                <batch:chunk reader="multiResourceReaderForTargetGroup"
                    writer="tgeResponseDatabaseWriter" commit-interval="10000" />
                <batch:listeners>
                    <batch:listener ref="passOffersListener" />
                </batch:listeners>
            </batch:tasklet>
            <batch:next on="FAILED" to="releaseTargetGroupResponseLockFailed" />
            <batch:next on="*" to="activateOffers" />
        </batch:step>
        ....
    If I define a further job which contains a tasklet that invokes the job above, I get an OutOfMemoryError in the step shown above, because the objects used in that step are never removed from the heap. I could see that in the generated heap dump.

    Here's the outer job, which nests the job above via a tasklet:

    Code:
    <batch:job id="offerActivationJob" restartable="false">
    <batch:step id="dwhTgeResponseStep">
    	<batch:tasklet ref="runDwhResponseTasklet" />
    	<batch:end on="FAILED"/>
    	<batch:next on="*" to="keyDbSendOfferStep"/>
    </batch:step>
    <batch:step id="keyDbSendOfferStep">
    	<batch:tasklet ref="runKeyDbRequestTasklet" />
    </batch:step>
    ...
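
    For context, runDwhResponseTasklet just launches the inner job from its execute method; the wiring is roughly like the sketch below (simplified, the class name and property names are placeholders, not the exact code):

    Code:
    <!-- hypothetical wiring of the tasklet that runs the inner job -->
    <bean id="runDwhResponseTasklet" class="com.example.RunJobTasklet">
        <property name="jobLauncher" ref="jobLauncher" />
        <property name="job" ref="dwhTgeResponseJob" />
    </bean>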
    Does anyone have an idea why this heap problem occurs?

    Regards,
    miwo

    • #3
      Without some more information I think it is impossible to say what is going on here. Does your nested job always fail, or only after several executions? What holds the references that are not garbage collected? What are they?

      A commit interval of 10000 is excessive and probably isn't helping, but I don't immediately see the link.

      • #4
        Do a memory profile. Where is your memory going?

        Try changing commit size to 200. Does that help?
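
        That is, in the chunk element of your step, something like:

        Code:
        <!-- same chunk as above, with a smaller commit interval -->
        <batch:chunk reader="multiResourceReaderForTargetGroup"
            writer="tgeResponseDatabaseWriter" commit-interval="200" />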

        What are the JVM memory params on the server? What is the server's physical memory? Can you give it more?

        • #5
          Hi,

          The Xmx was set to 1 GB and the physical memory is 16 GB.

          Without the nesting and with the 10000 commit interval, it runs fine with Xmx set to 256M.
          With the nested job, it runs and runs, and once the 1 GB is full I get an OutOfMemoryError. In the heap dump I see more than 300,000 objects which were read by the ItemReader. The commit interval stays the same (10000) and I am not keeping a reference to the objects anywhere, so they should be garbage-collected. Sadly, they aren't.

          Resizing the commit-interval didn't help.

          I think the outer job is holding references to the objects, but I don't see any reason why it would do that. Maybe due to the transaction handling?

          Regards,
          Michael

          • #6
            Are you passing actual Java objects between jobs?

            • #7
              I think I see the problem: you are using a Tasklet to run the inner job and not changing any TX settings, so the system is trying to run your whole job in the same transaction. This cannot possibly work (especially for large datasets).

              You can use a JobStep (or <step job=.../>) or you can use your tasklet and change the TX propagation to NOT_SUPPORTED.
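
              In XML that would look roughly like this (a sketch against the step and bean names from your config; it assumes a jobLauncher bean is available in the context):

              Code:
              <!-- Option 1: run the inner job as a JobStep -->
              <batch:step id="dwhTgeResponseStep">
                  <batch:job ref="dwhTgeResponseJob" job-launcher="jobLauncher" />
                  <batch:end on="FAILED"/>
                  <batch:next on="*" to="keyDbSendOfferStep"/>
              </batch:step>

              <!-- Option 2: keep the tasklet, but run the inner job outside the step's transaction -->
              <batch:step id="dwhTgeResponseStep">
                  <batch:tasklet ref="runDwhResponseTasklet">
                      <batch:transaction-attributes propagation="NOT_SUPPORTED"/>
                  </batch:tasklet>
                  <batch:end on="FAILED"/>
                  <batch:next on="*" to="keyDbSendOfferStep"/>
              </batch:step>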
