Bug when using <split> with SimpleAsyncTaskExecutor?

  • Bug when using <split> with SimpleAsyncTaskExecutor?

    Hi

    I'm new to Spring Batch and I'm using version 2.1.0-RC1. I played around with "<split>" and noticed some strange behaviour.

    I have the following definitions in XML.
    Code:
    <beans xmlns="http://www.springframework.org/schema/beans"
    	xmlns:aop="http://www.springframework.org/schema/aop" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    	xmlns:util="http://www.springframework.org/schema/util" xmlns:batch="http://www.springframework.org/schema/batch"
    	xsi:schemaLocation="http://www.springframework.org/schema/aop http://www.springframework.org/schema/aop/spring-aop-2.0.xsd
    		http://www.springframework.org/schema/batch http://www.springframework.org/schema/batch/spring-batch-2.1.xsd
    		http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-2.0.xsd
    		http://www.springframework.org/schema/util http://www.springframework.org/schema/util/spring-util.xsd">
    
    	<batch:job id="splitTrial" job-repository="jobRepository">
    		<batch:split id="loadData" task-executor="splitTaskExecutor">  
    			<batch:flow>
    				<batch:step id="loadOfferte">
    					<batch:tasklet ref="loader1">
    					</batch:tasklet>
    				</batch:step>
    			</batch:flow>
    			<batch:flow>
    				<batch:step id="loadSender">
    					<batch:tasklet ref="loader1">
    					</batch:tasklet>
    				</batch:step>
    			</batch:flow>
    			<batch:flow>
    				<batch:step id="loadReceiver">
    					<batch:tasklet ref="loader1">
    					</batch:tasklet>
    				</batch:step>
    			</batch:flow>
    		</batch:split>
    	</batch:job>
    
    
    	<bean id="loader1" class="ch.mobi.nddb.batch.loader.Loader1" />
    	<bean id="splitTaskExecutor" class="org.springframework.core.task.SimpleAsyncTaskExecutor" />
    
    
    	<bean id="splitTrialJob" class="org.springframework.batch.core.job.SimpleJob">
    		<property name="name" value="splitTrial" />
    		<property name="jobRepository" ref="jobRepository" />
    	</bean>
    
    	<bean id="transactionManager" class="org.springframework.batch.support.transaction.ResourcelessTransactionManager" />
    
    	<bean id="jobRepository"
    		class="org.springframework.batch.core.repository.support.SimpleJobRepository">
    		<constructor-arg>
    			<bean
    				class="org.springframework.batch.core.repository.dao.MapJobInstanceDao" />
    		</constructor-arg>
    		<constructor-arg>
    			<bean
    				class="org.springframework.batch.core.repository.dao.MapJobExecutionDao" />
    		</constructor-arg>
    		<constructor-arg>
    			<bean
    				class="org.springframework.batch.core.repository.dao.MapStepExecutionDao" />
    		</constructor-arg>
    		<constructor-arg>
    			<bean
    				class="org.springframework.batch.core.repository.dao.MapExecutionContextDao" />
    		</constructor-arg>
    	</bean>
    
    	<bean id="jobLauncher"
    		class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
    		<property name="jobRepository" ref="jobRepository" />
    	</bean>
    </beans>
    The implementation for Loader1 looks like
    Code:
    import org.apache.log4j.Logger;

    import org.springframework.batch.core.StepContribution;
    import org.springframework.batch.core.scope.context.ChunkContext;
    import org.springframework.batch.core.step.tasklet.Tasklet;
    import org.springframework.batch.repeat.RepeatStatus;

    public class Loader1 implements Tasklet {
      private static final Logger logger = Logger.getLogger(Loader1.class);

      @Override
      public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
        if (logger.isDebugEnabled()) {
          logger.debug("entering execute");
        }
        Thread.sleep(2000);
        // returning null would also be treated as FINISHED, but being explicit is clearer
        return RepeatStatus.FINISHED;
      }
    }
    I use org.springframework.batch.core.launch.support.CommandLineJobRunner with the arguments "splitTrial.xml" "splitTrial" to start.

    I noticed the following:
    At least for one thread of the AsyncTaskExecutor, the error
    Code:
    12:05:30,306 ERROR [AbstractStep] Encountered an error saving batch meta data.This job is now in an unknown state and should not be restarted.
    is logged. This error is caused by an
    Code:
    org.springframework.dao.OptimisticLockingFailureException: Attempt to update step execution id=1 with wrong version (2), where current version is 1
    which is thrown when calling getJobRepository().update(stepExecution); at line 245 in AbstractStep.

    There is no problem if I use the SyncTaskExecutor.
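For what it's worth, the failing version check can be boiled down to the following JDK-only sketch. All class, field and method names here are invented for illustration; this is not the actual Spring Batch code, just the general optimistic-locking pattern the exception message describes:

```java
// Simplified sketch of an optimistic-locking version check.
// Names are invented; this is not Spring Batch's MapStepExecutionDao.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class VersionCheckSketch {

    static class Record {
        final long id;
        int version;
        Record(long id, int version) { this.id = id; this.version = version; }
    }

    // the "persisted" store, shared by all threads
    static final Map<Long, Record> store = new ConcurrentHashMap<>();

    /** Returns true if the update succeeded, false on a version conflict. */
    static boolean update(Record inMemory) {
        Record persisted = store.get(inMemory.id);
        synchronized (persisted) {
            if (persisted.version != inMemory.version) {
                // this is what surfaces as OptimisticLockingFailureException
                return false;
            }
            inMemory.version++;
            persisted.version = inMemory.version;
            return true;
        }
    }

    public static void main(String[] args) {
        store.put(1L, new Record(1L, 1));
        Record freshCopy = new Record(1L, 1);
        Record staleCopy = new Record(1L, 1);   // a second copy of the same record

        System.out.println(update(freshCopy));  // succeeds, version becomes 2
        System.out.println(update(staleCopy));  // fails: holds version 1, current is 2
    }
}
```

Whichever holder of a stale version tries to update second is rejected, which matches the "wrong version (2), where current version is 1" style of message above.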

    Does anybody have an explanation for this?

    Thanks.

    Hansjoerg

  • #2
    Further investigation

    Hi

    I investigated the behaviour a little bit more in depth. Inside MapStepExecutionDao.updateStepExecution, I added some log statements:
    Code:
      public void updateStepExecution(StepExecution stepExecution) {
        Assert.notNull(stepExecution.getJobExecutionId());
        Map<Long, StepExecution> executions = executionsByJobExecutionId.get(stepExecution.getJobExecutionId());
        Assert.notNull(executions, "step executions for given job execution are expected to be already saved");
    
        StepExecution persistedExecution = executionsByStepExecutionId.get(stepExecution.getId());
        Assert.notNull(persistedExecution, "step execution is expected to be already saved");
    
        synchronized (stepExecution) {
          printStepExecution("update - per  ", persistedExecution);
          if (!persistedExecution.getVersion().equals(stepExecution.getVersion())) {
    
            throw new OptimisticLockingFailureException("Attempt to update step execution id=" + stepExecution.getId()
                + " with wrong version (" + stepExecution.getVersion() + "), where current version is "
                + persistedExecution.getVersion());
          }
    
          stepExecution.incrementVersion();
          printStepExecution("update - new  ", stepExecution);
          StepExecution copy = copy(stepExecution);
          executions.put(stepExecution.getId(), copy);
          executionsByStepExecutionId.put(stepExecution.getId(), copy);
          printStepExecution("update - copy ", executionsByStepExecutionId.get(stepExecution.getId()));
        }
      }
    Note the printStepExecution statements:
    update - per prints the stepExecution content as it was returned by executionsByStepExecutionId.get.
    update - new prints the updated stepExecution (with new version nr)
    update - copy prints the updated content in executionsByStepExecutionId.get

    When I run this, the following log is created:
    Code:
    ...
    07:25:08,688 DEBUG (SimpleAsyncTaskExecutor-3) [Loader1] entering execute
    07:25:08,688 DEBUG (SimpleAsyncTaskExecutor-1) [Loader1] entering execute
    07:25:08,688 DEBUG (SimpleAsyncTaskExecutor-2) [Loader1] entering execute
    07:25:10,685 DEBUG (SimpleAsyncTaskExecutor-3) [MapStepExecutionDao] update - per  [id: 2, vers: 1, name: loadReceiver, hash: 13878947]
    07:25:10,685 DEBUG (SimpleAsyncTaskExecutor-3) [MapStepExecutionDao] update - new  [id: 2, vers: 2, name: loadReceiver, hash: 20092482]
    07:25:10,685 DEBUG (SimpleAsyncTaskExecutor-3) [MapStepExecutionDao] update - copy [id: 2, vers: 2, name: loadReceiver, hash: 15594377]
    ...
    07:25:10,716 DEBUG (SimpleAsyncTaskExecutor-3) [MapStepExecutionDao] update - per  [id: 2, vers: 1, name: loadReceiver, hash: 13878947]
    ...
    07:25:10,716 ERROR (SimpleAsyncTaskExecutor-3) [AbstractStep] Encountered an error saving batch meta data.This job is now in an unknown state and should not be restarted.
    org.springframework.dao.OptimisticLockingFailureException: Attempt to update step execution id=2 with wrong version (2), where current version is 1
    	at ch.mobi.spring.batch.patch.MapStepExecutionDaoOrig.updateStepExecution(MapStepExecutionDaoOrig.java:87)
    	at ch.mobi.spring.batch.patch.SimpleJobRepository.update(SimpleJobRepository.java:159)
    	at org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:236)
    	at org.springframework.batch.core.job.SimpleStepHandler.handleStep(SimpleStepHandler.java:115)
    	at org.springframework.batch.core.job.flow.JobFlowExecutor.executeStep(JobFlowExecutor.java:61)
    	at org.springframework.batch.core.job.flow.support.state.StepState.handle(StepState.java:60)
    	at org.springframework.batch.core.job.flow.support.SimpleFlow.resume(SimpleFlow.java:144)
    	at org.springframework.batch.core.job.flow.support.SimpleFlow.start(SimpleFlow.java:124)
    	at org.springframework.batch.core.job.flow.support.state.SplitState$1.call(SplitState.java:83)
    	at org.springframework.batch.core.job.flow.support.state.SplitState$1.call(SplitState.java:81)
    	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    	at java.lang.Thread.run(Thread.java:619)
    The strange thing here is that the second "update - per" line (at 07:25:10,716) shows the same content as the first one (at 07:25:10,685), even though the "update - copy" line clearly shows that the content was updated.

    I guess the problem somehow arises from the fact that executionsByStepExecutionId is created with TransactionAwareProxyFactory.createTransactionalMap(). It seems to me that between the "update - copy" line and the second "update - per" line the content was somehow rolled back, or was never really committed. However, I'm in the same thread here...
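That intuition can be modelled with a toy, JDK-only map wrapper in which writes go to a per-thread buffer and only reach the shared state on commit. All names are invented; this is not the real TransactionAwareProxyFactory code, just a sketch of how an uncommitted or rolled-back write could make a later read see the old version again:

```java
// Toy model of a "transaction-aware" map: writes are buffered per thread
// and only merged into the shared map on commit(). Invented names; not
// Spring Batch code.
import java.util.HashMap;
import java.util.Map;

public class TxAwareMapSketch<K, V> {
    private final Map<K, V> shared = new HashMap<>();
    private final ThreadLocal<Map<K, V>> buffer = ThreadLocal.withInitial(HashMap::new);

    public void put(K key, V value) { buffer.get().put(key, value); }

    /** Reads see this thread's buffered writes first, then the shared state. */
    public V get(K key) {
        Map<K, V> buf = buffer.get();
        return buf.containsKey(key) ? buf.get(key) : shared.get(key);
    }

    public void commit()   { shared.putAll(buffer.get()); buffer.get().clear(); }
    public void rollback() { buffer.get().clear(); }

    public static void main(String[] args) {
        TxAwareMapSketch<Long, Integer> versions = new TxAwareMapSketch<>();
        versions.put(1L, 1);
        versions.commit();                      // version 1 is "persisted"

        versions.put(1L, 2);                    // an update inside a transaction
        System.out.println(versions.get(1L));   // 2 - the buffered write is visible
        versions.rollback();                    // the transaction never commits
        System.out.println(versions.get(1L));   // 1 - shared state was never updated
    }
}
```

If the real proxy behaves anything like this and no transaction ever commits the write, the stale version seen by the second "update - per" read would be exactly what one would expect.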

    Could it be that using SimpleAsyncTaskExecutor together with a ResourcelessTransactionManager and a SimpleJobRepository configured with MapStepExecutionDao simply does not work?

    Thanks for any response.

    Hansjoerg



    • #3
      The Map-based repository is not thread safe. There is a MapJobRepositoryFactoryBean with Javadocs that describe the limitations.
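      For anyone following along: that factory bean replaces the hand-wired SimpleJobRepository from the first post. A sketch of what the configuration could look like (assuming the transactionManager bean from the original XML; check the Javadocs of MapJobRepositoryFactoryBean for the exact properties in your version):

```xml
<bean id="jobRepository"
    class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean">
    <property name="transactionManager" ref="transactionManager" />
</bean>
```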



      • #4
        Reference documentation should mention this limitation

        Hello,

        After reading the Spring Batch reference documentation (v2.1.0), I imagined it was possible to use MapJobRepositoryFactoryBean in a "real world" batch job.

        Chapter 4.2.3, "In-Memory Repository", indicates that this kind of JobRepository is good for performance, or for when you don't need to persist batch metadata.

        It would be helpful to update the reference documentation to mention that this repository does not support multi-threaded usage, and to warn that it should be used only for prototyping.

        Thanks.



        • #5
          Is there no thread-safe in-memory repository? We are still in the prototype phase and are hitting this error with a Map repo as well. We don't want to persist everything to the database, as it drastically slows down our job. In most cases we would rather handle restartability ourselves through the use of transactions and such.



          • #6
            HSQLDB, Derby and H2 all offer in-memory, thread safe data stores.
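            For example, an in-memory H2 datastore for the job repository might be wired up like this (a sketch: the bean id, database name and credentials are placeholders, the H2 jar must be on the classpath, and the Spring Batch metadata schema scripts still need to be run against it):

```xml
<!-- in-memory H2 database; DB_CLOSE_DELAY=-1 keeps it alive between connections -->
<bean id="dataSource" class="org.springframework.jdbc.datasource.DriverManagerDataSource">
    <property name="driverClassName" value="org.h2.Driver" />
    <property name="url" value="jdbc:h2:mem:batchdb;DB_CLOSE_DELAY=-1" />
    <property name="username" value="sa" />
    <property name="password" value="" />
</bean>
```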



            • #7
              We already have a connection to Oracle; it seems silly to set up a second in-memory datastore just for Spring Batch to store its few KB of data... But every time I try to use the normal JDBC datastore, the performance of Spring Batch dies.

              Unless I am missing some simple plug-in way to get an in-memory datastore for just the Spring Batch job metadata...

              Are there plans to make a Map datastore that is stable enough to use in a production environment?
              Last edited by bwawok; Oct 24th, 2010, 01:04 PM. Reason: added more



              • #8
                There should be no performance problems related to use of an external RDBMS for the JobRepository as long as you set the commit interval sensibly - the overhead of Batch using the external store is pretty minimal if it is only using it for occasional savepoints.
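                For reference, the commit interval is set on the chunk element of a chunk-oriented step, so the repository is only hit once per chunk rather than once per item. A sketch (the reader and writer bean names here are placeholders):

```xml
<batch:step id="loadItems">
    <batch:tasklet>
        <!-- one metadata savepoint per 1000 items instead of per item -->
        <batch:chunk reader="itemReader" writer="itemWriter" commit-interval="1000" />
    </batch:tasklet>
</batch:step>
```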

                Really there is no point in us spending a lot of effort on a thread-safe in-memory store when there are already in-memory RDBMS implementations available. And in any case, you have to worry about distributed transactions as soon as you have multiple stores, so I'd recommend using your external server in all cases unless it really hurts (and I don't see why it would - it works for everyone else).

                Having said that, the MapJobRepository is a lot more thread safe now than it was in 2.0, but I still wouldn't recommend using it in production.
                Last edited by Dave Syer; Oct 25th, 2010, 08:01 PM. Reason: spelling

