
  • DeadlockLoserDataAccessException

    Below is my job configuration. The job repository uses the default isolation level, and the database is Sybase.

    Code:
    <job id="myJob">
    <split id="prodSplit" task-executor="taskExecutor" next="loadPosnDlyStg">
        <flow>
    		<step id="loadOtcProd" parent="prodOtcStgLoad" next="pollOtcStg"/>
    		<step id="pollOtcStg" parent="prodOtcStgPoller" next="distOtcProd"/>
    		<step id="distOtcProd" parent="prodOtcDist"/>
        </flow>
        <flow>
    		<step id="loadListProd" parent="prodListStgLoad" next="distListProd"/>
    		<step id="distListProd" parent="prodListDist"/>
        </flow>
    </split>
    <step id="loadPosnDlyStg" parent="posnDlyStgLoad" next="loadPosnDly"/>
    <step id="loadPosnDly" parent="posnDlyLoad"/>
    <listeners>
    	<listener ref="jobListener"/>
    </listeners>
    </job>
    Quite frequently I get the exception below. It looks like two threads are updating at the same time at some point, which is the cause of the problem.

    Code:
    org.springframework.dao.DeadlockLoserDataAccessException: PreparedStatementCallback; SQL [UPDATE BATCH_STEP_EXECUTION set START_TIME = ?, END_TIME = ?, STATUS = ?, COMMIT_COUNT = ?, READ_COUNT = ?, FILTER_COUNT = ?, WRITE_COUNT = ?, EXIT_CODE = ?, EXIT_MESSAGE = ?, VERSION = ?, READ_SKIP_COUNT = ?, PROCESS_SKIP_COUNT = ?, WRITE_SKIP_COUNT = ?, ROLLBACK_COUNT = ?, LAST_UPDATED = ? where STEP_EXECUTION_ID = ? and VERSION = ?]; Your server command (family id #0, process id #489) encountered a deadlock situation. Please re-run your command.
    ; nested exception is com.sybase.jdbc3.jdbc.SybSQLException: Your server command (family id #0, process id #489) encountered a deadlock situation. Please re-run your command.
    
    	at org.springframework.jdbc.support.SQLErrorCodeSQLExceptionTranslator.doTranslate(SQLErrorCodeSQLExceptionTranslator.java:265)
    	at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:72)
    	at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:602)
    	at org.springframework.jdbc.core.JdbcTemplate.update(JdbcTemplate.java:786)
    	at org.springframework.jdbc.core.JdbcTemplate.update(JdbcTemplate.java:842)
    	at org.springframework.jdbc.core.JdbcTemplate.update(JdbcTemplate.java:846)
    	at org.springframework.batch.core.repository.dao.JdbcStepExecutionDao.updateStepExecution(JdbcStepExecutionDao.java:172)
    	at org.springframework.batch.core.repository.support.SimpleJobRepository.update(SimpleJobRepository.java:167)
    	at sun.reflect.GeneratedMethodAccessor147.invoke(Unknown Source)
    	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    	at java.lang.reflect.Method.invoke(Method.java:597)
    ..............

  • #2
    I never saw that, but I don't use Sybase as a rule, and that platform is notorious for unnecessary deadlocks. I suppose you could wrap the JobRepository in a RetryOperationsInterceptor. It might be better to try to understand why it is deadlocked and tune the database (e.g. rebuild, add, or remove indexes; ask a Sybase guru).



    • #3
      Dave,

      We did everything possible on the DB side; it happens with row-level locking too.
      At any point in time, several jobs access the same repository.

      Since we can configure retries in the application, is there a way to handle this and configure retry intervals around the job repository?

      Thanks.



      • #4
        Yes; "wrap the JobRepository in a RetryOperationsInterceptor".



        • #5
          Thanks dave,

          I made the following change and now I get an UnexpectedRollbackException. Maybe the configuration is wrong; it looks like it's rolling back the transaction instead of retrying.

          Code:
          <b:bean id="jobRepository"
          		class="org.springframework.batch.core.repository.support.JobRepositoryFactoryBean"
          		p:dataSource-ref="pmdbDataSource" p:transactionManager-ref="transactionManager" />
          	
          	<aop:config>
          		<aop:pointcut id="repositoryPointcut" expression="execution(* org.springframework.batch.core..*Repository+.*(..))"/>
              	<aop:advisor pointcut-ref="repositoryPointcut" advice-ref="txAdvice" order="2"/>
              	<aop:advisor pointcut-ref="repositoryPointcut" advice-ref="retryAdvice" order="1"/>
           	</aop:config>
          
          	<tx:advice id="txAdvice" transaction-manager="transactionManager">
              	<tx:attributes>
                  	<tx:method name="*" />
              	</tx:attributes>
          	</tx:advice>
          	
          	<b:bean id="retryAdvice" class="org.springframework.batch.retry.interceptor.RetryOperationsInterceptor"/>
          the exception is

          Code:
          org.springframework.transaction.UnexpectedRollbackException: Transaction rolled back because it has been marked as rollback-only
                  at org.springframework.transaction.support.AbstractPlatformTransactionManager.commit(AbstractPlatformTransactionManager.java:717)
                  at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:147)
                  at org.springframework.batch.core.step.tasklet.TaskletStep$2.doInChunkContext(TaskletStep.java:261)
                  at org.springframework.batch.core.scope.context.StepContextRepeatCallback.doInIteration(StepContextRepeatCallback.java:76)
                  at org.springframework.batch.repeat.support.RepeatTemplate.getNextResult(RepeatTemplate.java:367)
                  at org.springframework.batch.repeat.support.RepeatTemplate.executeInternal(RepeatTemplate.java:214)
                  at org.springframework.batch.repeat.support.RepeatTemplate.iterate(RepeatTemplate.java:143)
                  at
          Last edited by Nitty; Jun 17th, 2010, 04:05 PM. Reason: missed out exception msg



          • #6
            You don't need the txAdvice because a JobRepository has its own reference to the transaction manager. Not sure why the exception would not lead to a retry. Maybe you could try without txAdvice and post a full stacktrace if it doesn't work?
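
            A minimal sketch of what the retry-only configuration might look like, reusing the bean names and the `b:` and `aop:` prefixes from post #5. It assumes nothing else in the context depends on `txAdvice`, and it leaves the interceptor on its default retry settings.

            ```xml
            <!-- Sketch only: same pointcut as in post #5, with the txAdvice advisor removed. -->
            <aop:config>
                <aop:pointcut id="repositoryPointcut"
                    expression="execution(* org.springframework.batch.core..*Repository+.*(..))"/>
                <!-- Retry is now the only advice added here; the proxy created by
                     JobRepositoryFactoryBean still applies its own transaction handling. -->
                <aop:advisor pointcut-ref="repositoryPointcut" advice-ref="retryAdvice" order="1"/>
            </aop:config>

            <b:bean id="retryAdvice"
                class="org.springframework.batch.retry.interceptor.RetryOperationsInterceptor"/>
            ```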



            • #7
              I removed txAdvice, and here's the full stack trace (each time it happens on a different control table).

              Code:
              08:42:51,363 ERROR main BatchController:129 - stack trace of the error ..org.springframework.dao.DataAccessResourceFailureException: Could not increment identity; nested exception is com.sybase.jdbc3.jdbc.SybSQLException: Your server command (family id #0, process id #1064) encountered a deadlock situation. Please re-run your command.
              
                      at org.springframework.jdbc.support.incrementer.SybaseMaxValueIncrementer.getNextKey(SybaseMaxValueIncrementer.java:108)
                      at org.springframework.jdbc.support.incrementer.AbstractDataFieldMaxValueIncrementer.nextLongValue(AbstractDataFieldMaxValueIncrementer.java:125)
                      at org.springframework.batch.core.repository.dao.JdbcJobInstanceDao.createJobInstance(JdbcJobInstanceDao.java:110)
                      at org.springframework.batch.core.repository.support.SimpleJobRepository.createJobExecution(SimpleJobRepository.java:127)
                      at sun.reflect.GeneratedMethodAccessor630.invoke(Unknown Source)
                      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
                      at java.lang.reflect.Method.invoke(Method.java:597)
                      at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:307)
                      at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
                      at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
                      at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:107)
                      at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
                      at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:202)
                      at $Proxy0.createJobExecution(Unknown Source)
                      at sun.reflect.GeneratedMethodAccessor630.invoke(Unknown Source)
                      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
                      at java.lang.reflect.Method.invoke(Method.java:597)
                      at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:307)
                      at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
                      at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
                      at org.springframework.batch.retry.interceptor.RetryOperationsInterceptor$1.doWithRetry(RetryOperationsInterceptor.java:68)
                      at org.springframework.batch.retry.support.RetryTemplate.doExecute(RetryTemplate.java:238)
                      at org.springframework.batch.retry.support.RetryTemplate.execute(RetryTemplate.java:147)
                      at org.springframework.batch.retry.interceptor.RetryOperationsInterceptor.invoke(RetryOperationsInterceptor.java:55)
                      at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
                      at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:89)
                      at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
                      at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:202)
                      at $Proxy1.createJobExecution(Unknown Source)
                      at org.springframework.batch.core.launch.support.SimpleJobLauncher.run(SimpleJobLauncher.java:111)
                      at com.gs.fa.controller.LoadLauncher.launch(LoadLauncher.java:43)
                      at com.gs.fa.controller.LoadLauncher.launchJob(LoadLauncher.java:57)
                      at com.gs.fa.controller.BatchController.executeBatch(BatchController.java:86)
                      at com.gs.fa.controller.BatchController.main(BatchController.java:202)
              Caused by: com.sybase.jdbc3.jdbc.SybSQLException: Your server command (family id #0, process id #1064) encountered a deadlock situation. Please re-run your command.
              
                      at com.sybase.jdbc3.tds.Tds.processEed(Tds.java:2942)
                      at com.sybase.jdbc3.tds.Tds.nextResult(Tds.java:2246)
                      at com.sybase.jdbc3.jdbc.ResultGetter.nextResult(ResultGetter.java:69)
                      at com.sybase.jdbc3.jdbc.SybStatement.nextResult(SybStatement.java:220)
                      at com.sybase.jdbc3.jdbc.SybStatement.nextResult(SybStatement.java:203)
                      at com.sybase.jdbc3.jdbc.SybStatement.updateLoop(SybStatement.java:1804)
                      at com.sybase.jdbc3.jdbc.SybStatement.executeUpdate(SybStatement.java:1787)
                      at com.sybase.jdbc3.jdbc.SybStatement.executeUpdate(SybStatement.java:434)
                      at org.apache.commons.dbcp.DelegatingStatement.executeUpdate(DelegatingStatement.java:228)
                      at org.springframework.jdbc.support.incrementer.SybaseMaxValueIncrementer.getNextKey(SybaseMaxValueIncrementer.java:92)
                      ... 33 more



              • #8
                Here's a more detailed description of another instance of the deadlock.

                Code:
                12:34:30,170 DEBUG SimpleAsyncTaskExecutor-274 TaskletStep:393 - Saving step execution before commit: StepExecution: id=19076, name=loadPosnDly, status=STARTED, exitStatus=EXECUTING, readCount=1, filterCount=0, writeCount=1 readSkipCount=0, writeSkipCount=0, processSkipCount=0, commitCount=2, rollbackCount=0, exitDescription=
                12:34:30,891  INFO SimpleAsyncTaskExecutor-274 XmlBeanDefinitionReader:315 - Loading XML bean definitions from class path resource [org/springframework/jdbc/support/sql-error-codes.xml]
                12:34:30,892  INFO SimpleAsyncTaskExecutor-275 SimpleStepHandler:113 - Executing step: [TaskletStep: [name=distTrans]]
                12:34:30,900 DEBUG SimpleAsyncTaskExecutor-275 AbstractStep:180 - Executing: id=19091
                12:34:30,924  INFO SimpleAsyncTaskExecutor-274 SQLErrorCodesFactory:125 - SQLErrorCodes loaded: [DB2, Derby, H2, HSQL, Informix, MS-SQL, MySQL, Oracle, PostgreSQL, Sybase]
                12:34:30,927 DEBUG SimpleAsyncTaskExecutor-275 StepContextRepeatCallback:67 - Preparing chunk execution for StepContext: [email protected]
                12:34:30,929 DEBUG SimpleAsyncTaskExecutor-275 StepContextRepeatCallback:75 - Chunk execution starting: queue size=0
                12:34:30,933 DEBUG SimpleAsyncTaskExecutor-275 StepScope:148 - Creating object in scope=step, name=scopedTarget.transDist
                12:34:30,936 DEBUG SimpleAsyncTaskExecutor-275 TransDistribution:31 - Call trans distrubtion for batchId=3063
                12:34:30,946 ERROR SimpleAsyncTaskExecutor-274 AbstractStep:213 - Encountered an error executing the step
                org.springframework.transaction.UnexpectedRollbackException: Transaction rolled back because it has been marked as rollback-only
                        at org.springframework.transaction.support.AbstractPlatformTransactionManager.commit(AbstractPlatformTransactionManager.java:717)
                        at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:147)
                        at org.springframework.batch.core.step.tasklet.TaskletStep$2.doInChunkContext(TaskletStep.java:261)
                        at org.springframework.batch.core.scope.context.StepContextRepeatCallback.doInIteration(StepContextRepeatCallback.java:76)
                        at org.springframework.batch.repeat.support.RepeatTemplate.getNextResult(RepeatTemplate.java:367)
                        at org.springframework.batch.repeat.support.RepeatTemplate.executeInternal(RepeatTemplate.java:214)
                        at org.springframework.batch.repeat.support.RepeatTemplate.iterate(RepeatTemplate.java:143)
                        at org.springframework.batch.core.step.tasklet.TaskletStep.doExecute(TaskletStep.java:247)
                        at org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:196)
                        at org.springframework.batch.core.job.SimpleStepHandler.handleStep(SimpleStepHandler.java:115)
                        at org.springframework.batch.core.job.flow.JobFlowExecutor.executeStep(JobFlowExecutor.java:61)
                        at org.springframework.batch.core.job.flow.support.state.StepState.handle(StepState.java:60)
                        at org.springframework.batch.core.job.flow.support.SimpleFlow.resume(SimpleFlow.java:144)
                        at org.springframework.batch.core.job.flow.support.SimpleFlow.start(SimpleFlow.java:124)
                        at org.springframework.batch.core.job.flow.support.state.SplitState$1.call(SplitState.java:83)
                        at org.springframework.batch.core.job.flow.support.state.SplitState$1.call(SplitState.java:81)
                        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
                        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
                        at java.lang.Thread.run(Thread.java:619)
                12:34:30,981 DEBUG SimpleAsyncTaskExecutor-275 TaskletStep:381 - Applying contribution: [StepContribution: read=0, written=0, filtered=0, readSkips=0, writeSkips=0, processSkips=0, exitStatus=EXECUTING]
                Here's the Sybase DB log:

                Code:
                01:00000:01354:2010/06/18 12:34:30.84 server Deadlock Id 125 detected 
                Deadlock Id 125: detected. 1 deadlock chain(s) involved. 
                  
                Deadlock Id 125: Process (Familyid 0, Spid 1324, Suid 32) was executing a UPDATE command at line 1. 
                Deadlock Id 125: Process 1324 was involved in application ''. 
                Deadlock Id 125: Process 1324 was involved on host name 'xxxxx'. 
                Deadlock Id 125: Process 1324 was involved in transaction '$chained_transaction'. 
                SQL Text:  
                Deadlock Id 125: Process (Familyid 0, Spid 1354, Suid 32) was executing a SELECT command at line 1. 
                Deadlock Id 125: Process 1354 was involved in application ''. 
                Deadlock Id 125: Process 1354 was involved on host name 'xxxx'. 
                Deadlock Id 125: Process 1354 was involved in transaction '$chained_transaction'. 
                SQL Text: SELECT STEP_EXECUTION_ID, STEP_NAME, START_TIME, END_TIME, STATUS, COMMIT_COUNT, READ_COUNT, FILTER_COUNT, WRITE_COUNT, EXIT_CODE, EXIT_MESSAGE, READ_SKIP_COUNT, WRITE_SKIP_COUNT, PROCESS_SKIP_COUNT, ROLLBACK_COUNT, LAST_UPDATED, VERSION from BATCH_STEP_EXECUTION where JOB_EXECUTION_ID = @p0 order by STEP_EXECUTION_ID
                Deadlock Id 125: Process (Familyid 0, Spid 1354) was waiting for a 'shared row' lock on row 0 page 976758 of the 'BATCH_STEP_EXECUTION' table in database 'pmdb_load' but process (Familyid 0, Spid 1324) already held a 'exclusive row' lock on it. 
                Deadlock Id 125: Process (Familyid 0, Spid 1324) was waiting for a 'exclusive row' lock on row 1 page 976604 of the 'BATCH_STEP_EXECUTION' table in database 'pmdb_load' but process (Familyid 0, Spid 1354) already held a 'shared row' lock on it. 
                  
                Deadlock Id 125: Process (Familyid 0, Spid 1324) was chosen as the victim. End of deadlock information.



                • #9
                  What version of Spring Batch are you using?



                  • #10
                    Spring Batch - 2.1.0.RC1



                    • #11
                      It might be better to upgrade to a full release instead of a release candidate. Not that this would necessarily explain the issue, but there isn't much to be gained from using an unsupported distribution. There are two issues here (two stack traces):

                      To work around the problem in the AbstractStep with retry, you would need to use the retry configuration at the level of the chunk in the declaration of the step (e.g. in XML). Then you have to exclude the step execution update (SimpleJobRepository.update(), which calls updateStepExecution() in the DAO) from the pointcut that you use to apply retry to the JobRepository.
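
                      As a rough sketch of the pointcut exclusion, assuming the `repositoryPointcut` and `retryAdvice` names from post #5: the stack traces show the DAO's updateStepExecution() being reached through `SimpleJobRepository.update`, so the exclusion below targets `update(StepExecution)` on the repository. Spring accepts `and` and `not` in XML pointcut expressions, which avoids escaping `&&`.

                      ```xml
                      <!-- Sketch only: retry every repository method except update(StepExecution). -->
                      <aop:config>
                          <aop:pointcut id="repositoryPointcut"
                              expression="execution(* org.springframework.batch.core..*Repository+.*(..))
                                          and not execution(* org.springframework.batch.core..*Repository+.update(org.springframework.batch.core.StepExecution))"/>
                          <aop:advisor pointcut-ref="repositoryPointcut" advice-ref="retryAdvice" order="1"/>
                      </aop:config>
                      ```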

                      The launch seems to have failed with a deadlock exception (a DataAccessResourceFailureException wrapping the Sybase deadlock error), but I can't see from the logs if it was actually retried or not. Assuming it was, then it looks like you were unlucky and it failed after exhausting the retries. But you should be able to verify from the logs if it is retrying (especially with DEBUG logging for org.springframework.batch.retry). You can also tune the retry, e.g. more attempts, exponential backoff.
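
                      A sketch of how the retry could be tuned, assuming the `retryAdvice` bean from post #5 and a hypothetical `repositoryRetryTemplate` bean. The attempt count and intervals are illustrative values only, not recommendations.

                      ```xml
                      <!-- Sketch only: give the interceptor an explicitly configured RetryTemplate. -->
                      <b:bean id="retryAdvice"
                          class="org.springframework.batch.retry.interceptor.RetryOperationsInterceptor">
                          <b:property name="retryOperations" ref="repositoryRetryTemplate"/>
                      </b:bean>

                      <b:bean id="repositoryRetryTemplate"
                          class="org.springframework.batch.retry.support.RetryTemplate">
                          <!-- More attempts than the default -->
                          <b:property name="retryPolicy">
                              <b:bean class="org.springframework.batch.retry.policy.SimpleRetryPolicy">
                                  <b:property name="maxAttempts" value="5"/>
                              </b:bean>
                          </b:property>
                          <!-- Wait 200 ms, then 400 ms, 800 ms, ... capped at 5 s between attempts -->
                          <b:property name="backOffPolicy">
                              <b:bean class="org.springframework.batch.retry.backoff.ExponentialBackOffPolicy">
                                  <b:property name="initialInterval" value="200"/>
                                  <b:property name="multiplier" value="2"/>
                                  <b:property name="maxInterval" value="5000"/>
                              </b:bean>
                          </b:property>
                      </b:bean>
                      ```

                      Spacing the attempts out gives the competing transaction time to finish, which is the usual reason to prefer a backoff over immediate retries for deadlock victims.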



                      • #12
                        Here's the log with debug on for retry..

                        Is there an easy way to test the retry on the repository rather than waiting for the event to happen?

                        Code:
                        03:52:43,721 DEBUG SimpleAsyncTaskExecutor-119 StepContextRepeatCallback:67 - Preparing chunk execution for StepContext: [email protected]
                        03:52:43,721 DEBUG SimpleAsyncTaskExecutor-119 StepContextRepeatCallback:75 - Chunk execution starting: queue size=0
                        03:52:43,722 DEBUG SimpleAsyncTaskExecutor-119 HeaderMapper:61 - parsing busDate in the header :20100628
                        03:52:43,722 DEBUG SimpleAsyncTaskExecutor-119 StepScope:148 - Creating object in scope=step, name=scopedTarget.metaDataProcessor
                        03:52:43,722 DEBUG SimpleAsyncTaskExecutor-119 StepScope:148 - Creating object in scope=step, name=scopedTarget.prodTransListStgProcessor
                        03:52:43,723 DEBUG SimpleAsyncTaskExecutor-119 ChunkOrientedTasklet:87 - Inputs not busy, ended: true
                        03:52:43,723 DEBUG SimpleAsyncTaskExecutor-118 SimpleFlow:156 - Completed state=faJob.prodSplit.0.pollOtcStg with status=COMPLETED
                        03:52:43,723 DEBUG SimpleAsyncTaskExecutor-119 TaskletStep:381 - Applying contribution: [StepContribution: read=6, written=0, filtered=6, readSkips=0, writeSkips=0, processSkips=0, exitStatus=EXECUTING]
                        03:52:43,723 DEBUG SimpleAsyncTaskExecutor-118 SimpleFlow:143 - Handling state=faJob.prodSplit.0.distOtcProd
                        03:52:43,723 DEBUG SimpleAsyncTaskExecutor-119 RetryTemplate:205 - RetryContext retrieved: [RetryContext: count=0, lastException=null, exhausted=false]
                        03:52:43,723 DEBUG SimpleAsyncTaskExecutor-118 RetryTemplate:205 - RetryContext retrieved: [RetryContext: count=0, lastException=null, exhausted=false]
                        03:52:43,723 DEBUG SimpleAsyncTaskExecutor-119 RetryTemplate:234 - Retry: count=0
                        03:52:43,723 DEBUG SimpleAsyncTaskExecutor-118 RetryTemplate:234 - Retry: count=0
                        03:52:43,724 DEBUG SimpleAsyncTaskExecutor-119 TaskletStep:393 - Saving step execution before commit: StepExecution: id=1000000012921, name=loadTransListProd, status=STARTED, exitStatus=EXECUTING, readCount=6, filterCount=6, writeCount=0 readSkipCount=0, writeSkipCount=0, processSkipCount=0, commitCount=1, rollbackCount=0, exitDescription=
                        03:52:43,725 DEBUG SimpleAsyncTaskExecutor-119 RetryTemplate:205 - RetryContext retrieved: [RetryContext: count=0, lastException=null, exhausted=false]
                        03:52:43,725 DEBUG SimpleAsyncTaskExecutor-119 RetryTemplate:234 - Retry: count=0
                        03:52:44,334 DEBUG SimpleAsyncTaskExecutor-118 RetryTemplate:205 - RetryContext retrieved: [RetryContext: count=0, lastException=null, exhausted=false]
                        03:52:44,336 DEBUG SimpleAsyncTaskExecutor-118 RetryTemplate:234 - Retry: count=0
                        03:52:44,342 DEBUG SimpleAsyncTaskExecutor-118 RetryTemplate:205 - RetryContext retrieved: [RetryContext: count=0, lastException=null, exhausted=false]
                        03:52:44,345 DEBUG SimpleAsyncTaskExecutor-118 RetryTemplate:234 - Retry: count=0
                        03:52:44,350  INFO SimpleAsyncTaskExecutor-119 XmlBeanDefinitionReader:315 - Loading XML bean definitions from class path resource [org/springframework/jdbc/support/sql-error-codes.xml]
                        03:52:44,352  INFO SimpleAsyncTaskExecutor-118 SimpleStepHandler:113 - Executing step: [TaskletStep: [name=distOtcProd]]
                        03:52:44,354 DEBUG SimpleAsyncTaskExecutor-118 AbstractStep:180 - Executing: id=1000000012925
                        03:52:44,357 DEBUG SimpleAsyncTaskExecutor-118 RetryTemplate:205 - RetryContext retrieved: [RetryContext: count=0, lastException=null, exhausted=false]
                        03:52:44,358 DEBUG SimpleAsyncTaskExecutor-118 RetryTemplate:234 - Retry: count=0
                        03:52:44,364 DEBUG SimpleAsyncTaskExecutor-118 RetryTemplate:205 - RetryContext retrieved: [RetryContext: count=0, lastException=null, exhausted=false]
                        03:52:44,365 DEBUG SimpleAsyncTaskExecutor-118 RetryTemplate:234 - Retry: count=0
                        03:52:44,377  INFO SimpleAsyncTaskExecutor-119 SQLErrorCodesFactory:125 - SQLErrorCodes loaded: [DB2, Derby, H2, HSQL, Informix, MS-SQL, MySQL, Oracle, PostgreSQL, Sybase]
                        03:52:44,377 DEBUG SimpleAsyncTaskExecutor-118 StepContextRepeatCallback:67 - Preparing chunk execution for StepContext: [email protected]
                        03:52:44,382 DEBUG SimpleAsyncTaskExecutor-118 StepContextRepeatCallback:75 - Chunk execution starting: queue size=0
                        03:52:44,380 DEBUG SimpleAsyncTaskExecutor-119 RetryTemplate:261 - Checking for rethrow: count=1
                        03:52:44,385 DEBUG SimpleAsyncTaskExecutor-118 StepScope:148 - Creating object in scope=step, name=scopedTarget.prodDistOtc
                        03:52:44,389 DEBUG SimpleAsyncTaskExecutor-119 RetryTemplate:234 - Retry: count=1
                        03:52:44,393 DEBUG SimpleAsyncTaskExecutor-118 ProdDistribution:43 - clearing otc prod step cache for batch =1733
                        03:52:44,393 DEBUG SimpleAsyncTaskExecutor-118 ProdDistribution:51 - Call product distrubutuin of batchId=1733,source=FA,prodType=O
                        03:52:44,418 ERROR SimpleAsyncTaskExecutor-119 AbstractStep:213 - Encountered an error executing the step
                        org.springframework.transaction.UnexpectedRollbackException: Transaction rolled back because it has been marked as rollback-only
                                at org.springframework.transaction.support.AbstractPlatformTransactionManager.commit(AbstractPlatformTransactionManager.java:717)
                                at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:147)
                                at org.springframework.batch.core.step.tasklet.TaskletStep$2.doInChunkContext(TaskletStep.java:261)
                                at org.springframework.batch.core.scope.context.StepContextRepeatCallback.doInIteration(StepContextRepeatCallback.java:76)
                                at org.springframework.batch.repeat.support.RepeatTemplate.getNextResult(RepeatTemplate.java:367)
                                at org.springframework.batch.repeat.support.RepeatTemplate.executeInternal(RepeatTemplate.java:214)
                                at org.springframework.batch.repeat.support.RepeatTemplate.iterate(RepeatTemplate.java:143)
                                at org.springframework.batch.core.step.tasklet.TaskletStep.doExecute(TaskletStep.java:247)
                                at org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:196)
                                at org.springframework.batch.core.job.SimpleStepHandler.handleStep(SimpleStepHandler.java:115)
                                at org.springframework.batch.core.job.flow.JobFlowExecutor.executeStep(JobFlowExecutor.java:61)
                                at org.springframework.batch.core.job.flow.support.state.StepState.handle(StepState.java:60)
                                at org.springframework.batch.core.job.flow.support.SimpleFlow.resume(SimpleFlow.java:144)
                                at org.springframework.batch.core.job.flow.support.SimpleFlow.start(SimpleFlow.java:124)
                                at org.springframework.batch.core.job.flow.support.state.SplitState$1.call(SplitState.java:83)
                                at org.springframework.batch.core.job.flow.support.state.SplitState$1.call(SplitState.java:81)
                                at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
                                at java.util.concurrent.FutureTask.run(FutureTask.java:138)
                                at java.lang.Thread.run(Thread.java:619)



                        • #13
                          Did you move the retry out of the updateStepExecution? Your thread 119 seems to have encountered a problem but it continues immediately to retry something (probably the repository update) and then fails. This wouldn't be the case if the retry was controlled by the step. So make sure you remove your retry config from that method and move it up to the step/chunk. Can you post your new config?

                          It is quite difficult to simulate a repository failure. You could perhaps add an exception-throwing AOP advice, but it wouldn't really be the same thing.
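
                          For the step/chunk-level retry mentioned above, the declaration might look roughly like this, assuming the batch namespace is the default (as in the job definition in the first post) and the schema of the 2.1 final release. The step id, reader, writer, commit interval, and retry limit are hypothetical. The thread does not confirm whether chunk-level retry also covers a failure in the repository update itself.

                          ```xml
                          <!-- Sketch only: retry deadlock losers at the chunk level of a step. -->
                          <step id="exampleChunkStep">
                              <tasklet>
                                  <chunk reader="exampleReader" writer="exampleWriter"
                                         commit-interval="100" retry-limit="3">
                                      <retryable-exception-classes>
                                          <include class="org.springframework.dao.DeadlockLoserDataAccessException"/>
                                      </retryable-exception-classes>
                                  </chunk>
                              </tasklet>
                          </step>
                          ```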



                          • #14
                            Sorry, I'm not sure I completely follow this.

                            Since the problem is around the control (repository) tables, I was thinking the retry should ideally be around the repository rather than at the step/chunk level.

                            If it's moved to the step, the entire step would be retried, right?
                            In that case there may be chunks that are already committed, and retrying the entire step would try to insert those chunks again, which would cause integrity constraint violations in our stage tables.

                            I think in this case it should be done at the chunk level, and any error while performing a control operation should roll back the entire chunk. But we also have some steps that are tasklets calling stored procedures; for these we may have to configure the retry at the step level.

                            I'm not sure how to configure chunk and step retry with pointcuts for the specific scenarios mentioned above.

                            Please advise.



                            • #15
                              Retrying at the step level may be the workaround.

                              The failure would retry the entire step, but it should start from where it left off after the last successful chunk commit.

                              I removed the retry config around the repository.

                              I am adding it to the methods of Step instead:


                              Code:
                              <aop:config>
                                  	<aop:pointcut id="stepPointCut" expression="execution(* org.springframework.batch.core.Step.*(..))"/>
                                  	<aop:advisor pointcut-ref="stepPointCut" advice-ref="retryAdvice" order="-1"/>
                              </aop:config>
                              
                              <b:bean id="retryAdvice" class="org.springframework.batch.retry.interceptor.RetryOperationsInterceptor"/>

