Adding Failover to Job

  • Adding Failover to Job

    Hi,
    We need to add fault tolerance to the batch jobs we are running. For example, say a job fails because of a database connection failure. We want to retry that job after some time (say, after a minute). How do we achieve this using Spring Batch?


    I read somewhere in the forum that this is achievable using RetryOperationsInterceptor. Can you provide a complete example? I am not sure whether we should apply this interceptor at the JobLauncher level or somewhere else. Can you give some pointers on this?

    thanks in advance,
    Ram

  • #2
    Sorry this hasn't been addressed in the reference documentation just yet. However, we do have a sample job that shows exactly how to achieve this:

    Code:
    	<bean id="retrySample" parent="simpleJob">
    		<property name="steps">
    			<!-- The factory bean builds a step that retries failed items -->
    			<bean id="step1" parent="simpleStep"
    				class="org.springframework.batch.execution.step.support.StatefulRetryStepFactoryBean">
    				<property name="itemReader" ref="itemGenerator" />
    				<property name="itemWriter" ref="itemWriter" />
    				<!-- Upper bound on retries for an item -->
    				<property name="retryLimit" value="3" />
    				<!-- Exception types that trigger a retry -->
    				<property name="retryableExceptionClasses" value="java.lang.Exception" />
    			</bean>
    		</property>
    	</bean>
    Unfortunately, this FactoryBean approach was added after M5 was released, but you can use it in the latest trunk build, and it will go out as part of RC1. If you are using an older version of the framework, the retrySample job is still there and works with that version; you should be able to use it as a guide for configuring retry in your job.

    -Lucas
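
    For readers who want to see what "retrying individual items" amounts to, here is a minimal plain-Java sketch. It is not how StatefulRetryStepFactoryBean is implemented and it uses no Spring Batch classes. The names ItemRetrySketch and writeAll are hypothetical, and treating retryLimit as the total number of attempts per item is an assumption made only for this illustration.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration only: not part of Spring Batch.
class ItemRetrySketch {

    /**
     * Passes each item to the writer. If the writer throws an exception of the
     * retryable type, the same item is attempted again, up to retryLimit
     * attempts in total (an assumption of this sketch). Any other exception,
     * or an exhausted limit, is rethrown and stops processing.
     *
     * @return the number of retries that were performed
     */
    static <T> int writeAll(List<T> items, Consumer<T> writer, int retryLimit,
            Class<? extends RuntimeException> retryable) {
        int retries = 0;
        for (T item : items) {
            int attempts = 0;
            while (true) {
                try {
                    writer.accept(item);
                    break; // item succeeded, move on to the next one
                } catch (RuntimeException e) {
                    attempts++;
                    if (!retryable.isInstance(e) || attempts >= retryLimit) {
                        throw e; // not retryable, or limit exhausted
                    }
                    retries++;
                }
            }
        }
        return retries;
    }
}
```

    Only the failing item is attempted again; items that were already written are not touched, which is the difference from re-running the whole job.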



    • #3
      N.B. The example Lucas gave retries individual items within a job execution. To retry the whole job, you need to handle it at the launcher or scheduler level. Look at the Javadocs and the Infrastructure website for more information on how to use the RetryTemplate. But RetryTemplate and RetryInterceptor are only two ways of doing that: you could also use an external scheduling tool that retries failed processes. In fact, that is probably more appropriate if you are going to wait a significant amount of time before a retry; you might as well stop the whole process and restart it later (RetryTemplate obviously has to work within the same process, and will block the thread while it waits).
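
      The in-process, blocking behaviour described above can be sketched in plain Java. This is a stand-in for illustration, not the RetryTemplate API: JobRetrySketch and runWithRetry are hypothetical names, and the task passed in is assumed to be a synchronous call that launches the job and throws if the job fails.

```java
import java.util.concurrent.Callable;

// Hypothetical illustration only: a plain-Java stand-in, not RetryTemplate.
class JobRetrySketch {

    /**
     * Calls the task up to maxAttempts times, sleeping backoffMillis between
     * failed attempts. The calling thread is blocked during each sleep, which
     * is why a long wait is better handled by an external scheduler.
     */
    static <T> T runWithRetry(Callable<T> task, int maxAttempts, long backoffMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.call(); // success: return the job outcome
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(backoffMillis); // blocks this thread until the next attempt
                }
            }
        }
        throw last; // every attempt failed
    }
}
```

      With a one-minute backoff, the launching thread would sit idle for a minute per failed attempt, so this shape only suits short waits.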



      • #4
        Originally posted by Dave Syer
        N.B. The example Lucas gave retries individual items within a job execution. To retry the whole job, you need to handle it at the launcher or scheduler level. Look at the Javadocs and the Infrastructure website for more information on how to use the RetryTemplate. But RetryTemplate and RetryInterceptor are only two ways of doing that: you could also use an external scheduling tool that retries failed processes. In fact, that is probably more appropriate if you are going to wait a significant amount of time before a retry; you might as well stop the whole process and restart it later (RetryTemplate obviously has to work within the same process, and will block the thread while it waits).

        How do I retry jobs at the launcher or scheduler level, i.e. using RetryTemplate?
        In my case, I am extending QuartzJobBean, which uses SimpleJobLauncher to launch the jobs. This job launcher uses an asynchronous task executor. To retry the job at the scheduler level, the scheduler has to know the job's status. But the scheduler will not know the job's status, because of the asynchronous nature of the launcher.

        I have used RetryTemplate successfully to retry at the item level. I don't think we can use RetryTemplate at the launcher level if the launcher uses an asynchronous task executor.

        I tried retrying the job in the after() method of the job listener, using the injected job launcher, but it throws a JobExecutionAlreadyRunningException. So I can't do it at the listener level.

        Any input? Please correct me if I am wrong.

        regards,
        kris



        • #5
          Why do you need an asynchronous JobLauncher? If you are scheduling with Quartz, doesn't that take care of any threading that might be needed? I am a bit hazy on the details and I don't have my PC with me to check them, but you could look at the Quartz sample job and see how that works (without retry).
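
          To illustrate the idea that the scheduler's own thread can run the job synchronously and therefore see its outcome, here is a rough sketch. It uses the JDK's ScheduledExecutorService purely as a stand-in for Quartz and involves no Quartz or Spring Batch classes. RescheduleOnFailureSketch, launch, and attempt are hypothetical names, and the job is assumed to be a blocking call that throws on failure.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only: ScheduledExecutorService stands in for Quartz.
class RescheduleOnFailureSketch {

    /** Starts the job on the scheduler and returns a future holding the final outcome. */
    static CompletableFuture<String> launch(ScheduledExecutorService scheduler,
            Callable<String> job, int attemptsLeft, long delayMillis) {
        CompletableFuture<String> outcome = new CompletableFuture<>();
        attempt(scheduler, job, attemptsLeft, delayMillis, outcome);
        return outcome;
    }

    private static void attempt(ScheduledExecutorService scheduler, Callable<String> job,
            int attemptsLeft, long delayMillis, CompletableFuture<String> outcome) {
        scheduler.execute(() -> {
            try {
                // The job runs synchronously on the scheduler's worker thread,
                // so success or failure is known right here.
                outcome.complete(job.call());
            } catch (Exception e) {
                if (attemptsLeft > 1) {
                    // Schedule another attempt later instead of sleeping in this thread.
                    scheduler.schedule(
                            () -> attempt(scheduler, job, attemptsLeft - 1, delayMillis, outcome),
                            delayMillis, TimeUnit.MILLISECONDS);
                } else {
                    outcome.completeExceptionally(e); // no attempts left
                }
            }
        });
    }
}
```

          Because the retry is a new scheduled run and not a sleep, no thread is held while waiting, and the previous attempt has already finished before the next one starts.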
