
  • Job restartability v. job rerunnability

    If I configure a job to be restartable, I can restart the job and it will pick up where it left off. Fair enough.

    However, how can I rerun a JobInstance that has been completed?

    For example, say there's a bug in my ItemReader such that I miss some items, and my ItemWriter can correctly handle whether an item's work has already been done (inside, or outside, for that matter, a JobInstance). Now, I fix my ItemReader and rerun the Job with the same JobParameters: Batch balks, saying that there's already a JobInstance and exits with an exception.

    What is my workaround to make JobInstances rerunnable?

    One way is to require the user to manually add an "id" JobParameter and provide some unique key, but that's error-prone.

    Another way is to programmatically add a JobParameter called "id" that is a UUID so that, effectively, every run of every job results in a new JobInstance. Which listener would I register with to add such a JobParameter before the JobInstance is created?

    Yet another is to request an enhancement calling for a boolean rerunnable attribute on the <job> element, which would have a similar effect.

    Are there any other ways that I'm missing? What should I do here that's less prone to error?

    Thanks,
    Matthew
    Last edited by matthewadams; May 19th, 2011, 09:28 AM.

  • #2
    Use JobParametersIncrementer

    I know there is a JobParametersIncrementer that can make JobParameters unique, increasing a 'run.count' value whenever the batch job runs.
    So, if you want to relaunch jobs regardless of their status, FAILED or COMPLETED, you can configure a JobParametersIncrementer such as RunCountJobParameterIncrementer in the jobs' configurations.

    Quite simple!

    Code:
    <job id="BatchJob" incrementer="runCountIncrementer" ...>
    	...
    </job>
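    A JobParametersIncrementer is just a one-method interface, so if the run-count incrementer named above is not available in your distribution, a hand-rolled version could look roughly like this (an illustrative sketch only; the incrementer Spring Batch actually ships is RunIdIncrementer, shown later in this thread):

    Code:
    import org.springframework.batch.core.JobParameters;
    import org.springframework.batch.core.JobParametersBuilder;
    import org.springframework.batch.core.JobParametersIncrementer;

    // Illustrative incrementer: bumps a 'run.count' parameter on every launch so
    // each run gets a new JobInstance even when the business parameters repeat.
    public class RunCountIncrementer implements JobParametersIncrementer {

        public JobParameters getNext(JobParameters parameters) {
            JobParameters params = (parameters == null) ? new JobParameters() : parameters;
            long next = params.getLong("run.count", 0L) + 1;
            return new JobParametersBuilder(params).addLong("run.count", next).toJobParameters();
        }
    }
    Declare it as a bean and point the job's incrementer attribute at it, as in the snippet above.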



    • #3
      Not sure what you mean, chori. Can you elaborate please?



      • #4
        Hi, Matthew

        Spring Batch provides a way to relaunch a batch job with the same business JobParameters: it creates a new JobInstance that has the same business JobParameters plus a unique JobParameter such as 'run.id', produced by a JobParametersIncrementer like RunIdIncrementer.

        First, the RunIdIncrementer that Spring Batch provides out of the box should be declared in your batch job configuration as follows.

        Code:
        <bean id="runIdIncrementer" class="org.springframework.batch.core.launch.support.RunIdIncrementer"/>
        You can change the parameter key from 'run.id' to another name by setting <property name="key" value="run.count"/> on that bean.
        Then reference the incrementer from the job:

        Code:
        <job id="BatchJob" incrementer="runIdIncrementer">
        	...
        </job>
        After configuring it, if you use CommandLineJobRunner to run batch jobs, you can launch the job with the '-next' option together with the business JobParameters (here TEST1, TEST2 and TEST3 are parameters that belong to this particular batch job).

        Code:
        CommandLineJobRunner -next /sample/batch/BatchJob.xml BatchJob TEST1=1 TEST2=2 TEST3=3
        The first time the job is run, the JobParameters are created in the 'BATCH_JOB_PARAMS' Spring Batch table.

        Code:
        TEST1=1;TEST2=2;TEST3=3;run.id=1
        The job then finishes with a BatchStatus of either 'FAILED' or 'COMPLETED'.
        Each subsequent time the job is executed with the same business JobParameters, you will see rows like the following in that table.

        Code:
        TEST1=1;TEST2=2;TEST3=3;run.id=2
        TEST1=1;TEST2=2;TEST3=3;run.id=3
        TEST1=1;TEST2=2;TEST3=3;run.id=4
        ...
        In short, by using RunIdIncrementer you can re-run your batch jobs with the same business JobParameters.

        On the other hand, if you run batch jobs from a web application (for example a batch controller service that receives HTTP requests from a batch client), you can call the startNextInstance(jobName) method of SimpleJobOperator to re-run them.
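
        As a rough illustration of that approach (the operator wiring is assumed here, not shown in this thread), the call from such a controller or service would look something like this:

        Code:
        import org.springframework.batch.core.launch.JobOperator;

        // Sketch only: 'jobOperator' is an injected JobOperator (e.g. a SimpleJobOperator bean).
        // startNextInstance() applies the job's configured incrementer and starts a new JobInstance.
        public class BatchController {

            private JobOperator jobOperator;

            public void setJobOperator(JobOperator jobOperator) {
                this.jobOperator = jobOperator;
            }

            public Long rerun(String jobName) throws Exception {
                return jobOperator.startNextInstance(jobName);
            }
        }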

        If you want to understand the mechanism of the JobParametersIncrementer in more detail, refer to the CommandLineJobRunner and SimpleJobOperator sources.
        Last edited by chori; May 23rd, 2011, 12:33 AM.



        • #5
          I don't know if the JobParametersIncrementer solution is valid for me. Like matthewadams, I need rerunnability in my jobs. I have a daily job with, to simplify, two parameters: a partner id and a date. Sometimes some registers can't be processed by the job that day, but I want to complete the job with most of them and re-run it in the future with the same parameters, because those parameters are the key to finding the remaining registers. With the JobParametersIncrementer I can re-run the job, but how can I find my job afterwards?

          I mean, without the JobParametersIncrementer I can always find my JobInstances by job name and JobParameters and then monitor whether the job for this date and partner company is completed or failed. But if I add a new parameter (run.id, for instance), I can't find my job, because I don't know the last run.id used by the JobParametersIncrementer.

          If I'm wrong I'd really appreciate some help. I don't know if rerunnability is a common requirement; maybe some kind of "rerunnable" flag could be useful.

          Thanks.



          • #6
            Dear occus3,

            As far as I am concerned, another way is to use restartability in your job, without a JobParametersIncrementer. I mean, if possible, you can intentionally change your job's status to 'FAILED' when some registers are left over, even though the job otherwise completed. Spring Batch will then restart the failed job: whenever a batch job is executed, the JobRepository checks whether a previous failed JobExecution exists while creating the new JobExecution, and if one exists, it reuses the ExecutionContext of that failed execution instead of creating a new, empty one.
            Intentionally changing your job's status to 'FAILED' after checking whether any registers remain can be implemented by calling JobExecution.setStatus(BatchStatus.FAILED) in the afterJob() method of a JobExecutionListener. Once it is written, you can easily configure it on your job. But you should set the 'restartable' property of your job to 'true', and you need to consider setting the 'allow-start-if-complete' property of your steps to 'true' as well.
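
            A minimal sketch of such a listener (the 'leftover.count' value is hypothetical and stands in for whatever bookkeeping your steps already do):

            Code:
            import org.springframework.batch.core.BatchStatus;
            import org.springframework.batch.core.JobExecution;
            import org.springframework.batch.core.listener.JobExecutionListenerSupport;

            // Sketch: force the job into FAILED state when work is left over, so the next
            // launch with the same JobParameters restarts it instead of being rejected.
            public class LeftoverRegisterListener extends JobExecutionListenerSupport {

                @Override
                public void afterJob(JobExecution jobExecution) {
                    // 'leftover.count' is a hypothetical value the steps would have to
                    // store in the job ExecutionContext while processing.
                    long leftovers = jobExecution.getExecutionContext().getLong("leftover.count", 0L);
                    if (leftovers > 0) {
                        jobExecution.setStatus(BatchStatus.FAILED);
                    }
                }
            }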

            If you do this, you can monitor the failed job executions until all registers are processed.
            Additionally, you can determine the last status of a job execution with a query that joins the BATCH_JOB_INSTANCE and BATCH_JOB_EXECUTION tables and uses max(job_execution_id) or count(job_execution_id), and so on.
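
            The same check can also be done programmatically rather than in SQL (a sketch, assuming a JobExplorer bean is wired up):

            Code:
            import java.util.List;

            import org.springframework.batch.core.JobExecution;
            import org.springframework.batch.core.JobInstance;
            import org.springframework.batch.core.explore.JobExplorer;

            // Sketch: walk recent instances of a job and check their executions,
            // instead of guessing the run.id value added by an incrementer.
            public class JobStatusChecker {

                private JobExplorer jobExplorer;

                public void setJobExplorer(JobExplorer jobExplorer) {
                    this.jobExplorer = jobExplorer;
                }

                public void printRecentRuns(String jobName) {
                    List<JobInstance> instances = jobExplorer.getJobInstances(jobName, 0, 20);
                    for (JobInstance instance : instances) {
                        for (JobExecution execution : jobExplorer.getJobExecutions(instance)) {
                            // match on the business parameters (partner id, date) here
                            // and read execution.getStatus() for COMPLETED vs. FAILED
                            System.out.println(instance.getId() + " -> " + execution.getStatus());
                        }
                    }
                }
            }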

            I am not sure this way is suitable for you.

            Chori.



            • #7
              Thanks for the idea, chori. I don't really like leaving my jobs FAILED because it could be confusing. In the end, since my job doesn't have to be restartable but rerunnable, I create a new JobInstance every time with a parameter "currentDate" that takes the system date on every execution to differentiate the JobParameters.
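
              For anyone reading later, that boils down to something like this when the parameters are built at launch time (a sketch; the bean wiring and the parameter names other than "currentDate" are illustrative):

              Code:
              import java.util.Date;

              import org.springframework.batch.core.Job;
              import org.springframework.batch.core.JobParameters;
              import org.springframework.batch.core.JobParametersBuilder;
              import org.springframework.batch.core.launch.JobLauncher;

              // Sketch: adding the launch timestamp gives every run its own JobInstance,
              // while the business parameters stay the same and remain queryable.
              public class DailyJobLauncher {

                  private JobLauncher jobLauncher;
                  private Job job;

                  public void setJobLauncher(JobLauncher jobLauncher) { this.jobLauncher = jobLauncher; }
                  public void setJob(Job job) { this.job = job; }

                  public void launch(String partnerId, String businessDate) throws Exception {
                      JobParameters params = new JobParametersBuilder()
                              .addString("partnerId", partnerId)
                              .addString("businessDate", businessDate)
                              .addDate("currentDate", new Date())
                              .toJobParameters();
                      jobLauncher.run(job, params);
                  }
              }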

              Thanks.



              • #8
                I don't use the parameters incrementer, but rather just use a new Date to signify the start of the same job at a different time. The way I structured my jobs, with custom item reader/processor/writer, they can all have transient failures that then get logged. In some cases, the records in error will be retried a certain number of times, and then marked to skip if they've failed more than 5 times. This keeps my jobs running periodically, and a small backlog of 'bad data' will build up that then has to be handled separately (perhaps by another job!).

                I found this style of recoverable architecture much better than trying to account for all errors that occur during a single batch run, and then having to re-process the whole thing to recover.
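
                The "handled separately" part can be hooked in with a SkipListener, roughly like this (a sketch; where the bad records end up, a table, a file, or a queue, is up to you, and the item types are just Object here for brevity):

                Code:
                import org.springframework.batch.core.listener.SkipListenerSupport;

                // Sketch: record items that exceeded the retry/skip limits so a later
                // job (or a human) can deal with them separately.
                public class BadDataSkipListener extends SkipListenerSupport<Object, Object> {

                    @Override
                    public void onSkipInProcess(Object item, Throwable t) {
                        // persist or log the offending item somewhere durable
                        System.err.println("Skipped during processing: " + item + " (" + t.getMessage() + ")");
                    }

                    @Override
                    public void onSkipInWrite(Object item, Throwable t) {
                        System.err.println("Skipped during write: " + item + " (" + t.getMessage() + ")");
                    }
                }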
