Passing arguments to Tasklet/ItemProvider/ItemProcessor?

  • Passing arguments to Tasklet/ItemProvider/ItemProcessor?

    I've spent the last few days reading up on the batch framework, trying to figure out how to refactor our code to fit into it.

    I would like to share some arguments within the batch job, such as the input filename or the criteria used to filter data.
    From what I can find, the JobExecutionContext or StepExecutionContext seems to be the place to handle this, but how can I get a handle to the JobExecutionContext in an ItemProvider or an ItemProcessor?

    The samples all have configured filenames which are injected into the ItemProvider. It feels like I've missed something about how to create an ItemProvider/Processor.

    /Johan

  • #2
    A processor or provider (or any bean) has access to the StepExecution if it is step scoped and implements StepContextAware, or through the StepSynchronizationManager. N.B. the APIs for this changed quite a lot recently, so the *Context objects you mention no longer exist in recent snapshots. If you want to work with an API that will survive your next upgrade, I recommend you get a snapshot of m3.

    The BatchResourceFactory also underwent some heavy refactoring to fit into the new world (it was pretty lame in m2 since it was not StepContextAware). This might be the best way for you to parameterise the file input.

    Generally the domain design calls for JobIdentifier to *completely* identify a JobInstance, including all runtime parameters. The StepContext is provided mainly for identification and reporting purposes. Setting a JobIdentifier strategy is a matter of using the JobExecutor* API directly, or the JobIdentifierFactory. There are only two implementations of JobIdentifier currently, so more might be required.

    We would be very interested in feedback about this - how successful is the API in your case?



    • #3
      We have two cases where we could use Spring Batch, if I describe them maybe you could point me in the right direction?

      1. Creating reports: Here I would like to give the batch some parameters for what to include in the report, either a list of IDs or a date interval. The items are then passed on to a Jaxb marshaller which serialises the objects to an XML file.
      The serialising could be somewhat tricky using Spring Batch. Right now it's done in five steps: first I serialise the main document object (headers and so on) halfway, until I encounter the start tag for the first type of data found in the query. All items of the three types of data are then serialised into the document, and finally the end tags of the main document object are serialised.
      As it is done right now I need to share the same instance of the marshaller between the five ItemProviders/Processors.
      In some way I would like to get notified of when the batch is finished to find out the id of the generated report. (Perhaps sending a jms-message in the last step.)

      2. Consuming reports: The only parameter to the batch is a report-id to find the xml-file to consume. This is more or less the opposite of the above description, a document is unmarshalled using Jaxb and Stax-events are passed through three different event consumers.
      This would only use one ItemProvider/Processor.
      Here I would like to use a batch-commit and the batch could benefit from the restartable patterns.

      Most of the jobs are written (not finished) in an online fashion right now, and when I addressed the batch-commit issue I found Spring Batch. It might not be the right tool for me, but it feels close to the target...

      /Johan



      • #4
        For the first part of that question, I think it could be handled fairly easily by the StaxEventWriterOutputSource. Essentially, the output of your query would be mapped to a domain object, which the OutputSource would then serialize using jaxb. For more information on how it works, please look at the documentation for the OutputSource, the integration tests, and the sample jobs. Additionally, since Spring OXM is used, its documentation is available as well.

        In terms of limiting the input, using a PropertyPlaceholderConfigurer can help with parameter passing. For example:

        Code:
        <property name="drivingQuery" value="SELECT * from TABLE where DATE > ${date}" />
        This will work well if the job is being kicked off from the command line, and only requires a placeholder configurer to be declared in the application context.
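
        To make the substitution concrete, here is a minimal plain-Java sketch of the idea, not Spring's implementation. PlaceholderSketch and resolve are hypothetical names used only for illustration; the sketch assumes the value arrives as a system property (for example -Ddate=... on the command line), which, as far as I know, the placeholder configurer can also fall back to.

```java
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; it is not part of Spring.
final class PlaceholderSketch {

    // Matches ${key} and captures the key.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    // Replaces each ${key} with its value from props; fails fast on a missing key.
    static String resolve(String template, Properties props) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String key = m.group(1);
            String value = props.getProperty(key);
            if (value == null) {
                throw new IllegalArgumentException("No value for placeholder: " + key);
            }
            // quoteReplacement keeps '$' and '\' in the value from being interpreted.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Run with e.g. -Ddate="'2007-11-01'"; without it, resolve throws.
        String query = "SELECT * from TABLE where DATE > ${date}";
        System.out.println(resolve(query, System.getProperties()));
    }
}
```

        Note that the value is substituted verbatim, so any SQL quoting has to be part of either the value or the query template.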

        Another approach would be to create a custom JobIdentifierFactory that returns your own custom extension of the JobIdentifier interface. This could then be obtained by making your processor step scoped and having it implement the StepContextAware interface. Please keep in mind that StepScope will also need to be in the application context (at least until our custom namespace handler is finished).
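
        A rough sketch of the identifier half of that approach might look like the following. Everything here is a stand-in: the JobIdentifier interface below is a simplified local declaration (the framework's real interface may look different), ReportJobIdentifier and ReportItemProcessor are hypothetical names, and the plain setter replaces whatever callback StepContextAware actually provides. The factory is left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

// Stand-in for the framework's JobIdentifier; the real interface may differ.
interface JobIdentifier {
    String getName();
}

// Hypothetical identifier that carries the report criteria as part of its identity.
final class ReportJobIdentifier implements JobIdentifier {
    private final String name;
    private final List<Long> reportIds;

    ReportJobIdentifier(String name, List<Long> reportIds) {
        this.name = name;
        // Defensive copy so the identity cannot change after construction.
        this.reportIds = Collections.unmodifiableList(new ArrayList<Long>(reportIds));
    }

    public String getName() { return name; }

    List<Long> getReportIds() { return reportIds; }

    // Same name and same criteria mean the same job instance.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ReportJobIdentifier)) return false;
        ReportJobIdentifier other = (ReportJobIdentifier) o;
        return name.equals(other.name) && reportIds.equals(other.reportIds);
    }

    @Override
    public int hashCode() { return Objects.hash(name, reportIds); }
}

// Hypothetical processor; in the framework the identifier would arrive through
// the step context callback rather than through this plain setter.
final class ReportItemProcessor {
    private ReportJobIdentifier identifier;

    void setJobIdentifier(JobIdentifier id) {
        if (!(id instanceof ReportJobIdentifier)) {
            throw new IllegalArgumentException("Expected a ReportJobIdentifier");
        }
        this.identifier = (ReportJobIdentifier) id;
    }

    // Accepts only the items named in the job's criteria.
    boolean accepts(long itemId) {
        return identifier.getReportIds().contains(itemId);
    }
}
```

        Putting the criteria into equals and hashCode is what lets the identifier distinguish two runs of the same job with different parameters.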

        The issue of notifying the processor when no more input is available is currently being debated:

        http://opensource.atlassian.com/proj...owse/BATCH-151

        This is part of a larger discussion about the relationship between the provider, processor, and output source. Until this is resolved in the next milestone, a workaround would be to extend whichever tasklet you are using to call a 'final' method after the provider returns null. For example:

        Code:
        class CustomTasklet extends RestartableItemProviderTasklet {

            public ExitStatus execute() {
                ExitStatus result = super.execute();
                // Only a completed run should trigger the finalize call.
                if (result == ExitStatus.COMPLETED) {
                    if (itemProcessor instanceof FinalizableItemProcessor) {
                        // cast and call finalize method
                    }
                }
                return result;
            }
        }
        It's certainly not perfect, but hopefully you get the idea.
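
        The FinalizableItemProcessor in that snippet is not defined anywhere, so here is one way it could look, as a self-contained sketch with stand-in types. ItemProcessor is a simplified local declaration rather than the framework's interface, and complete(), ClosingTagProcessor and FinalizingLoop are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Stand-in for the framework's processor interface; the real one may differ.
interface ItemProcessor {
    void process(String item);
}

// Hypothetical callback interface; the method name is an assumption.
interface FinalizableItemProcessor {
    void complete();
}

// Example processor that writes items and appends a closing tag at the end.
final class ClosingTagProcessor implements ItemProcessor, FinalizableItemProcessor {
    final List<String> output = new ArrayList<String>();

    public void process(String item) {
        output.add("<item>" + item + "</item>");
    }

    public void complete() {
        output.add("</report>");
    }
}

// Stand-in for the tasklet loop: consume all input, then notify the processor once.
final class FinalizingLoop {
    static void run(Iterator<String> items, ItemProcessor processor) {
        while (items.hasNext()) {
            processor.process(items.next());
        }
        // Input is exhausted; only processors that opted in get the callback.
        if (processor instanceof FinalizableItemProcessor) {
            ((FinalizableItemProcessor) processor).complete();
        }
    }
}
```

        The callback is deliberately not named finalize(), since that would override Object.finalize() and be invoked by the garbage collector as well.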



        • #5
          Thanks for your help!

          I think I would use the approach with JobIdentifierFactory; the criteria can be combined in too many ways to just use properties in a context file.

          Most of the logic for the jobs is already done, and from what I know now it would require too much rewriting to make them fit into the batch framework. The batch-commit issue could hopefully be resolved in a simpler way, and restarting is not really a requirement. Keep it simple...

          /Johan
