
  • Load batch job config at runtime


    I am already using Spring Batch for a limited number of jobs, but I am in the process of transferring all jobs from our custom batch framework to our Spring Batch implementation.

    The problem is that there are close to 50 jobs in our custom framework. If I move them all into our current implementation, the Spring configs for every job will be loaded at batch startup. Not only will this take an obscene amount of time, but it will also put quite a bit of pressure on the heap.

    Is there a way to configure the jobs to run, but load the specific spring configs for that job at runtime? That way, once the job has completed, the context can be closed and objects garbage collected as required.

    My initial solution is to override SimpleJob or FlowJob and more specifically override the doExecute(JobExecution execution) method. This will load the Job specific spring context, set the steps in the parent *Job and then continue with the execution logic.

    My main queries are:
    • Will this work?
    • Does anyone have any better solutions?


  • #2
    There is a JobLoader interface in Spring Batch. It wasn't designed for this use case, but you might be able to make it work (there is a method for loading one ApplicationContext but not for unloading a single one because it wasn't needed yet). I'd be interested to hear how it goes if you try it because I can imagine all sorts of potential memory leaks.


    • #3
      I've been prototyping this all day and I've come up with a solution based on overriding SimpleJob.

      By splitting the <batch:job ...../> definition from the <step....> definitions, I was able to load all jobs (2 at the moment) at startup, while the steps are not loaded until a job instance is actually started.
      By subclassing the SimpleJob, I could create it in the spring context as:

      <bean id="jobA" class="ContextLoadingSimpleJob" parent="dailyJobParent">
          <property name="contexts"> ... </property>
      </bean>
      This is still picked up by the JobRegistrar, so the job is still available for job creation, but without the overhead of having all of its objects created up front.

      Then, once a job instance is created and the job is run, the doExecute method of SimpleJob is overridden in ContextLoadingSimpleJob as:

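          protected void doExecute(JobExecution execution) throws JobInterruptedException, JobRestartException, StartLimitExceededException {
              ConfigurableApplicationContext context = new ClassPathXmlApplicationContext(contexts.toArray(new String[0]), applicationContext);
              setSteps((List<Step>) context.getBean("steps")); // steps come from the job-specific context
              super.doExecute(execution); // continue with the normal SimpleJob execution logic
              context.close(); // release the job-specific beans once the job has finished
          }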
      This places a contract on the Spring contexts configured earlier: they must define a bean named "steps" that is a List<Step>, which ContextLoadingSimpleJob then uses as its steps.

      As I mentioned before, I am only prototyping today so haven't performed any memory analysis on this yet, but once the Job has finished, the context will be closed and everything that isn't being used will hopefully be garbage collected.


      • #4
        Interesting. You wouldn't have to give up the Batch XML namespace features if you used FlowJob and injected a Flow. Does that work?

        You probably want to put that context.close() in a finally block:

        ConfigurableApplicationContext context = ...
        try {
          setSteps((List<Step>) context.getBean("steps"));
          super.doExecute(execution);
        } finally {
          context.close(); // always runs, even if the job throws
        }
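
For readers who want to try the lifecycle discussed so far without a Spring setup, here is a minimal plain-Java sketch of the same idea: build the per-job context only inside execute(), run the steps, and close the context in a finally block. StepContext and LazyContextJob are hypothetical stand-ins invented for this illustration; they are not Spring Batch classes, and a real implementation would use ClassPathXmlApplicationContext and SimpleJob as shown in the posts above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for a job-specific Spring context: it owns the steps and can be closed.
final class StepContext implements AutoCloseable {
    private final List<Runnable> steps;
    private boolean closed;

    StepContext(List<Runnable> steps) {
        this.steps = steps;
    }

    List<Runnable> steps() {
        return steps;
    }

    boolean isClosed() {
        return closed;
    }

    @Override
    public void close() {
        closed = true;
    }
}

// Hypothetical stand-in for the job: it holds only a loader, not the context itself.
final class LazyContextJob {
    private final Supplier<StepContext> contextLoader;

    LazyContextJob(Supplier<StepContext> contextLoader) {
        this.contextLoader = contextLoader;
    }

    void execute() {
        // Nothing job-specific exists until this call.
        StepContext context = contextLoader.get();
        try {
            for (Runnable step : context.steps()) {
                step.run();
            }
        } finally {
            // Runs whether the steps succeed or throw.
            context.close();
        }
    }
}

final class LazyContextJobDemo {
    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        LazyContextJob job = new LazyContextJob(() -> new StepContext(
                Arrays.<Runnable>asList(() -> log.add("step1"), () -> log.add("step2"))));
        job.execute();
        System.out.println("steps run: " + log);
    }
}
```

Because the job keeps no reference to the context after execute() returns, the closed context and its steps become eligible for garbage collection, which is the effect the original poster is after.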


        • #5
          Thanks for the tip. How would I use this by injecting a Flow?


          • #6
            FlowJob has a setFlow() method. You could define a <flow.../> with your step logic in the XML that you load in the overridden doExecute().

            You might need to think about what will happen when two executions happen concurrently. It seems like it would be safe as long as you are prepared to synchronize the execution, but maybe you can be more relaxed.
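
One way to read "synchronize the execution" is to guard the body of the overridden doExecute() with a lock, so that a second execution waits until the first has set its steps, run, and closed its context. The sketch below shows that pattern in plain Java; SerializedExecutor is a hypothetical helper made up for this example, not part of Spring Batch, and the sleeping Runnable merely stands in for a job body.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical helper: lets only one execution body run at a time.
final class SerializedExecutor {
    private final ReentrantLock lock = new ReentrantLock();

    void execute(Runnable body) {
        lock.lock(); // a second caller blocks here until the first has finished
        try {
            body.run();
        } finally {
            lock.unlock();
        }
    }
}

final class SerializedExecutorDemo {
    public static void main(String[] args) throws InterruptedException {
        SerializedExecutor executor = new SerializedExecutor();
        AtomicInteger active = new AtomicInteger();
        AtomicInteger maxActive = new AtomicInteger();

        // Stand-in job body that records how many executions overlap.
        Runnable jobBody = () -> {
            int now = active.incrementAndGet();
            maxActive.accumulateAndGet(now, Math::max);
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            active.decrementAndGet();
        };

        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> executor.execute(jobBody));
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println("max concurrent executions: " + maxActive.get());
    }
}
```

A blocking lock makes later executions queue up. If overlapping launches should be rejected instead of delayed, tryLock() could be used in place of lock(), at the cost of having to handle the rejected launch.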


            • #7
              I think in this context I will want to prevent concurrent execution. As these are batch jobs putting data into a database, they are sometimes required to delete existing data first, which wouldn't play nicely if it was done concurrently. It could be useful for future projects though.

              Thanks for your tips.


              • #8
                I am wondering about the reasons for using JobRegistrar. Is it needed for a "production run" configuration, or for something else?

                In particular, we're struggling with Spring test contexts after adding a JTA manager (it is loaded multiple times, once per test, which of course is not allowed), and it seems the only way to solve it is to load all jobs into JobRegistrar with a top-level @ContextConfiguration for all tests. But revamping our setup to use it is taking time, we're not sure we are on the correct path, and we have the same concern as you about loading all the jobs just to run one (for prod and tests). I'm still looking into how to arrange the config files for tests and prod runs. If you have advice/examples/RTFM, I would really appreciate it! This is my post attempting to ask about this.


                • #9
                  For integration testing I usually don't bother with the AutomaticJobRegistrar (just import the config files needed to get the job to test), but then I don't have to deal with JTA very often. You might need to use @DirtiesContext if you are having trouble with global application contexts interfering with each other in a test suite (hopefully whatever is causing your problem cleans itself up if the context is closed).