Recovering from abnormal shutdown (with a persistent database)

  • Recovering from abnormal shutdown (with a persistent database)

    We have a long-running Spring context that is part of our batch and integration system. If the container that runs the batch features shuts down abnormally while a job is running, calls to jobExplorer.findRunningJobExecutions("jobName") for that job return at least one result once the container is back up.

    We have our own POJO, triggered by cron, that calls jobOperator.startNextInstance("jobName") to start jobs that aren't already running (that is, jobs for which .findRunningJobExecutions(String) returns no results).

    Is there something built into Spring Batch to avoid this, such as setting the end times and results of all "open" jobs when the context starts up?
    Or am I perhaps doing something wrong, and this should happen automatically?

    The environment is Tomcat 7 with a MySQL database for batch (and generic storage for the jobs).

    Edit: to clarify the situation, we have some jobs that run hourly but can take over an hour, so we don't want to start a job again while the previous instance is still running.
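
    The cron-driven check described in this post can be sketched roughly as follows. This is only an illustration under stated assumptions: RunningExecutionLookup, NextInstanceStarter and GuardedJobTrigger are hypothetical stand-ins so the sketch compiles without Spring Batch on the classpath. They are not the framework's interfaces, and the real calls declare checked exceptions that are left out here.

    ```java
    import java.util.Optional;
    import java.util.Set;

    // Hypothetical stand-ins for the two calls named in the post.
    // They are NOT the Spring Batch interfaces.
    interface RunningExecutionLookup {
        Set<Long> findRunningExecutionIds(String jobName);
    }

    interface NextInstanceStarter {
        Long startNextInstance(String jobName);
    }

    class GuardedJobTrigger {
        private final RunningExecutionLookup lookup;
        private final NextInstanceStarter starter;

        GuardedJobTrigger(RunningExecutionLookup lookup, NextInstanceStarter starter) {
            this.lookup = lookup;
            this.starter = starter;
        }

        // Starts the next instance only when the repository reports no running
        // execution. Returns the new execution id, or empty if the job was skipped.
        Optional<Long> triggerIfIdle(String jobName) {
            if (!lookup.findRunningExecutionIds(jobName).isEmpty()) {
                return Optional.empty();
            }
            return Optional.of(starter.startNextInstance(jobName));
        }
    }
    ```

    With a guard like this, an execution left "running" in the database by an abnormal shutdown makes every later trigger skip the job, which is the problem described above.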
    Last edited by deebo; Aug 28th, 2011, 03:04 AM.

  • #2
    This is a topic that crops up frequently on the forum and in JIRA. The answer is in the reference guide, but it is not that prominent, I guess, and the discussion there is not very detailed. The key quote from the reference guide is "there is no way to automate it", but we are always open to suggestions for ways to make it easier to do manually. You can get a lot more from searching this forum or from JIRA, e.g.


    • #3
      I resolved this somewhat naively by creating an ApplicationListener<ContextRefreshedEvent> implementation that runs only once (when the context is started). It loads all the "running" jobs (which in this case are interrupted executions) and sets their end times to new Date() and their return values to UNKNOWN.

      This is nowhere near ideal for most cases, but in our case it works.
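
      A rough sketch of the clean-up logic described above, written against plain Java so it can be read without the framework. ExecutionRecord and StaleExecutionCleaner are hypothetical names, not Spring Batch classes, and the "status" field is an assumption about what "return values" refers to. In a real listener the records would come from the job repository and would have to be saved back after the change.

      ```java
      import java.util.ArrayList;
      import java.util.Date;
      import java.util.List;

      // Hypothetical stand-in for a persisted job execution.
      class ExecutionRecord {
          final long id;
          String status;  // assumed field, e.g. "STARTED", "COMPLETED", "UNKNOWN"
          Date endTime;   // null while the execution still counts as running

          ExecutionRecord(long id, String status, Date endTime) {
              this.id = id;
              this.status = status;
              this.endTime = endTime;
          }
      }

      class StaleExecutionCleaner {
          private boolean alreadyRan = false;

          // Meant to be called from a context-started callback. Only the first call
          // does any work, so later context refreshes cannot close live executions.
          synchronized List<ExecutionRecord> closeStaleExecutions(List<ExecutionRecord> records) {
              List<ExecutionRecord> closed = new ArrayList<>();
              if (alreadyRan) {
                  return closed;
              }
              alreadyRan = true;
              Date now = new Date();
              for (ExecutionRecord record : records) {
                  if (record.endTime == null) {   // still marked as running
                      record.endTime = now;
                      record.status = "UNKNOWN";
                      closed.add(record);
                  }
              }
              return closed;
          }
      }
      ```

      The run-once flag is what keeps this safe within a single JVM. It does nothing to protect executions that are genuinely running in another JVM sharing the same database.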

      Since job starts are asynchronous, I was wondering whether there is any way to know (without relying on the database, which can become inconsistent) if a job is running in the current context. The only thing I could think of is the taskExecutor used to launch the jobs, but that would require some magic that's not really ideal. Is the job state available anywhere other than the database when the jobRepository is backed by a database implementation rather than MapJobRepositoryFactoryBean?


      • #4
        Your solution is, as you point out, naive, and is not going to work for many if not most deployments (e.g. where more than one JVM is, or might be, involved). But if it works for you that's great.

        If you want a reference to running executions in the current context, you can get a JobExecution from the JobLauncher when you run the job. Spring Batch doesn't store them anywhere, but Spring Batch Admin does (internally) in the SimpleJobService, so you could copy that code if you wanted to do something similar.
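
        As a minimal sketch of that idea, assuming you keep the execution handle returned at launch time yourself: the registry below is not the SimpleJobService code, and ExecutionHandle, SimpleHandle and LocalExecutionRegistry are hypothetical stand-ins so the example compiles on its own. With Spring Batch, the JobExecution returned by the launcher would play the role of the handle.

        ```java
        import java.util.Map;
        import java.util.concurrent.ConcurrentHashMap;

        // Hypothetical stand-in for the execution object handed back at launch time.
        interface ExecutionHandle {
            long getId();
            String getJobName();
            boolean isRunning();
        }

        // Trivial handle whose running flag is flipped by hand, for illustration only.
        class SimpleHandle implements ExecutionHandle {
            private final long id;
            private final String jobName;
            volatile boolean running = true;

            SimpleHandle(long id, String jobName) {
                this.id = id;
                this.jobName = jobName;
            }

            public long getId() { return id; }
            public String getJobName() { return jobName; }
            public boolean isRunning() { return running; }
        }

        // In-memory record of executions launched by this context only.
        class LocalExecutionRegistry {
            private final Map<Long, ExecutionHandle> launched = new ConcurrentHashMap<>();

            void register(ExecutionHandle execution) {
                launched.put(execution.getId(), execution);
            }

            // Drops finished executions, then reports whether any remaining one
            // belongs to the given job.
            boolean isRunningLocally(String jobName) {
                launched.values().removeIf(e -> !e.isRunning());
                for (ExecutionHandle e : launched.values()) {
                    if (e.getJobName().equals(jobName)) {
                        return true;
                    }
                }
                return false;
            }
        }
        ```

        Because the registry lives in memory, it starts empty after a restart, so it never reports executions that were interrupted by a crash. It also knows nothing about jobs launched from another JVM.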