support for stopping a job

  • support for stopping a job

    As I migrated from M3 to M4, I noticed that the stop() method was moved from JobLauncher to JobExecution. This has some consequences for me: in M3, the client of the jobLauncher could easily stop a running job by invoking stop() on the launcher.

    In M4, to implement my stop() method in the jobLauncher's client, I need to get a handle on the jobExecution before the job completes, i.e. the SimpleJobLauncher must run the job asynchronously so that it returns a JobExecution (almost) immediately, on which I can invoke stop().

    Is this the only possibility (besides rewriting SimpleJobLauncher)?

  • #2
    Something has to be asynchronous (even in M3). It's the same in M4 as far as I know - you need to inject a TaskExecutor into the SimpleJobLauncher, so that you can spin off the job execution in another thread and get a reference to the JobExecution. If you weren't doing that already in M3, you must have had your own asynchronous wrapper for the JobLauncher.
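To illustrate the point about the executor, here is a minimal sketch in plain Java. It does not use the Spring Batch classes: `Handle` and `Launcher` are hypothetical stand-ins for JobExecution and the launcher, and `java.util.concurrent.Executor` stands in for the injected TaskExecutor. It only shows that the executor alone decides whether run() returns before or after the job has finished.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ExecutorChoiceSketch {

    /** Hypothetical handle returned by run(); stands in for JobExecution. */
    static final class Handle {
        private volatile boolean running = true;

        boolean isRunning() {
            return running;
        }
    }

    /** Hypothetical launcher: the injected Executor decides whether run() blocks. */
    static final class Launcher {
        private final Executor executor;

        Launcher(Executor executor) {
            this.executor = executor;
        }

        Handle run(Runnable job) {
            Handle handle = new Handle();
            executor.execute(() -> {
                try {
                    job.run();
                } finally {
                    handle.running = false; // mark completion even if the job throws
                }
            });
            return handle;
        }
    }

    /** Waits on the latch, restoring the interrupt flag if interrupted. */
    static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Synchronous executor: the job runs in the caller's thread,
        // so the handle is already finished when run() returns.
        Handle sync = new Launcher(Runnable::run).run(() -> { });
        System.out.println("sync still running on return: " + sync.isRunning());

        // Asynchronous executor: the job is held open by a latch, so the
        // handle is still running when run() returns and could be stopped.
        ExecutorService pool = Executors.newSingleThreadExecutor();
        CountDownLatch release = new CountDownLatch(1);
        Handle async = new Launcher(pool).run(() -> awaitQuietly(release));
        System.out.println("async still running on return: " + async.isRunning());

        release.countDown();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println("async running after termination: " + async.isRunning());
    }
}
```

With the synchronous executor the caller never sees a running handle, which is why a stop() that lives on the handle needs either an asynchronous executor or a second thread that already holds the reference.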



    • #3
      The client of the JobLauncher is the main application, similar to the CommandLineJobRunner but with additional support for interruptions: it uses the non-public Sun classes SignalHandler and Signal to register a callback that stops the jobLauncher when the process is interrupted from outside (kill command). So yes, there is another thread, but I do not have to manage it, and the jobLauncher's taskExecutor can still be synchronous.

      In the BatchLauncher main class, I use something of the form:
      Code:
      private void listenToOsInterruption() {
        ...
        Signal.handle(someSignal, new SignalHandler() {
          public void handle(Signal s) {
            jobLauncher.stop();
          }
        });
        ...
      }
      while the main() method of the same class simply invokes jobLauncher.run(). None of the code above is aware of the jobLauncher being synchronous or asynchronous.

      When the Java program executes, it invokes jobLauncher.run(). If I kill the process using a registered signal, the JVM starts a thread in the batch process that eventually invokes the registered handle() method above and stops the jobLauncher.

      Now, to keep this functionality with stop() on JobExecution, I must get a handle on the JobExecution, so jobLauncher.run() has to return immediately, which implies an asynchronous JobLauncher.
      The main application must then wait for the job thread to end before reading and interpreting the status of the jobExecution object. I don't know how to be notified of that event, so I have to wait for some time and then poll the object until isRunning() is false.

      Code:
      public void execute() {
        ...
        // we should immediately return from run()
        // store the jobExecution, so that the listenToOsInterruption can find it
        this.jobExecution = jobLauncher.run(job, jobParameters);
        waitUntilCompleted();
        ...
      }
      
      private void waitUntilCompleted() {
        // synchronization/exception omitted for clarity
        while (true) {
          if (this.jobExecution.isRunning()) {
            this.jobExecution.wait(5000);
          } else {
            return;
          }
        }
      }
      
      private void listenToOsInterruption() {
        ...
        Signal.handle(someSignal, new SignalHandler() {
          public void handle(Signal s) {
            // "this" would refer to the SignalHandler here, so qualify the outer instance
            BatchLauncher.this.jobExecution.stop();
          }
        });
        ...
      }
      So, I was wondering whether there is a more elegant way of doing this. I would be happy if I could configure the JobLauncher to be either synchronous or asynchronous without affecting the main application.

      As I write this post, I notice another issue: if the job runs asynchronously, job.execute(jobExecution) in the taskExecutor's Runnable might actually start some time after the JobExecution has been returned to the main application. If I invoke stop() on it in that window, nothing happens because the jobExecution has not started yet (no stepExecution is registered), and when job.execute(jobExecution) eventually starts, the job runs without being aware that it was asked to stop. At the very least, a flag must be set on the JobExecution to prevent the job from starting if stop() is invoked in between.

      Further, the JobExecution startTime is initialized at instantiation, which in a delayed scenario is some time before the taskExecutor's Runnable executes (I initially thought I could use this value to test whether we are between creation and execution). This is not correct, as the recorded duration will not reflect the exact time spent executing.
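The two issues above (a stop() that arrives before the job starts, and a start time recorded too early) can be sketched in plain Java as follows. This is only an illustration under stated assumptions: `Execution` and `launchTask` are hypothetical names, not the Spring Batch JobExecution or anything in SimpleJobLauncher.

```java
public class StopBeforeStartSketch {

    /** Hypothetical execution handle; not the Spring Batch JobExecution. */
    static final class Execution {
        private volatile boolean stopRequested;
        private volatile boolean executed;
        private volatile Long startTimeMillis; // stays null until work actually begins

        void stop() {
            stopRequested = true;
        }

        boolean isStopRequested() {
            return stopRequested;
        }

        boolean wasExecuted() {
            return executed;
        }

        Long getStartTimeMillis() {
            return startTimeMillis;
        }
    }

    /** Builds the Runnable handed to the executor; it checks the flag before doing any work. */
    static Runnable launchTask(Execution execution, Runnable jobBody) {
        return () -> {
            if (execution.isStopRequested()) {
                return; // stop() arrived between creation and execution: never start
            }
            // Record the start time when the task really runs, not at creation.
            execution.startTimeMillis = System.currentTimeMillis();
            jobBody.run();
            execution.executed = true;
        };
    }

    public static void main(String[] args) {
        Execution execution = new Execution();
        Runnable task = launchTask(execution, () -> System.out.println("job body running"));

        execution.stop(); // the caller stops before the executor gets to the task
        task.run();       // the delayed start finally happens

        System.out.println("executed = " + execution.wasExecuted());         // executed = false
        System.out.println("startTime = " + execution.getStartTimeMillis()); // startTime = null
    }
}
```

The check only closes the window before the job body begins. A stop() that arrives after the check still has to be observed by the job body itself, for example between steps.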



      • #4
        I agree that a stopped JobExecution should never start in the scenario you describe first. I think it used to work properly, but that feature got lost somewhere in the last two milestones. (http://jira.springframework.org/browse/BATCH-344)

        I guess the accuracy of the start time might be a problem, but actually the "real" start time is the start time of the first step, so the data are there if you need them. Should we keep the start date null until the JobExecution actually starts, do you think?

        Your client for the job launcher looks like it never wakes up the JobExecution (wait but no notify), but since I don't follow the bit about the SignalHandler, maybe I'm missing something.



        • #5
          Quote: I guess the accuracy of the start time might be a problem, but actually the "real" start time is the start time of the first step, so the data are there if you need them. Should we keep the start date null until the JobExecution actually starts, do you think?

          I think the JobExecution start time should be set when the TaskExecutor's Runnable actually runs, but there could be a separate time (a kind of launchTime) in JobExecution.

          Quote: Your client for the job launcher looks like it never wakes up the JobExecution (wait but no notify), but since I don't follow the bit about the SignalHandler, maybe I'm missing something.

          You are right, nobody notifies the JobExecution, which is the reason for the wait(timeout) instead of wait(), and for the while(true) loop around it: I have to poll the JobExecution because I am not notified.
          But I think I will override SimpleJob's execute() method to add a notification on the JobExecution object just before returning:
          Code:
          public void execute(JobExecution execution) {
            try {
              super.execute(execution);
            } finally {
              synchronized(execution) {
                execution.notify();
              }
            }
          }
          It's a good thing I can just override it and then configure my context to use the subclass!
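For completeness, here is a minimal sketch of both sides of the wait/notify handshake in plain Java. `Execution`, `markFinished()` and `waitUntilCompleted()` are hypothetical names, not the Spring Batch API. The sketch shows the two details the snippets above leave out: wait() must be called while holding the object's monitor, and the condition must be re-checked in a loop so that a notification sent before the wait (or a spurious wakeup) is handled correctly.

```java
public class WaitNotifySketch {

    /** Hypothetical stand-in for JobExecution; only the running flag matters here. */
    static final class Execution {
        private boolean running = true; // guarded by this object's monitor

        synchronized boolean isRunning() {
            return running;
        }

        /** Called by the job thread just before it returns, e.g. from a finally block. */
        synchronized void markFinished() {
            running = false;
            notifyAll(); // wake every thread waiting in waitUntilCompleted()
        }

        /** Blocks until the job has finished; returns false on timeout or interruption. */
        synchronized boolean waitUntilCompleted(long timeoutMillis) {
            long deadline = System.currentTimeMillis() + timeoutMillis;
            while (running) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                    return false; // timed out while the job was still running
                }
                try {
                    wait(remaining); // releases the monitor while waiting
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            return true;
        }
    }

    public static void main(String[] args) {
        Execution execution = new Execution();
        Thread jobThread = new Thread(() -> {
            try {
                Thread.sleep(100); // pretend to do some work
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                execution.markFinished();
            }
        });
        jobThread.start();

        boolean completed = execution.waitUntilCompleted(5000);
        System.out.println("completed = " + completed);
    }
}
```

Because the flag is checked before waiting, the caller returns immediately if the job has already finished, so the same waiting code works whether the launcher ran the job synchronously or asynchronously.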

          Regarding Signal and SignalHandler: sun.misc.Signal lets you register a callback (a sun.misc.SignalHandler) for a specific signal (SIG), so that the handler executes when the OS process receives that signal.

          Code:
          Signal.handle(new Signal("INT"), new SignalHandler() {
            public void handle(Signal signal) {
              // react to SIGINT, i.e. kill -2 or ctrl-c
            }
          });
          The handle method will be invoked by a JVM internal thread if the process receives SIGINT.
          Unfortunately, these classes are not in the public API, as there is no way to make them portable between different operating systems.
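As a portable alternative that stays within the public API, a JVM shutdown hook (Runtime.addShutdownHook) also runs when the process is interrupted, for example with ctrl-c. It does not let you pick a particular signal, and the JVM exits once the hooks have finished, so the hook should request the stop and then wait for the job to wind down. The sketch below assumes a hypothetical `StoppableJob` class as a stand-in for the running job; it is not the Spring Batch API.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class ShutdownHookSketch {

    /** Hypothetical stand-in for a running job that can be asked to stop. */
    static final class StoppableJob implements Runnable {
        private final int maxChunks;
        private volatile boolean stopRequested;
        private final CountDownLatch finished = new CountDownLatch(1);

        StoppableJob(int maxChunks) {
            this.maxChunks = maxChunks;
        }

        @Override
        public void run() {
            try {
                for (int chunk = 0; chunk < maxChunks && !stopRequested; chunk++) {
                    Thread.sleep(10); // stands in for processing one chunk
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                finished.countDown();
            }
        }

        void stop() {
            stopRequested = true;
        }

        /** Returns true if the job finished within the timeout. */
        boolean awaitFinished(long timeoutMillis) {
            try {
                return finished.await(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    /** Builds the hook thread: request a stop, then give the job time to wind down. */
    static Thread stopOnShutdown(StoppableJob job) {
        return new Thread(() -> {
            job.stop();
            job.awaitFinished(5000);
        }, "stop-job-on-shutdown");
    }

    public static void main(String[] args) {
        StoppableJob job = new StoppableJob(50); // about half a second of pretend work
        Runtime.getRuntime().addShutdownHook(stopOnShutdown(job));
        job.run(); // runs in the main thread; ctrl-c triggers the hook, which stops it early
    }
}
```

Note that the job here runs synchronously in the main thread: the hook thread holds its own reference to the job, so no asynchronous launcher is needed for the stop request to reach it.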

