handling items with ValidationExceptions

  • handling items with ValidationExceptions

    Hi,

    I'm using a RestartableItemProviderTasklet with a FlatFileItemProvider as its itemProvider. This provider throws a ValidationException from the DefaultFlatFileInputSource that it uses when it can't parse a line in my input file that I tokenize.
    Currently, this stops the execution. I can then fix the line that has the error and restart the job. Since I've configured my SimpleStepConfiguration to keep restart data, this works quite nicely: the batch simply proceeds with the chunk where it stopped.

    However, what I would like to do now is configure the batch to continue running when a ValidationException occurs, while getting the opportunity to handle the error myself (for instance, by logging the broken line to an error log and skipping the actual processing). I had a look at the site's page on partial processing (http://static.springframework.org/sp...s/partial.html), but I can't find a good way to configure this in my application.

    Can anyone give me a pointer to where one would enable this behaviour when using a RestartableItemProviderTasklet with a FlatFileItemProvider? It seems like I would need to write or configure some ExceptionHandler or use the Skippable interface somewhere, but I can't figure out the details.

    Thanks in advance,

    Joris

  • #2
    Joris,

    This is actually something I've been meaning to address, but wasn't able to get to before Milestone 2 was released. It's a combination of a few issues that makes skipping records a bit weird.

    The first issue is that file input sources need to be refactored:

    http://opensource.atlassian.com/proj...owse/BATCH-140

    Essentially, because the FieldSetMapper is outside of the input source itself, in the case of a parsing exception (some issue in fieldset.read*) there is no way to access the original line. This is less of an issue with validation exceptions though, since the line was correctly parsed into an object, which can be packaged up with the exception.

    In order to keep the job from failing, you need to add an ExceptionHandler that will log the error, but not rethrow the exception. The problem in this scenario is that if you want to write the record out to the database, rather than a log file, then you need to do so in a separate transaction, unless you're okay with reprocessing the bad record again should a rollback occur. We originally targeted the Recoverable interface for this, and if you look at the DefaultStepExecutor, you can see where it calls recoverable before passing the exception on to the template. This technically works, but it's a bit ugly right now.

    The final issue is that the skipLimit that is settable in StepConfiguration has no way of being used right now. It should be used as part of a RethrowOnThresholdExceptionHandler; however, the StepExecutorFactory would need to be modified to take the skip limit and pass it in as a parameter to the handler itself.
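
    To illustrate what a skip-limit handler does, here is a minimal, self-contained sketch. It does not use the Spring Batch API: SketchThresholdHandler and its methods are hypothetical names made up for this example, and it only shows the idea of swallowing a given exception type until a limit is exceeded and rethrowing from then on.

```java
/**
 * Hypothetical stand-in for a threshold-style exception handler.
 * This is NOT the Spring Batch RethrowOnThresholdExceptionHandler;
 * it only sketches the counting logic a skip limit implies.
 */
class SketchThresholdHandler {

    private final Class<? extends RuntimeException> type;
    private final int skipLimit;
    private int count = 0;

    SketchThresholdHandler(Class<? extends RuntimeException> type, int skipLimit) {
        this.type = type;
        this.skipLimit = skipLimit;
    }

    /**
     * Swallows exceptions of the configured type until more than
     * skipLimit of them have been seen; everything else is rethrown.
     */
    void handle(RuntimeException e) {
        if (!type.isInstance(e)) {
            throw e; // unrelated exceptions always propagate
        }
        count++;
        if (count > skipLimit) {
            throw e; // limit exceeded: let the job fail
        }
        // within the limit: swallow, so the caller can skip the record
    }

    int getCount() {
        return count;
    }
}
```

    With a limit of 2, the first two matching exceptions are swallowed and the third is rethrown.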

    With all that being said, in your own Tasklet, you could implement the same functionality, by simply catching exceptions, logging them out, never throwing them up to the template, and calling inputSource.skip, so you don't get the same record again on rollback.
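
    The catch, log and skip approach described above could look roughly like the following. This is a sketch under stated assumptions, not Spring Batch code: SketchInputSource, ListInputSource and SkippingTasklet are hypothetical stand-ins written only for this example, and the "BAD" prefix is an invented marker for an unparseable line.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for an input source that supports skipping. */
interface SketchInputSource {
    /** Returns the next record, or null at end of input; may throw on a bad record. */
    String read();

    /** Marks the current record as skipped so it is not returned again. */
    void skip();
}

/** List-backed source that rejects lines starting with "BAD" (an invented convention). */
class ListInputSource implements SketchInputSource {
    private final Iterator<String> lines;
    private int skipped = 0;

    ListInputSource(List<String> lines) {
        this.lines = lines.iterator();
    }

    public String read() {
        if (!lines.hasNext()) {
            return null;
        }
        String line = lines.next();
        if (line.startsWith("BAD")) {
            throw new IllegalArgumentException("cannot parse: " + line);
        }
        return line;
    }

    public void skip() {
        skipped++;
    }

    int getSkipped() {
        return skipped;
    }
}

/** Catches input errors, records them, and never lets them reach the caller. */
class SkippingTasklet {
    private final SketchInputSource inputSource;
    private final List<String> processed = new ArrayList<>();
    private final List<String> errors = new ArrayList<>();

    SkippingTasklet(SketchInputSource inputSource) {
        this.inputSource = inputSource;
    }

    /** Handles one record; returns false once the input is exhausted. */
    boolean execute() {
        try {
            String item = inputSource.read();
            if (item == null) {
                return false;
            }
            processed.add(item); // stands in for the real processing
        } catch (RuntimeException e) {
            errors.add(e.getMessage()); // stands in for writing to an error log
            inputSource.skip();         // so the bad record is not seen again
        }
        return true;
    }

    List<String> getProcessed() {
        return processed;
    }

    List<String> getErrors() {
        return errors;
    }
}
```

    Calling execute() in a loop until it returns false processes every good line, records one error per bad line, and calls skip() once for each of them.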



    • #3
      For a Batch workshop that I gave last week, I've created a subclass of the RestartableItemProviderTasklet that lets you use an ExceptionHandler to handle exceptions thrown by the InputProvider and skips the item if the exception is handled successfully. I'm including it here for others who need the same functionality using 1.0M2. The usage would be something like this:
      Code:
          <bean id="myTasklet" class="com.i21.batchsample.support.ExceptionHandlingItemProviderTasklet">
              <property name="itemProvider" ref="myProvider"/>
              <property name="itemProcessor" ref="myProcessor"/>
              <property name="exceptionHandler">
                  <bean class="org.springframework.batch.repeat.exception.handler.SimpleLimitExceptionHandler">
                      <property name="type" value="org.springframework.batch.io.exception.ValidationException"/>
                      <property name="limit" value="2"/>
                  </bean>
              </property>
          </bean>
      This will skip at most two invalid items before failing the job on ValidationExceptions. Here's the code:
      Code:
      package com.i21.batchsample.support;
      
      import java.util.Collections;
      
      import org.apache.commons.logging.Log;
      import org.apache.commons.logging.LogFactory;
      import org.springframework.batch.execution.tasklet.ItemProviderProcessTasklet;
      import org.springframework.batch.execution.tasklet.RestartableItemProviderTasklet;
      import org.springframework.batch.io.Skippable;
      import org.springframework.batch.repeat.ExitStatus;
      import org.springframework.batch.repeat.RepeatContext;
      import org.springframework.batch.repeat.exception.handler.DefaultExceptionHandler;
      import org.springframework.batch.repeat.exception.handler.ExceptionHandler;
      import org.springframework.batch.repeat.synch.RepeatSynchronizationManager;
      import org.springframework.util.ClassUtils;
      
      /**
       * Deals with Exceptions of a given type from the InputProvider, so processing 
       * can continue normally, by using a configured ExceptionHandler.
       * This class should be considered a workaround: the final version of Spring Batch
       * will definitely make it easier and more straightforward to deal with
       * Exceptions thrown by the InputProvider.
       * 
       * @author Joris Kuipers
       *
       */
      public class ExceptionHandlingItemProviderTasklet extends RestartableItemProviderTasklet {
          
          private ExceptionHandler exceptionHandler = new DefaultExceptionHandler();
          private boolean skipOnHandling = true;
          private final Log log = LogFactory.getLog(getClass());
          
          /**
           * The ExceptionHandler to use to handle the Exception from the ItemProvider.
           * Defaults to a DefaultExceptionHandler.
           * @param exceptionHandler
           */
          public void setExceptionHandler(ExceptionHandler exceptionHandler) {
              this.exceptionHandler = exceptionHandler;
          }
          
          /**
           * Whether to call skip() on the InputProvider after handling an Exception.
           * Defaults to true.
           * @param skipOnHandling
           */
          public void setSkipOnHandling(boolean skipOnHandling) {
              this.skipOnHandling = skipOnHandling;
          }
          
          public ExitStatus execute() throws Exception {
              try {
                  return super.execute();
              } catch (Exception e) {
                  RepeatContext context = RepeatSynchronizationManager.getContext();
                  if (originatesFromProvider(context, e)) {
                      exceptionHandler.handleExceptions(context, Collections.singleton(e));
                      // the Exception was handled successfully
                      log.info("Handled " + ClassUtils.getShortName(e.getClass())
                              + ": " + e.getMessage());
                      if (skipOnHandling && this.itemProvider instanceof Skippable) {
                          ((Skippable) itemProvider).skip();
                      }
                      return ExitStatus.CONTINUABLE;
                  }
                  // rethrow all other Exceptions
                  throw e;
              }
          }
          
          /**
           * Determines if an Exception thrown by calling super.execute() from our execute()
           * implementation originated in the ItemProvider (and not the ItemProcessor).
           * Depends on the implementation detail that there won't be an item 
           * in the RepeatContext in that case.
           * @param context current RepeatContext
           * @param e the thrown Exception, can be used by overriding methods
           * @return true if it came from the ItemProvider
           */
          protected boolean originatesFromProvider(RepeatContext context, Exception e) {
              // unfortunately, ItemProviderProcessTasklet.ITEM_KEY is private. We use the same definition.
              return !context.hasAttribute(ItemProviderProcessTasklet.class + ".ITEM");
          }
          
      }
