Partitioned step restart Bug?

  • Partitioned step restart Bug?

    We noticed that if a partitioner fails in the partition step (before the partitions are completely set up), restart is broken because the SimpleStepExecutionSplitter does not call the partition method again.


    Code:
    private Map<String, ExecutionContext> getContexts(StepExecution stepExecution, int gridSize) {
    
    		ExecutionContext context = stepExecution.getExecutionContext();
    		String key = SimpleStepExecutionSplitter.class.getSimpleName() + ".GRID_SIZE";
    
    		// If this is a restart we must retain the same grid size, ignoring the
    		// one passed in...
    		int splitSize = (int) context.getLong(key, gridSize);
    		context.putLong(key, splitSize);
    
    		Map<String, ExecutionContext> result;
    		// NOTE (poster): on restart the context appears "clean" here, apparently
    		// because the putLong above turns it from dirty to clean.
    		if (context.isDirty()) {
    			// The context changed so we didn't already know the partitions
    			jobRepository.updateExecutionContext(stepExecution);
    			result = partitioner.partition(splitSize);
    		}
    		else {
    			// NOTE (poster): execution reaches this branch on restart even though
    			// partition(splitSize) never completed successfully.
    			if (partitioner instanceof PartitionNameProvider) {
    				result = new HashMap<String, ExecutionContext>();
    				Collection<String> names = ((PartitionNameProvider) partitioner).getPartitionNames(splitSize);
    				for (String name : names) {
    					/*
    					 * We need to return the same keys as the original (failed)
    					 * execution, but the execution contexts will be discarded
    					 * so they can be empty.
    					 */
    					result.put(name, new ExecutionContext());
    				}
    			}
    			else {
    				// If no names are provided, grab the partition again.
    				result = partitioner.partition(splitSize);
    			}
    		}
    
    		return result;
    	}
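
The sequence described above can be reproduced without Spring Batch. This is only a minimal sketch under stated assumptions: `FakeContext` is a hypothetical stand-in that models nothing but a dirty flag set when a put changes a stored value, and `store` stands in for whatever the job repository persisted. Neither is the real Spring Batch API, and the return strings are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an execution context (NOT the Spring Batch class).
// Assumption: a put marks the context dirty only when it changes a value.
class FakeContext {
    private final Map<String, Long> values = new HashMap<>();
    private boolean dirty = false;

    FakeContext(Map<String, Long> persisted) {
        values.putAll(persisted); // state loaded from the store starts out clean
    }

    long getLong(String key, long defaultValue) {
        return values.getOrDefault(key, defaultValue);
    }

    void putLong(String key, long value) {
        Long previous = values.put(key, value);
        dirty = previous == null || previous != value;
    }

    boolean isDirty() {
        return dirty;
    }

    Map<String, Long> snapshot() {
        return new HashMap<>(values);
    }
}

class RestartSketch {
    static final String KEY = "GRID_SIZE";

    // Mirrors the branch structure of getContexts in the post.
    // Returns "partition" if the partitioner would be asked to partition,
    // "namesOnly" if only the partition names would be requested.
    static String split(Map<String, Long> store, int gridSize, boolean partitionFails) {
        FakeContext context = new FakeContext(store);
        long splitSize = context.getLong(KEY, gridSize);
        context.putLong(KEY, splitSize);

        if (context.isDirty()) {
            // The grid size is persisted BEFORE partitioning is attempted.
            store.putAll(context.snapshot());
            if (partitionFails) {
                throw new IllegalStateException("partitioner failed");
            }
            return "partition";
        }
        return "namesOnly";
    }
}
```

Calling `RestartSketch.split(store, 4, true)` on an empty `store` throws, but leaves the grid size behind in `store`. A second call with the same `store` then takes the clean branch and never partitions, which is the behaviour reported in this post.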

  • #2
    We fixed this by implementing our own splitter that tracks if the partition was successful or not:

    Code:
     private Map<String, ExecutionContext> getContexts(StepExecution stepExecution, int gridSize) {
    
               ExecutionContext context = stepExecution.getExecutionContext();
               String key = SimpleStepExecutionSplitter.class.getSimpleName() + ".GRID_SIZE";
    
               // If this is a restart we must retain the same grid size, ignoring the
               // one passed in...
               int splitSize = (int) context.getLong(key, gridSize);
               context.putLong(key, splitSize);
    
               boolean partitionSuccess = false;
               if (context.containsKey("partitionSuccess")) {
                   partitionSuccess = (Boolean)context.get("partitionSuccess");
               }
    
               Map<String, ExecutionContext> result;
               if (context.isDirty() || !partitionSuccess) {
                   // Either the context changed or the previous partitioning never
                   // completed, so we don't already know the partitions
                   jobRepository.updateExecutionContext(stepExecution);
                   result = partitioner.partition(splitSize);
               }
               else {
                   if (partitioner instanceof PartitionNameProvider) {
                       result = new HashMap<String, ExecutionContext>();
                       Collection<String> names = ((PartitionNameProvider) partitioner).getPartitionNames(splitSize);
                       for (String name : names) {
                           /*
                            * We need to return the same keys as the original (failed)
                            * execution, but the execution contexts will be discarded
                            * so they can be empty.
                            */
                           result.put(name, new ExecutionContext());
                       }
                   }
                   else {
                       // If no names are provided, grab the partition again.
                       result = partitioner.partition(splitSize);
                   }
               }
    
               context.put("partitionSuccess", true);
               return result;
           }
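
The effect of the extra flag can be checked in isolation. This is a rough sketch, not the splitter itself: `store` is a hypothetical plain map standing in for the persisted step execution context, and the sketch writes the flag to `store` explicitly after a successful partition. Whether the code above persists the flag depends on when the step execution context is next saved, which the post does not show.

```java
import java.util.HashMap;
import java.util.Map;

class FlaggedSplitSketch {
    static final String GRID_KEY = "GRID_SIZE";
    static final String SUCCESS_KEY = "partitionSuccess";

    // Returns "partition" if the partitioner would be asked to partition,
    // "namesOnly" if only the partition names would be requested.
    // 'store' is a hypothetical stand-in for the persisted context.
    static String split(Map<String, Object> store, int gridSize, boolean partitionFails) {
        Map<String, Object> context = new HashMap<>(store);

        // Simplified dirty check: the context changes only if the key is new.
        boolean dirty = !context.containsKey(GRID_KEY);
        context.putIfAbsent(GRID_KEY, (long) gridSize);

        // Null-safe read: a missing flag counts as "not successful".
        boolean partitionSuccess = Boolean.TRUE.equals(context.get(SUCCESS_KEY));

        if (dirty || !partitionSuccess) {
            store.putAll(context); // grid size is persisted first
            if (partitionFails) {
                throw new IllegalStateException("partitioner failed");
            }
            // Assumption: the flag is persisted once partitioning succeeds.
            store.put(SUCCESS_KEY, true);
            return "partition";
        }
        return "namesOnly";
    }
}
```

With an empty `store`, a failing first call leaves only the grid size behind, so the next call still returns "partition". Only after a successful partition does a later call fall through to "namesOnly". `Boolean.TRUE.equals(...)` is used here as a compact alternative to the `containsKey` check plus cast.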
    Last edited by mminella; Apr 8th, 2013, 03:27 PM. Reason: Formatting



    • #3
      Can you please open a Jira issue for this?



      • #4
        Oooeeee, my first JIRA...I'll never be the same.

        BATCH-1992
