
  • Batch Processing

    Hi. For our company I was implementing a batch job with Spring.
    I have a reader, a processor and a writer.
    The reader reads from an XML file.
    The processor finds the matching objects in the db and applies the changes from the XML.
    The writer saves to the db.

    Everything is good and working, but in the processor I am hitting the db for every single item. I need to somehow run the processor the way the writer is run: once per big chunk.

    I am surprised that the framework does not allow this, since it is a common use case.

    How can I do that, i.e. process a batch of items at once rather than every single item, when the reader returns one item at a time?

    Thanks.
    Last edited by elbek; Feb 11th, 2013, 05:14 PM. Reason: comma

  • #2
    The way chunk-based processing is implemented in Spring Batch, the ItemReader and ItemProcessor are executed once per item and the ItemWriter is executed once per chunk. This is because the typical area for optimization is the writing. That being said, there are a number of things you can do to limit how often the database is hit from an ItemProcessor. Options I've used in the past include:
    1. File sorting: Sort the file by the key that needs to be retrieved from the database. Whenever that key changes, you query the database (this concept is called a control break). In this case, you hit the database only once each time the control value changes.
    2. Cache database values: If the dataset is small enough, something like Ehcache can be used to cache database results in memory so that repeated lookups don't hit the database again (see the sketch after this list).
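
    To illustrate option 2, here is a minimal sketch of a caching processor. XmlRecord, Customer and CustomerDao are illustrative names, not from the original post, and the plain HashMap assumes a single-threaded step; an Ehcache cache could be dropped in instead.

    Code:
    import java.util.HashMap;
    import java.util.Map;

    import org.springframework.batch.item.ItemProcessor;

    public class CachingLookupProcessor implements ItemProcessor<XmlRecord, Customer> {

    	// Hypothetical DAO for the db lookup; replace with your own repository.
    	private final CustomerDao dao;

    	// In-memory cache of rows fetched so far.
    	private final Map<String, Customer> cache = new HashMap<String, Customer>();

    	public CachingLookupProcessor(CustomerDao dao) {
    		this.dao = dao;
    	}

    	public Customer process(XmlRecord item) throws Exception {
    		Customer customer = cache.get(item.getId());
    		if (customer == null) {
    			// Hit the database only on a cache miss.
    			customer = dao.findById(item.getId());
    			cache.put(item.getId(), customer);
    		}
    		customer.applyChangesFrom(item); // update the entity from the XML record
    		return customer;
    	}
    }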

    Another option would be to restructure your item to be an aggregation of multiple of the items you currently have, and commit after each aggregate. This requires you to write a fair amount of extra code to handle the aggregation on the read, the looping in the processing, and the line aggregation in the write...but it would probably work. The reader side of that idea is sketched below.
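
    A sketch of that aggregating reader, assuming a generic delegate reader (AggregatingReader and groupSize are illustrative names, not a Spring Batch API):

    Code:
    import java.util.ArrayList;
    import java.util.List;

    import org.springframework.batch.item.ItemReader;

    // Wraps the real reader so each "item" the framework sees is a group of
    // items. The processor then receives the whole group and can make a single
    // database query for it; the writer unpacks the groups again.
    public class AggregatingReader<T> implements ItemReader<List<T>> {

    	private final ItemReader<T> delegate;
    	private final int groupSize;

    	public AggregatingReader(ItemReader<T> delegate, int groupSize) {
    		this.delegate = delegate;
    		this.groupSize = groupSize;
    	}

    	public List<T> read() throws Exception {
    		List<T> group = new ArrayList<T>();
    		T item;
    		while (group.size() < groupSize && (item = delegate.read()) != null) {
    			group.add(item);
    		}
    		return group.isEmpty() ? null : group; // null signals end of input
    	}
    }

    Note that the commit interval then counts groups rather than individual items, and restartability needs extra care because the delegate's state is hidden from the framework.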



    • #3
      Exactly what I used to look up keys in the database:

      Code:
      import java.util.Map;

      import org.slf4j.Logger;
      import org.slf4j.LoggerFactory;
      import org.springframework.batch.core.ChunkListener;
      import org.springframework.batch.core.listener.ItemListenerSupport;
      import org.springframework.beans.factory.InitializingBean;
      import org.springframework.util.Assert;

      // Collects the key of every item as it is read so that the database
      // lookup can be batched once per chunk instead of once per item.
      public class IdCountListener extends ItemListenerSupport<Map<String, Object>, Map<String, Object>> implements ChunkListener, InitializingBean {

      	// IdKeyLookup and UUIDConverter are project-specific classes not shown in the post.
      	private IdKeyLookup lookup;

      	private Logger logger = LoggerFactory.getLogger(getClass());

      	public void setLookup(IdKeyLookup lookup) {
      		this.lookup = lookup;
      	}

      	// Called once per item read: remember its key for the batched lookup.
      	public void afterRead(Map<String, Object> item) {
      		lookup.addKey(UUIDConverter.getUUID(item.toString()).toString());
      	}

      	// Called before each chunk starts: drop the previous chunk's keys.
      	public void beforeChunk() {
      		lookup.reset();
      	}

      	public void afterChunk() {
      	}

      	public void afterPropertiesSet() throws Exception {
      		Assert.notNull(lookup, "an IdKeyLookup is required");
      	}

      }
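
      This works because Spring Batch reads the whole chunk before processing any item in it, so by the time the ItemProcessor sees the first item, the lookup already holds every key in the chunk and can resolve them all with one query. IdKeyLookup itself isn't shown above; a hypothetical shape for it (the names here are illustrative, not from the original post) could be:

      Code:
      import java.util.Map;

      public interface IdKeyLookup {

      	// Called from afterRead: remember one key for the current chunk.
      	void addKey(String key);

      	// Called from beforeChunk: forget the previous chunk's keys.
      	void reset();

      	// Called by the ItemProcessor: run a single "WHERE key IN (...)" query
      	// for all collected keys and return the results keyed by id.
      	Map<String, Object> fetchAll();
      }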
