Writing a Spring Batch FileReader with a Header and Footer

  • Writing a Spring Batch FileReader with a Header and Footer

    OK, newbie question here. I'm trying to write a simple FlatFileItemReader where each input file has a header line, several body lines, and a footer line. I configured a PatternMatchingCompositeLineMapper for the reader, since the header, footer, and body lines begin with a different sequence of characters, and I couldn't find a way for the reader to "skip" the first and last line of the input to process the body of the file. There may be another way to do this, but I haven't found an example of it yet.

    Assuming that this is the way to do it, my first question is this: If the step has a reader, processor, and writer like the following:

    Code:
    <batch:job id="job" job-repository="jobRepository">
    	<batch:step id="step">
    		<batch:tasklet transaction-manager="transactionManager">
    			<batch:chunk reader="reader" processor="processor" 
    				 writer="writer" commit-interval="1"/>
    		</batch:tasklet>		
    	</batch:step>
    </batch:job>
    and the reader is like the following:

    Code:
    <bean id="reader" class="org.springframework.batch.item.file.FlatFileItemReader">
    	<property name="resource" ... />
    	<property name="lineMapper" ref="mapper" />
    </bean>
    and the mapper is like the following (assuming the first line starts with 000, all body lines start with 001, and the last line starts with 002):

    Code:
    <bean id="mapper"	
    class="org.springframework.batch.item.file.mapping.PatternMatchingCompositeLineMapper">
    <property name="tokenizers">
    	<map>
    		<entry key="000*" value-ref="headerTokenizer" />
    		<entry key="001*" value-ref="bodyTokenizer" />
    		<entry key="002*" value-ref="footerTokenizer" />
    	</map>
    </property>
    <property name="fieldSetMappers">
    	<map>
    		<entry key="000*" value-ref="headerMapper" />
    		<entry key="001*" value-ref="bodyMapper" />
    		<entry key="002*" value-ref="footerMapper" />
    	</map>
    </property>
    </bean>
    and your processor is also a CompositeItemProcessor, like the following:

    Code:
    <bean id="processor" 
      class="org.springframework.batch.item.support.CompositeItemProcessor">
    	<property name="itemProcessors">
    		<list>
    			<bean class="com...HeaderProcessor"/>
    			<bean class="com...BodyProcessor"/>
    			<bean class="com...FooterProcessor"/>
    		</list>
    	</property>
    </bean>
    I'm assuming from the above that I have to use a CompositeItemWriter, and that the processors and writers are typed by the bean that each tokenizer and fieldSetMapper returns, in the form of:

    Code:
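    public class HeaderFieldMapper implements FieldSetMapper<InputFileHeader> {
    
       @Override
       public InputFileHeader mapFieldSet(FieldSet fields) {
          // Mapping logic not shown in the post; build and return an InputFileHeader here.
          throw new UnsupportedOperationException("mapping not shown");
       }
    }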
    If you're still with me, that would mean that the PatternMatchingCompositeLineMapper.mapLine() method returns objects of different types to the reader. Can I infer from this that the processor and writer (themselves composites) choose which delegate processor/writer to call based on the type that mapLine() returns?

    I would really appreciate it if someone could tell me if this approach is sound. This is my first time around with Batch, and the learning curve (and lack of starting up docs) is a bit of a problem.

    Thanks!
    --
    Brian Gardner
    Java Architect, Verizon Wireless

  • #2
    As you will see in the source code, the CompositeItemProcessor works by passing the item through a series of delegate ItemProcessors, one after another. By contrast, you need an ItemProcessor that passes the item into *one* of the delegate ItemProcessors. To do this, you will need to write a custom composite ItemProcessor like this:
    Code:
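    import org.springframework.batch.item.ItemProcessor;
    import org.springframework.beans.factory.InitializingBean;
    import org.springframework.util.Assert;
    
    public class MyItemProcessor implements ItemProcessor<Object, Object>, InitializingBean {
    
        // one delegate per item type produced by the line mapper
        private ItemProcessor<TypeA, TypeA> typeAProcessor;
        private ItemProcessor<TypeB, TypeB> typeBProcessor;
        private ItemProcessor<TypeC, TypeC> typeCProcessor;
    
        // route the item to exactly one delegate, chosen by its runtime type
        public Object process(Object item) throws Exception {
            if (item instanceof TypeA) {
                return typeAProcessor.process((TypeA) item);
            }
            else if (item instanceof TypeB) {
                return typeBProcessor.process((TypeB) item);
            }
            else if (item instanceof TypeC) {
                return typeCProcessor.process((TypeC) item);
            }
            throw new RuntimeException("cannot handle: " + item);
        }
    
        // fail fast at startup if a delegate was not injected
        public void afterPropertiesSet() throws Exception {
            Assert.notNull(typeAProcessor, "typeAProcessor must be set");
            Assert.notNull(typeBProcessor, "typeBProcessor must be set");
            Assert.notNull(typeCProcessor, "typeCProcessor must be set");
        }
    
        //setters omitted
    }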
    Last edited by DHGarrette; Jun 15th, 2009, 11:48 AM.
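
The difference between chaining and routing described in this reply can be sketched in plain Java. This is only an illustration under stated assumptions: `Processor`, `ChainingProcessor`, and `RoutingProcessor` are hypothetical stand-ins written for this sketch so that it runs without Spring Batch on the classpath; they are not the library's classes and leave out details the real CompositeItemProcessor may handle.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an item processor, so the sketch has no library dependency.
interface Processor<I, O> {
    O process(I item);
}

// Chaining: every delegate runs, each one receiving the previous delegate's output.
class ChainingProcessor implements Processor<Object, Object> {
    private final List<Processor<Object, Object>> delegates;

    ChainingProcessor(List<Processor<Object, Object>> delegates) {
        this.delegates = delegates;
    }

    public Object process(Object item) {
        Object result = item;
        for (Processor<Object, Object> delegate : delegates) {
            result = delegate.process(result);
        }
        return result;
    }
}

// Routing: exactly one delegate runs, chosen by the item's runtime type.
class RoutingProcessor implements Processor<Object, Object> {
    private final Map<Class<?>, Processor<Object, Object>> routes = new LinkedHashMap<>();

    void register(Class<?> type, Processor<Object, Object> delegate) {
        routes.put(type, delegate);
    }

    public Object process(Object item) {
        for (Map.Entry<Class<?>, Processor<Object, Object>> route : routes.entrySet()) {
            if (route.getKey().isInstance(item)) {
                return route.getValue().process(item);
            }
        }
        throw new IllegalArgumentException("cannot handle: " + item);
    }
}
```

With a header, body, and footer type each registered on the router, a header item only ever reaches the header delegate, which is the behavior the original question was hoping the stock composite would provide.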



    • #3
      Dan,

      Thanks for the help here. It didn't really occur to me that the Composite runs a succession of processors on whatever is passed to it, but now it makes total sense. In the case where I am processing a file with a header line and footer line (which are really being ignored for the most part), is a CompositeLineMapper really the right approach? For this particular app (just a POC at this point) something simpler would be great.

      Is there a better pattern for this?

      Thanks again!

      Brian



      • #4
        The FlatFileItemReader has a "linesToSkip" property and a "skippedLinesCallback" property that, when used in combination, process the first n lines using the callback instead of the regular process. You could use this for your header, but it wouldn't help you with the footer.
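
The behavior described here (the first n lines go to a callback, everything else is read normally) can be illustrated with a small plain-Java sketch. It is not Spring Batch code: `HeaderSkippingLines` and `readBody` are hypothetical names, and the input is a String rather than a resource, so that the example runs on its own.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper that mimics "skip the first n lines, but show them to a callback".
class HeaderSkippingLines {

    // Sends the first linesToSkip lines to headerCallback and returns all remaining lines.
    static List<String> readBody(String content, int linesToSkip, Consumer<String> headerCallback) {
        List<String> body = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(content))) {
            String line;
            int lineNumber = 0;
            while ((line = reader.readLine()) != null) {
                if (lineNumber++ < linesToSkip) {
                    headerCallback.accept(line);   // header line: callback only
                } else {
                    body.add(line);                // everything else, including any footer
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return body;
    }
}
```

Note that a trailing footer line still comes back as part of the body, which is why this approach alone does not solve the footer half of the problem.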



        • #5
          What you could do is write your own reader, using FlatFileItemReader as a template. The only thing you would need to change is readLine(), so that the footer row is omitted. The FlatFileItemReader already has an option to skip the header (the linesToSkip property).



          • #6
            Actually, the footer has something useful that I can at least log (the number of entries and the dollar amount total in the file). I implemented the processor as you (Dan) indicated, and now I come to the writer. Since the write() method takes a List, which (depending on the chunk) could contain header, body, or footer elements, it seems I might have to iterate the list to see what kind of item I have here, like:

            Code:
            private ItemWriter<MyBody> bodyWriter;
                
            public void write(List<? extends Object> items) throws Exception {
            
              for (Object item : items) {
                if (item instanceof MyHeader) {
                   MyHeader hdr = (MyHeader) item;
                   // ignore or log
                } else if (item instanceof MyBody) {
                   // does not compile as written: ItemWriter.write() expects a List, not a single item
                   bodyWriter.write((MyBody) item);
                } else if (item instanceof MyFooter) {
                  MyFooter footer = (MyFooter) item;
                  log.info("Footer contains : " + footer.toString());
                } else {
                  throw new RuntimeException("ItemWriter cannot handle: " + item);  
                }
              }
            }
            The problem here is that bodyWriter.write() would have to accept a single item rather than a List, which would undo things a bit... Any thoughts?

            Brian



            • #7
              You can build a list for each item type and pass those lists to the delegate writers:

              Code:
              public void write(List<? extends Object> items) throws Exception {
               
                // collect the body items so the delegate receives them in one call
                List<MyBody> bodyList = new ArrayList<MyBody>(); 
                
                for (Object item : items) {
                  if (item instanceof MyHeader) {
                     MyHeader hdr = (MyHeader) item;
                     // ignore or log
                  } else if (item instanceof MyBody) {
                     bodyList.add((MyBody) item);
                  } else if (item instanceof MyFooter) {
                    MyFooter footer = (MyFooter) item;
                    log.info("Footer contains : " + footer.toString());
                  } else {
                    throw new RuntimeException("ItemWriter cannot handle: " + item);  
                  }
                }
              
                // write once, after the whole chunk has been sorted by type
                bodyWriter.write(bodyList);
              }



              • #8
                Dan,

                I thought of that, but I didn't want to incur the overhead of creating and adding to a List. For now, that's probably the only way for this POC. I'll think of something more clever later.

                Thanks for all your help! You've been awesome!

                Brian



                • #9
                  To whom it may concern,
                  I have a similar problem with readers, but I have no pattern to distinguish the header and footer lines from the main body. I only know the number of lines at the start of the source (headers) and at the end of the source (footers).
                  I could use "linesToSkip" and "skippedLinesCallback" to handle the headers. To handle the footers in the same manner, however, I had to write an alternative "FlatFileItemReader" that has "footerLinesToSkip" and "footerSkippedLinesCallback" properties.
                  Specifically, I added a FIFO (First In, First Out) buffer the size of the footer to the class. Here are the details of that buffer:


                  Code:
                  import org.apache.commons.collections.*;
                  ...
                  int footerLinesToSkip ;
                  ...
                  Buffer footerBuffer = BufferUtils.synchronizedBuffer(new BoundedFifoBuffer(footerLinesToSkip));
                  I fill the buffer in the "doOpen" method without increasing "lineCount". For each read request, the "doRead" method removes one line from the buffer and returns it as the line read, and then the next line of the resource is added to the buffer.
                  The buffer therefore always holds the next "footerLinesToSkip" lines of the resource. When the next line is null, "doRead" stops removing items from the buffer and returns null to indicate the end of the input body, and the "handleFooter" method is called:
                  Code:
                  private void handleFooter() {
                      if (footerSkippedLinesCallback == null) {
                          return;
                      }
                      // the lines still in the buffer are the footer; hand them to the callback in order
                      for (int i = 0; i < footerLinesToSkip; i++) {
                          String footerLine = (String) footerBuffer.remove();
                          footerSkippedLinesCallback.handleLine(footerLine);
                      }
                  }
                  I am new to Spring Batch, so it is possible that I have reinvented the wheel. If there is a better solution in Spring Batch, I would appreciate someone pointing me to it.

                  Yours sincerely
                  Atefeh Zareh
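
The look-ahead buffer idea in this post can be sketched in plain Java without Spring Batch or Commons Collections. This is a minimal illustration under stated assumptions, not the poster's reader: `FooterAwareLineReader` is a hypothetical class, it uses `java.util.ArrayDeque` in place of `BoundedFifoBuffer`, it reads from a String rather than a resource, and it fills the buffer lazily inside `read()` instead of in a separate open step.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Consumer;

// Hypothetical reader: yields body lines and holds back the last footerLinesToSkip lines.
class FooterAwareLineReader implements AutoCloseable {
    private final BufferedReader reader;
    private final int footerLinesToSkip;
    private final Consumer<String> footerCallback;
    private final Deque<String> lookAhead = new ArrayDeque<>();
    private boolean footerHandled;

    FooterAwareLineReader(String content, int footerLinesToSkip, Consumer<String> footerCallback) {
        this.reader = new BufferedReader(new StringReader(content));
        this.footerLinesToSkip = footerLinesToSkip;
        this.footerCallback = footerCallback;
    }

    // Returns the next body line, or null once only footer lines remain.
    String read() {
        try {
            // Top up the buffer until it holds one more line than the footer size, or input ends.
            String line;
            while (lookAhead.size() <= footerLinesToSkip && (line = reader.readLine()) != null) {
                lookAhead.addLast(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (lookAhead.size() > footerLinesToSkip) {
            // More lines buffered than the footer needs, so the oldest one is a body line.
            return lookAhead.removeFirst();
        }
        if (!footerHandled) {
            // Input is exhausted: whatever is left in the buffer is the footer.
            footerHandled = true;
            while (!lookAhead.isEmpty()) {
                footerCallback.accept(lookAhead.removeFirst());
            }
        }
        return null;
    }

    @Override
    public void close() {
        try {
            reader.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

One design choice worth noting: draining the buffer with `isEmpty()` rather than a fixed count means a file shorter than the footer size is passed to the callback in full instead of causing an underflow error.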



                  • #10
                    Writing a Spring Batch FileReader with a Header and Footer

                    I tried to attach my code to this post but could not, so you can download it from the link below:
                    http://dc255.4shared.com/download/Ka...45744-f651e52d

                    Here is a sample of using this reader:

                    Code:
                    <bean id="itemReader"
                    	class="FlatFileItemReader2"
                    	scope="step">
                    	<property name="headerLinesToSkip" value="#{jobParameters['headerSize']}" />
                    	<property name="headerSkippedLinesCallback" ref="headerCallback" />
                    	<property name="footerLinesToSkip" value="#{jobParameters['footerSize']}" />
                    	<property name="footerSkippedLinesCallback" ref="footerCallback" />
                    	<property name="resource" ref="resourceReader" />
                    	<property name="lineMapper" ref="lineMapper" />
                    </bean>
                    ...

