Spring Batch Token Mismatch fails parsing the file

  • Spring Batch Token Mismatch fails parsing the file

    We are using FlatFileItemReader to parse a CSV file. Users open the file in Excel, modify it, save it, and submit it to the process that parses it. When the last column is blank, Excel omits the trailing comma, so the token count no longer matches and parsing fails. Previously DelimitedLineTokenizer had a setStrict method, which I thought would have avoided this problem. Is my understanding correct? We are currently on version 2.x and I don't see that method on DelimitedLineTokenizer. How can I avoid this problem?

  • #2
    @dhupkars,

    Unfortunately, DelimitedLineTokenizer does not have a "strict" property, probably because it is delimiter-oriented rather than length/size-oriented.

    One thing I noticed is that if you set the "names" property on DelimitedLineTokenizer and then read a line with fewer or more tokens than names.length, the framework throws an "IncorrectTokenCountException":

    Code:
    		if (names.length == 0) {
    			return fieldSetFactory.create(values);
    		}
    		else if (values.length != names.length) {
    			throw new IncorrectTokenCountException(names.length, values.length);
    		}
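    To make the mismatch concrete, here is a minimal stand-alone sketch. It does not use Spring Batch at all: TokenCountDemo, tokenize and countMatches are hypothetical names, and the plain String.split call only approximates what the real tokenizer does. It just reproduces the shape of the check above for a line with and without the trailing comma.

```java
// Hypothetical stand-alone demo; it does not use any Spring Batch classes.
final class TokenCountDemo {

    // Column names taken from the example configuration in this thread.
    static final String[] NAMES = {"ID", "lastName", "firstName", "rank"};

    // Split on commas; the -1 limit keeps trailing empty tokens,
    // so "a,b," yields three tokens rather than two.
    static String[] tokenize(String line) {
        return line.split(",", -1);
    }

    // Same shape as the quoted check: with no names there is nothing to compare,
    // otherwise the number of values must equal the number of names.
    static boolean countMatches(String[] names, String[] values) {
        return names.length == 0 || values.length == names.length;
    }

    public static void main(String[] args) {
        String withTrailingComma = "1,Smith,John,"; // blank last column, comma kept
        String savedByExcel = "1,Smith,John";       // blank last column, comma dropped

        System.out.println(countMatches(NAMES, tokenize(withTrailingComma))); // true
        System.out.println(countMatches(NAMES, tokenize(savedByExcel)));      // false
        System.out.println(countMatches(new String[0], tokenize(savedByExcel))); // true
    }
}
```

    The last call shows why an empty "names" array sidesteps the check: there is no expected count to compare against.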
    So one thing you can do to avoid this constraint would be to set "names" to nothing:

    Code:
    	<bean id="yourFileItemReader" class="org.springframework.batch.item.file.FlatFileItemReader">
    		<property name="resource" value="classpath:data/input/${cvs.file.name}" />
    		<property name="lineMapper">
    			<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
    				<property name="lineTokenizer">
    					<bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
    						<!-- just to illustrate the point: -->
    						<property name="names" value="" />
    					</bean>
    				</property>
    				<property name="fieldSetMapper">
    					<bean class="org.your.opensource.CustomFieldSetMapper" />
    				</property>
    			</bean>
    		</property>
    	</bean>
    It is not as clean of course as:

    Code:
    					<bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
    						<property name="names" value="ID,lastName,firstName,rank" />
    					</bean>
    And also you would need to add some logic in your "CustomFieldSetMapper":

    Code:
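    import org.springframework.batch.item.file.mapping.FieldSetMapper;
    import org.springframework.batch.item.file.transform.FieldSet;

    public class CustomFieldSetMapper implements FieldSetMapper<CustomThing> {

    	public CustomThing mapFieldSet(FieldSet fs) {

    		if (fs == null) {
    			return null;
    		}

    		CustomThing customThing = new CustomThing();

    		// With no "names" configured, the token count can vary from line to line.
    		int tokenCount = fs.getValues().length;

    		customThing.setId(fs.readRawString(0));
    		customThing.setLastName(fs.readRawString(1));
    		customThing.setFirstName(fs.readRawString(2));

    		// Depending on "tokenCount", set the optional last column:
    		// only read "rank" when the fourth token is actually present.
    		if (tokenCount > 3) {
    			customThing.setRank(fs.readRawString(3));
    		}

    		return customThing;
    	}
    }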
    Again, this feels like a workaround and is not as nicely typed (ID, firstName, etc.), but it will work for malformed files (missing comma).
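    The index-based idea can also be tried outside the framework. The following is only a sketch under assumptions: TolerantMappingDemo and its nested CustomThing are hypothetical stand-ins (the real CustomThing is not shown in this thread), the rank column is assumed to be the only optional one, and a plain String array stands in for the FieldSet.

```java
// Hypothetical stand-alone sketch; CustomThing here is a stand-in, not the real class.
final class TolerantMappingDemo {

    static final class CustomThing {
        final String id;
        final String lastName;
        final String firstName;
        final String rank;

        CustomThing(String id, String lastName, String firstName, String rank) {
            this.id = id;
            this.lastName = lastName;
            this.firstName = firstName;
            this.rank = rank;
        }
    }

    // Maps tokens by position; the fourth token (rank) is treated as optional
    // and defaults to an empty string when the trailing comma was dropped.
    static CustomThing map(String[] tokens) {
        if (tokens.length < 3) {
            throw new IllegalArgumentException("expected at least 3 tokens, got " + tokens.length);
        }
        String rank = tokens.length > 3 ? tokens[3] : "";
        return new CustomThing(tokens[0], tokens[1], tokens[2], rank);
    }

    public static void main(String[] args) {
        CustomThing full = map("1,Smith,John,Captain".split(",", -1));
        CustomThing truncated = map("2,Doe,Jane".split(",", -1));
        System.out.println(full.rank);                // Captain
        System.out.println(truncated.rank.isEmpty()); // true
    }
}
```

    Failing fast on lines with fewer than three tokens is a design choice of this sketch; a real mapper might prefer to skip or log such lines instead.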
    Last edited by litius; Nov 1st, 2009, 12:08 AM. Reason: formating



    • #3
      Created a JIRA issue for this problem: http://jira.springframework.org/browse/BATCH-1429



      • #4

        Hi Litius,

        Thanks for the JIRA. Do you know when this will be released as a patch?



        • #5
          @dhupkars,

          You can find the patch attached to the JIRA: http://jira.springframework.org/browse/BATCH-1429

          You can either apply it now or wait for "2.1.0.M2".
