
  • FlatFile to DB without POJO

    Hi All,

    Is it possible to define a mapping from a flat file to a DB without needing a POJO, i.e. declaratively, without having to compile Java code?

    Thanks,
    GLoureiro

  • #2
    That kind of use case was never really the focus of Spring Batch (it's more of a programmer's tool), but it might be useful if you wrote down your thoughts in JIRA to see whether anyone thinks they can be turned into a new feature. I know Lucas was keen to do some work in this area for 2.1, but he never had time. Maybe a concrete suggestion or contribution from you would move things forward?



    • #3
      I've investigated the subject a bit more, and I think a straightforward solution is to use a predefined Apache Velocity template to build a POJO, based on an XML chunk definition in a distinct namespace inside the batch XML definition. In that section the user defines fields and data types, then runs a tool (still to be written) that runs Velocity, generates the POJO and compiles it.

      Do you have more ideas?

      Regards,
      GLoureiro



      • #4
        What about FieldSet?
        This is a great new abstraction in Spring Batch: it "represents" an anonymous POJO or, better said, a map.
        It can be configured declaratively in the config via FixedLengthTokenizer and later used in the ItemWriter, like this:
        Code:
        	<bean id="giroLineTokenizer"
        		class="org.springframework.batch.item.file.transform.FixedLengthTokenizer">
        		<property name="names" value="G2,G3,origLine" />
        		<property name="columns" value="3-5,6-7,1-355" />
        		<property name="strict" value="false"/>
        	</bean>
        
        	<bean id="fieldSetWriter" class="org.springframework.batch.item.database.JdbcBatchItemWriter">
        		<property name="assertUpdates" value="true" />
        		<property name="itemSqlParameterSourceProvider" ref="fsSqlParamSourceProvider"/>
        		<property name="sql" 
        			value="INSERT INTO STAGING_GIRO (item_id, jobId, G2, G3, ORIG_LINE) 
        					VALUES (TEST_SEQ.nextval, :jobId, :G2, :G3, :origLine)" />
        		<property name="dataSource" ref="dataSource" />
        	</bean>
        
        	<bean id="fsSqlParamSourceProvider" class="example.FieldSetSqlParameterSourceProvider"/>
        Certainly some issues have arisen:
        - The ItemReader implementation should return a FieldSet instead of a business item.
        - In the case of JdbcBatchItemWriter, a custom ItemSqlParameterSourceProvider is needed.
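To make the column ranges in the tokenizer config above concrete: the ranges are 1-based and inclusive, so "3-5" means characters three to five of the line. The following is a minimal plain-Java sketch of that slicing, not Spring Batch's implementation. `FixedWidthSketch` and its `tokenize` method are hypothetical stand-ins, and clamping ranges on short lines is only an assumption about what a non-strict tokenizer tolerates.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a fixed-length tokenizer; not part of Spring Batch.
class FixedWidthSketch {

    // ranges[i] = {start, end}, 1-based and inclusive, paired with names[i].
    static Map<String, String> tokenize(String line, String[] names, int[][] ranges) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            // Assumption: in non-strict mode a range past the end of the line is clamped.
            int start = Math.min(ranges[i][0] - 1, line.length());
            int end = Math.min(ranges[i][1], line.length());
            fields.put(names[i], line.substring(start, end));
        }
        return fields;
    }
}
```

With the names `G2,G3,origLine` and the ranges `3-5,6-7,1-355`, every field name becomes a key that the writer's SQL can later reference as a named parameter such as `:G2`.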

        I've chosen to write a custom LineMapper instead of a full ItemReader. In my case it has to support different line types in the input file, so I borrowed from PatternMatchingCompositeLineMapper:
        FieldSetLineMapper.java
        Code:
        public class FieldSetLineMapper implements LineMapper<FieldSet>, InitializingBean  {
        
        	private PatternMatchingCompositeLineTokenizer tokenizer = new PatternMatchingCompositeLineTokenizer();
        	
        	@Override
        	public FieldSet mapLine(String line, int lineNumber) throws Exception {
        		// delegate
        		return tokenizer.tokenize(line);
        	}
        
        	@Override
        	public void afterPropertiesSet() throws Exception {
        		this.tokenizer.afterPropertiesSet();
        	}
        	
        	public void setTokenizers(Map<String, LineTokenizer> tokenizers) {
        		this.tokenizer.setTokenizers(tokenizers);
        	}
        
        }
        in the config:
        Code:
        ...
        	<bean id="itemReader" class="org.springframework.batch.item.file.FlatFileItemReader">
        		<property name="resource" value="01001101.020" />
        		<property name="lineMapper" ref="fsLineMapper"/>
        	</bean>
        
        	<bean id="fsLineMapper" class="example.FieldSetLineMapper">
        		<property name="tokenizers">
        			<map>
        				<entry key="01*" value-ref="kotegFejTokenizer" />
        				<entry key="03*" value-ref="giroLineTokenizer" />
        				<entry key="05*" value-ref="kotegLabTokenizer" />
        			</map>
        		</property>
        	</bean>
        ...
        And the final piece is a custom SqlParameterSource provider:
        Code:
        public class FieldSetSqlParameterSourceProvider implements
        		ItemSqlParameterSourceProvider<FieldSet>, StepExecutionListener {
        
        	private StepExecution stepExecution;
        
        	@Override
        	public SqlParameterSource createSqlParameterSource(FieldSet item) {
        
        		MapSqlParameterSource source = new MapSqlParameterSource();
        
        		// copy every named field of the FieldSet into a named SQL parameter
        		Properties props = item.getProperties();
        		for (String key : props.stringPropertyNames()) {
        			source.addValue(key, props.getProperty(key));
        		}
        
        		// add extra field, e.g. 'jobId' from execution context
        		addExtraSqlParameters(source);
        		
        		return source;
        	}
        
        	protected void addExtraSqlParameters(MapSqlParameterSource source) {
        		source.addValue("jobId", stepExecution.getJobExecution().getJobId());
        	}
        
        	
        	@Override
        	public void beforeStep(StepExecution stepExecution) {
        		this.stepExecution = stepExecution;
        	}
        ...
        }
        Actually there is an issue with this setup: every line in the input reaches the writer (header and footer as well), but the writer can only handle one specific FieldSet type (the body of the file), so filtering must be done in the processor.

        On the other hand, this works pretty well for me
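The filtering step mentioned above can be sketched in plain Java. In Spring Batch, an ItemProcessor that returns null drops the item before it reaches the writer; the sketch below mirrors that convention without depending on the library. `BodyLineFilter` is a hypothetical name, a `Map` stands in for the FieldSet, and recognising body lines by the presence of the `G2` and `G3` fields is an assumption made only for illustration.

```java
import java.util.Map;

// Hypothetical stand-in for an ItemProcessor<FieldSet, FieldSet>; not Spring Batch code.
class BodyLineFilter {

    // Returns the item unchanged if it looks like a body line, or null to filter it out.
    static Map<String, String> process(Map<String, String> item) {
        // Assumption: only body lines carry both G2 and G3 fields.
        boolean isBodyLine = item.containsKey("G2") && item.containsKey("G3");
        return isBodyLine ? item : null;
    }
}
```

Header and footer items then never reach the JdbcBatchItemWriter, whose INSERT statement expects the body fields.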
        Last edited by grenner; Mar 23rd, 2010, 08:03 AM. Reason: FieldSet url added



        • #5
          Final piece: the FieldSets are all alike; I mean the FieldSet obtained from the file header is "equivalent" to the FieldSet from a body line. More precisely, they differ only in their content, and I wouldn't want to differentiate on the content.
          So I introduced LabelledFieldSet (based on DefaultFieldSet) and a related LabelledFieldSetFactory (extending DefaultFieldSetFactory) in order to make "typed" FieldSets. Each tokenizer instance can use its own custom factory, which creates labelled FieldSets carrying the configured label. These labels can be used later in the processor/writer to make decisions about the current FieldSet item.

          in the config:
          Code:
          	<bean id="giroLineTokenizer"
          		class="org.springframework.batch.item.file.transform.FixedLengthTokenizer">
          		<property name="names" value="G2,G3,origLine" />
          		<property name="columns" value="3-5,6-7,1-355" />
          		<property name="strict" value="false"/>
          		<property name="fieldSetFactory">
          			<bean class="example.LabelledFieldSetFactory">
          				<property name="label" value="GIRO_LINE"/>
          			</bean>
          		</property>
          	</bean>
          LabelledFieldSetFactory.java:
          Code:
          public class LabelledFieldSetFactory extends DefaultFieldSetFactory {
          
          	private String label;
          	
          	public void setLabel(String label){
          		this.label = label;
          	}
          	
          	@Override
          	public FieldSet create(String[] values) {
          		LabelledFieldSet fs = new LabelledFieldSet(this.label, values);
          		return fs;
          	}
          	
          	@Override
          	public FieldSet create(String[] values, String[] names) {
          		LabelledFieldSet fs = new LabelledFieldSet(this.label, values, names);
          		return fs;
          	}
          }
          By the way, here I am extending DefaultFieldSetFactory, but the subclass cannot provide the expected behaviour (custom NumberFormat and DateFormat), since DefaultFieldSetFactory.enhance() is a private method. If it were protected, DefaultFieldSetFactory could be extended and reused.
          Dave, do you agree, or have I missed something?
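The post does not show LabelledFieldSet itself, so the following is only a guess at how labels might drive decisions downstream. `LabelledRecord` and `LabelDispatch` are hypothetical plain-Java stand-ins, and `HEADER` is an invented label; only `GIRO_LINE` appears in the config above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a labelled FieldSet: a label plus the raw values.
final class LabelledRecord {
    final String label;
    final String[] values;

    LabelledRecord(String label, String... values) {
        this.label = label;
        this.values = values;
    }
}

class LabelDispatch {

    // Groups a chunk of items by label so each group can go to its own writer.
    static Map<String, List<LabelledRecord>> groupByLabel(List<LabelledRecord> items) {
        Map<String, List<LabelledRecord>> groups = new LinkedHashMap<>();
        for (LabelledRecord item : items) {
            groups.computeIfAbsent(item.label, k -> new ArrayList<>()).add(item);
        }
        return groups;
    }
}
```

A writer could then pick the `GIRO_LINE` group for the staging insert and ignore or separately handle the rest, without inspecting field contents.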



          • #6
            If you must extend DefaultFieldSetFactory then this works fine without exposing the private method:

            Code:
            @Override
            public FieldSet create(String[] values, String[] names) {
                FieldSet fs = super.create(values, names);
                LabelledFieldSet result = new LabelledFieldSet(this.label, fs.getValues(), fs.getNames());
                return result;
            }



            • #7
              Just to leave this topic clean and tidy: I'm afraid your suggestion won't work, since the (Default)FieldSet created by the parent factory has nothing to do with the LabelledFieldSet instantiated afterwards. With your suggested code, the following test fails:

              Code:
              	@Test
              	public void testFieldSetFactory() throws Exception {
              		// set up factory
              		LabelledFieldSetFactory factory = new LabelledFieldSetFactory();
              		factory.setLabel("label");
              		DateFormat df = new SimpleDateFormat("yyyyMMdd");
              		factory.setDateFormat(df);
              		
              		// test factory
              		String dateStr = "20100406";
              		FieldSet fs = factory.create(new String[]{dateStr}, new String[]{"myDate"});
              		assertEquals(dateStr, df.format(fs.readDate("myDate")));
              	}
              Exception:
              Code:
              java.lang.IllegalArgumentException: Unparseable date: "20100406", format: [yyyy-MM-dd], name: [myDate]
              	at org.springframework.batch.item.file.transform.DefaultFieldSet.readDate(DefaultFieldSet.java:507)
              	at example.ExampleJobConfigurationTests.testFieldSetFactory(ExampleJobConfigurationTests.java:88)
              The created FieldSet is expected to inherit the specified date format, but it hasn't.
              May I raise a JIRA issue to change the visibility of enhance()?
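The failure can be reproduced without Spring Batch. The sketch below is a simplified model, not the library's code: `SimpleFieldSet`, `SimpleFactory` and `CopyingFactory` are hypothetical stand-ins. It assumes, as the exception message suggests, that a freshly constructed field set defaults to `yyyy-MM-dd` and that the factory applies its configured format in a private step after construction.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical model of a field set with a default date format.
class SimpleFieldSet {
    private final String[] values;
    private DateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd");

    SimpleFieldSet(String[] values) { this.values = values; }

    void setDateFormat(DateFormat dateFormat) { this.dateFormat = dateFormat; }

    String[] getValues() { return values; }

    Date readDate(int index) {
        try {
            return dateFormat.parse(values[index]);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable date: " + values[index], e);
        }
    }
}

// Hypothetical model of the parent factory: formats are applied in a private step.
class SimpleFactory {
    private DateFormat dateFormat;

    void setDateFormat(DateFormat dateFormat) { this.dateFormat = dateFormat; }

    SimpleFieldSet create(String[] values) {
        return enhance(new SimpleFieldSet(values));
    }

    private SimpleFieldSet enhance(SimpleFieldSet fieldSet) {
        if (dateFormat != null) {
            fieldSet.setDateFormat(dateFormat);
        }
        return fieldSet;
    }
}

// Mirrors the suggestion in #6: build via super, then copy only the values.
class CopyingFactory extends SimpleFactory {
    @Override
    SimpleFieldSet create(String[] values) {
        SimpleFieldSet configured = super.create(values);
        // The new object starts from the default format; the configured one is lost.
        return new SimpleFieldSet(configured.getValues());
    }
}
```

In this model, `CopyingFactory` configured with `yyyyMMdd` still rejects "20100406", because only the values are carried over to the second object. A subclass could apply the formats to its own instance if the enhancing step were accessible to it, which is the point of the visibility request.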
              Last edited by grenner; Apr 8th, 2010, 05:51 AM.
