
  • how to convert large XML file to CSV Flat File

    I am new to Spring Batch.

    I need to convert a large XML file to a comma-delimited flat file (.csv). The XML file will be the input resource and the expected output resource is a CSV flat file.
    I am using Spring Batch 1.4.

    I am not sure how to achieve this.

    If anyone has done something similar or has any ideas, could you please share them with me and, if possible, some code showing how to do it?

    Thanks

  • #2
    The docs contain pretty detailed instructions on this. You essentially need an XML reader and a flat file writer configured for delimited files, and the docs cover how to set up both.
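
    To make the reader/writer split concrete, here is a minimal sketch that uses plain StAX from the JDK instead of the Spring Batch classes. It is not the framework's API; the class name TradeXmlToCsvSketch is hypothetical, and the element names (trade, isin, quantity, price, customer) are assumed from the sample configuration posted later in this thread. It streams the input one fragment at a time, so the whole file is never held in memory. Values are written without CSV quoting or escaping.

    ```java
    import java.io.IOException;
    import java.io.Reader;
    import java.io.Writer;
    import java.util.LinkedHashMap;
    import java.util.Map;
    import javax.xml.stream.XMLInputFactory;
    import javax.xml.stream.XMLStreamConstants;
    import javax.xml.stream.XMLStreamException;
    import javax.xml.stream.XMLStreamReader;

    // Hypothetical helper: streams <trade> fragments and writes one CSV line per fragment.
    class TradeXmlToCsvSketch {

        // Assumed column order of the output file.
        static final String[] COLUMNS = { "isin", "quantity", "price", "customer" };

        static void convert(Reader xml, Writer csv) throws XMLStreamException, IOException {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
            XMLStreamReader reader = factory.createXMLStreamReader(xml);

            Map<String, String> record = null; // fields of the fragment being read
            String currentField = null;        // child element whose text is being collected

            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if ("trade".equals(name)) {
                        record = new LinkedHashMap<>();
                    } else if (record != null) {
                        currentField = name;
                        record.put(name, "");
                    }
                } else if (event == XMLStreamConstants.CHARACTERS && currentField != null) {
                    // Text may arrive in several chunks, so append rather than overwrite.
                    record.merge(currentField, reader.getText(), String::concat);
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    if ("trade".equals(reader.getLocalName())) {
                        csv.write(toLine(record));
                        csv.write("\n");
                        record = null;
                    }
                    currentField = null;
                }
            }
            reader.close();
        }

        // Joins the known columns with commas; missing fields become empty strings.
        static String toLine(Map<String, String> record) {
            StringBuilder line = new StringBuilder();
            for (int i = 0; i < COLUMNS.length; i++) {
                if (i > 0) {
                    line.append(',');
                }
                line.append(record.getOrDefault(COLUMNS[i], "").trim());
            }
            return line.toString();
        }
    }
    ```

    In a real job the Reader and Writer would wrap the input and output files, and the framework's reader and writer beans would take over the two halves of this loop.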



    • #3
      I am trying to use the xmlStax example provided in the samples.

      I modified it to take its input from an XML file and write its output to a CSV file, but I am getting various exceptions.
      I am not clear on how to set up the mapper objects or how to write the record values to a flat file.

      Code:
      <?xml version="1.0" encoding="UTF-8"?>
      <beans xmlns="http://www.springframework.org/schema/beans"
      	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      	xmlns:batch="http://www.springframework.org/schema/batch"
      	xmlns:aop="http://www.springframework.org/schema/aop"
      	xmlns:tx="http://www.springframework.org/schema/tx"
      	xmlns:p="http://www.springframework.org/schema/p"
      	xmlns:util="http://www.springframework.org/schema/util"
      	xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-2.5.xsd
      						http://www.springframework.org/schema/aop http://www.springframework.org/schema/aop/spring-aop-2.0.xsd
      						http://www.springframework.org/schema/tx http://www.springframework.org/schema/tx/spring-tx-2.0.xsd	
                                  http://www.springframework.org/schema/util http://www.springframework.org/schema/util/spring-util-2.5.xsd">
      
          <import resource="applicationContext.xml"/>
      
          <bean id="transactionManager" class="org.springframework.batch.support.transaction.ResourcelessTransactionManager"/>
      
          <bean id="step" class="org.springframework.batch.core.step.item.SimpleStepFactoryBean">
          
          	<property name="itemReader" ref="itemReader" />
          	<property name="itemWriter" ref="itemWriter" />
              <property name="transactionManager" ref="transactionManager" />
              <property name="jobRepository" ref="jobRepository" />
              <property name="commitInterval" value="10" />
              
          </bean>
      	
      	
          <bean id="itemReader" class="org.springframework.batch.item.xml.StaxEventItemReader">
              <property name="fragmentRootElementName" value="trade" />
              <property name="resource" value="file:D:/test1.xml" />
              <property name="fragmentDeserializer">
              	<bean class="org.springframework.batch.item.xml.oxm.UnmarshallingEventReaderDeserializer">
      				<constructor-arg>
      					<bean class="org.springframework.oxm.xstream.XStreamMarshaller">
      						<property name="aliases" ref="aliases" />
      					</bean>
      				</constructor-arg>
      			</bean>
      		</property>
          </bean>
      
          
      	<bean id="itemWriter"  class="org.springframework.batch.item.file.FlatFileItemWriter">
              <property name="lineAggregator" ref="lineAggregator"/>
              <property name="fieldSetCreator" ref="fieldSetMapper"/>
              <property name="resource" value="file:D:/test2.txt" />
          </bean>
      
          <bean id="lineAggregator" class="org.springframework.batch.item.file.transform.DelimitedLineAggregator">
               <property name="names"
              value="isin,quantity,price,customer" />
              <property name="delimiter" value=","/>
          </bean>
          <bean id="fieldSetMapper" class="org.springframework.batch.item.file.mapping.PassThroughFieldSetMapper"/>
              	
      	<util:map id="aliases">
      		<entry key="trade"
      			value="org.springframework.batch.sample.domain.Trade" />
      		<entry key="isin" value="java.lang.String" />
      		<entry key="quantity" value="long" />
      		<entry key="price" value="java.math.BigDecimal" />
      		<entry key="customer" value="java.lang.String" />
      	</util:map>
      	
          <bean id="simpleJob11" class="org.springframework.batch.core.job.SimpleJob">
              <property name="name" value="simpleJob11" />
              <property name="restartable" value="true" />
              
              <property name="steps">
                  <list>
                      <ref local="step"/>
                  </list>
              </property>
              <property name="jobRepository" ref="jobRepository"/>
          </bean>
      </beans>
      Can anyone look into this and help me out? I need to get this done ASAP.

      Thanks
      Last edited by javabee; Mar 19th, 2009, 06:14 AM.



      • #4
        Perhaps your issue is with the "fieldSetCreator" property of the ItemWriter. The purpose of the FieldSetCreator is to take some Java object (the one passed into the ItemWriter) and convert it to a FieldSet. Usually this means calling getters on your Java object for each property that needs to be in the flat file and putting each of those properties, in order, into a new FieldSet.

        Code:
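          // Converts the item passed to the ItemWriter into a FieldSet.
          public FieldSet mapItem(Object data) {
            User user = (User) data;
            // The order of the tokens is the order of the columns in the flat file.
            String[] tokens = new String[] { user.getFirst(), user.getLast() };
            return new DefaultFieldSet(tokens);
          }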
        You are currently using a PassThroughFieldSetMapper. This class expects a FieldSet as input, so if your ItemWriter receives anything else, you will have problems.
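
        The two steps involved (object to ordered tokens, then tokens to a delimited line) can be sketched without any framework classes. This is only an illustration of the idea, not the Spring Batch API: ItemToLineSketch, toTokens and aggregate are hypothetical names, and User is a stand-in for whatever object your reader produces.

        ```java
        // Hypothetical, framework-free illustration of "item -> tokens -> delimited line".
        class ItemToLineSketch {

            // Stand-in domain object; in the thread's case this would be the Trade class.
            static class User {
                private final String first;
                private final String last;

                User(String first, String last) {
                    this.first = first;
                    this.last = last;
                }

                String getFirst() { return first; }
                String getLast() { return last; }
            }

            // Step 1: call the getters in the order the columns should appear.
            static String[] toTokens(User user) {
                return new String[] { user.getFirst(), user.getLast() };
            }

            // Step 2: join the tokens with the delimiter to form one output line.
            static String aggregate(String[] tokens, String delimiter) {
                return String.join(delimiter, tokens);
            }

            public static void main(String[] args) {
                // Prints: Jane,Doe
                System.out.println(aggregate(toTokens(new User("Jane", "Doe")), ","));
            }
        }
        ```

        A pass-through mapper skips step 1, which is why it only works when the item is already a set of fields.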



        • #5
          Thanks, I am now able to convert the file with the help of the details you gave.

          I was trying the above with the Trade example from the Spring Batch samples, using Trade.xml as input, which has simple tags.

          Now I need to do the same for the XML shown below, which I will be using in my application.

          My real XML file looks like this.

          Code:
          <?xml version="1.0" encoding="UTF-8"?>
          <Count>
          	<subCount version="" Date="2007-07-31T10:03:00" newDate="2007-08-01T10:03:00" id="A00001">
          		<Counter id="41015" Number="00001" value="Test">
          			<Code codeNumber="02000" id="">
          				<PurchaseSet>
          					  <Purchase type="0" count="1" code="ABC" value="1.00" /> 
          					  <Purchase type="0" count="1" code="XYZ" value="10.00" /> 
          					  <Purchase type="0" count="1" code="QWE" value="20.00" /> 
          			  </PurchaseSet>
          			  <CounterSet>
          				  <Counter strap="" countercode="123456789123456789" /> 
          				  <Counter strap="" countercode="123456789123456777" /> 
          			  </CounterSet>
          		  </Code>
          
          		  <Code codeNumber="02001" id="">
          			<PurchaseSet>
          				  <Purchase type="0" count="1" code="ABC" value="1.00" /> 
          				  <Purchase type="0" count="1" code="XYZ" value="10.00" /> 
          				  <Purchase type="0" count="1" code="QWE" value="20.00" /> 
          			  </PurchaseSet>
          			  <CounterSet>
          				  <Counter strap="" countercode="123456789123444444" /> 
          				  <Counter strap="" countercode="123456789123455555" /> 
          			  </CounterSet>
          		  </Code>
          		</Counter>
          	</subCount>
          </Count>
          In the above XML I will have any number of <Code> elements, each with its own details and attributes.
          I am not sure how to handle the attributes, or how to handle the same tag repeated with different attribute values (for example, three <Purchase> items with different attributes).

          Can anyone share some ideas on this? Your response will be highly appreciated.
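
          One possible way to flatten this structure, sketched with plain StAX from the JDK rather than Spring Batch classes: remember the attributes of the enclosing <Code> element and emit one CSV row per <Purchase>, repeating the parent's codeNumber on every row. The class name PurchaseAttributesToCsvSketch and the chosen column layout (codeNumber, type, count, code, value) are assumptions for illustration only; the element and attribute names come from the XML above.

          ```java
          import java.io.Reader;
          import java.util.ArrayList;
          import java.util.List;
          import javax.xml.stream.XMLInputFactory;
          import javax.xml.stream.XMLStreamConstants;
          import javax.xml.stream.XMLStreamException;
          import javax.xml.stream.XMLStreamReader;

          // Hypothetical helper: one CSV row per <Purchase>, prefixed with the parent <Code> number.
          class PurchaseAttributesToCsvSketch {

              static List<String> toRows(Reader xml) throws XMLStreamException {
                  XMLInputFactory factory = XMLInputFactory.newInstance();
                  factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
                  XMLStreamReader reader = factory.createXMLStreamReader(xml);

                  List<String> rows = new ArrayList<>();
                  String codeNumber = ""; // attribute of the <Code> element currently open

                  while (reader.hasNext()) {
                      if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                          continue;
                      }
                      String name = reader.getLocalName();
                      if ("Code".equals(name)) {
                          codeNumber = attr(reader, "codeNumber");
                      } else if ("Purchase".equals(name)) {
                          // Repeated elements simply produce repeated rows.
                          rows.add(String.join(",",
                                  codeNumber,
                                  attr(reader, "type"),
                                  attr(reader, "count"),
                                  attr(reader, "code"),
                                  attr(reader, "value")));
                      }
                  }
                  reader.close();
                  return rows;
              }

              // Returns the attribute value, or an empty string when the attribute is absent.
              private static String attr(XMLStreamReader reader, String name) {
                  String value = reader.getAttributeValue(null, name);
                  return value == null ? "" : value;
              }
          }
          ```

          The <Counter> entries inside <CounterSet> could be handled the same way, either as extra columns or as rows in a second file, depending on what the target CSV layout needs to be.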



          • #6
            just an idea

            Hello,

            I have no experience with Spring Batch, but when I need to convert XML to another format, I immediately think of XSLT (Extensible Stylesheet Language Transformations) in combination with XPath.

            XSLT was created to transform XML into other formats.

            Anyway, I know it's not always possible to learn a new language, but it's just an idea.
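
            As a rough sketch of what that could look like from Java, using only the javax.xml.transform API that ships with the JDK: the stylesheet below emits one comma-separated line per <Purchase> element of the XML posted above. The class name XsltToCsvSketch and the chosen columns are assumptions for illustration. Note that the JDK's default XSLT processor builds the whole document in memory, so this may not suit very large files.

            ```java
            import java.io.StringReader;
            import java.io.StringWriter;
            import javax.xml.transform.Transformer;
            import javax.xml.transform.TransformerException;
            import javax.xml.transform.TransformerFactory;
            import javax.xml.transform.stream.StreamResult;
            import javax.xml.transform.stream.StreamSource;

            // Hypothetical helper: applies an inline XSLT 1.0 stylesheet that produces CSV text.
            class XsltToCsvSketch {

                // Output method "text" suppresses the XML declaration and markup.
                static final String STYLESHEET =
                        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
                      + "<xsl:output method=\"text\"/>"
                      + "<xsl:template match=\"/\">"
                      + "<xsl:for-each select=\"//Purchase\">"
                      + "<xsl:value-of select=\"ancestor::Code/@codeNumber\"/>,"
                      + "<xsl:value-of select=\"@code\"/>,"
                      + "<xsl:value-of select=\"@value\"/>"
                      + "<xsl:text>&#10;</xsl:text>"
                      + "</xsl:for-each>"
                      + "</xsl:template>"
                      + "</xsl:stylesheet>";

                static String transform(String xml) throws TransformerException {
                    Transformer transformer = TransformerFactory.newInstance()
                            .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
                    StringWriter out = new StringWriter();
                    transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
                    return out.toString();
                }
            }
            ```

            For file-to-file conversion, the StreamSource and StreamResult can wrap java.io.File objects instead of strings.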



            • #7
              Thanks Davymeers.

              It would be possible using XSLT, but I am pretty sure it can be achieved with Spring Batch. As I am using Spring Batch for the entire batch process, I would like to find a way to do it there.

              Can anyone give me some ideas on how to go about this?
