
  • XSD Validation

    I'm new to Spring Batch, and I am having some problems getting it to validate an XML file when it is parsing it.

    I have set it up so that it parses an XML file and generates a flat file. If the input XML file is correct, everything works fine. If I change the XML to contain an error, it skips the element with the error in it, but doesn't complain, and it generates the flat file with one line less than it did when the XML was correct.

    Below is the relevant section from the Spring file.

    What do I need to change to get it to validate the XML file? According to the user guide, it should be validating based on the XSD, yet it is not doing that right now.

    <batch:job id="creditBureauImportJob">
        <batch:step id="step1">
            <batch:tasklet>
                <batch:chunk reader="creditBureauImportReader" writer="creditBureauImportWriter"
                             commit-interval="1" />
            </batch:tasklet>
        </batch:step>
    </batch:job>

    <bean id="creditBureauImportReader" class="org.springframework.batch.item.xml.StaxEventItemReader">
        <property name="fragmentRootElementName" value="CreditBureauImport" />
        <property name="resource" value="file:src/test/resources/CreditBureauImportTest.xml" />
        <property name="unmarshaller" ref="creditBureauImportMarshaller" />
        <property name="strict" value="true" />
    </bean>

    <bean id="creditBureauImportWriter" class="org.springframework.batch.item.file.FlatFileItemWriter">
        <property name="resource" ref="outputResource" />
        <property name="lineAggregator">
            <bean class="org.springframework.batch.item.file.transform.FormatterLineAggregator">
                <property name="fieldExtractor">
                    <bean class="org.springframework.batch.item.file.transform.BeanWrapperFieldExtractor">
                        <property name="names" value="customerAccountId,ficoScore" />
                    </bean>
                </property>
                <property name="format" value="%-16s%-3s" />
            </bean>
        </property>
    </bean>

    <bean id="creditBureauImportMarshaller" class="org.springframework.oxm.jaxb.Jaxb2Marshaller">
        <property name="classesToBeBound">
            <list>
                <value>com.billmelater.batch.creditbureauimport_1_0_0.BatchCreditBureauImport</value>
                <value>com.billmelater.batch.creditbureauimport_1_0_0.CreditBureauImport</value>
            </list>
        </property>
        <property name="schema" value="classpath:xsd/BatchCreditBureauImport_1_0_0.xsd" />
    </bean>

    <bean id="outputResource" class="org.springframework.core.io.FileSystemResource">
        <constructor-arg value="target/output.txt" />
    </bean>

    Thanks,

    Sigurd

  • #2
    Hi Sigurd,

    I had the same issue and solved it by adding a first step that validates the file against an XSD. This step is a TaskletStep.

    The code for this step is as below (errors are passed to the executionContext...)

    Code:
    import java.util.ArrayList;
    import java.util.List;

    import javax.xml.stream.XMLInputFactory;
    import javax.xml.stream.XMLStreamReader;
    import javax.xml.transform.Source;

    import org.springframework.batch.core.step.tasklet.Tasklet;
    import org.springframework.batch.item.ExecutionContext;
    import org.springframework.batch.repeat.ExitStatus;
    import org.springframework.core.io.Resource;
    import org.springframework.xml.transform.StaxSource;
    import org.springframework.xml.validation.XmlValidator;
    import org.springframework.xml.validation.XmlValidatorFactory;
    import org.xml.sax.SAXParseException;

    public class ValidationTasklet implements Tasklet {

    	private Resource resource;

    	private Resource schemaResource;

    	private String schemaLang = XmlValidatorFactory.SCHEMA_W3C_XML;

    	// Collected validation messages. A list is used so that no array has to be sized up front.
    	private final List<String> validationExceptions = new ArrayList<String>();

    	public void setResource(Resource resource) {
    		this.resource = resource;
    	}

    	public void setSchemaResource(Resource schemaResource) {
    		this.schemaResource = schemaResource;
    	}

    	public void setSchemaLang(String schemaLang) {
    		this.schemaLang = schemaLang;
    	}

    	public ExitStatus execute() throws Exception {

    		validationExceptions.clear();
    		try {
    			final XmlValidator validator = XmlValidatorFactory.createValidator(schemaResource, schemaLang);
    			final XMLStreamReader xmlStreamReader = XMLInputFactory.newInstance().createXMLStreamReader(resource.getInputStream());

    			Source source = new StaxSource(xmlStreamReader);
    			SAXParseException[] saxParseExceptions = validator.validate(source);

    			// Record one message per schema violation, with its position in the file.
    			for (SAXParseException saxParseException : saxParseExceptions) {
    				validationExceptions.add("Line " + saxParseException.getLineNumber() +
    										" - Column " + saxParseException.getColumnNumber() +
    										" - " + saxParseException.getMessage());
    			}
    		} catch (Exception e) {
    			validationExceptions.add("Error while validating input. " + e.getMessage());
    		}

    		// Errors are reported through the execution context, not by failing the step.
    		return ExitStatus.FINISHED;
    	}

    	protected void close(ExecutionContext ctx) throws Exception {
    		ctx.put("validationExceptions",
    				validationExceptions.toArray(new String[validationExceptions.size()]));
    	}

    }
    Last edited by Dave Syer; Jun 21st, 2009, 02:24 PM. Reason: converted < to [


    • #3
      Hi Batchman,

      Thanks for the reply. I was afraid I was going to have to do something like that. It makes the job a lot less efficient, because now the XML has to be parsed twice: once to validate it and once to process it.

      Sigurd


      • #4
        I think you can configure the Unmarshaller out of the box to validate its input. Spring Batch just delegates to the OXM piece (http://static.springframework.org/sp.../html/oxm.html), so if you can't do validation it must mean that the underlying unmarshaller doesn't support it. If it isn't there, you could always add it just by writing your own Unmarshaller.
        Last edited by Dave Syer; Jun 23rd, 2009, 09:49 AM. Reason: spelling


        • #5
          An example of how to configure the Jaxb Marshaller to apply schema validation:

          Code:
          <bean id="marshaller" class="org.springframework.oxm.jaxb.Jaxb2Marshaller">
            <property name="classesToBeBound">
              <value>jaxb.Customer</value>
            </property>
            <property name="schema" value="classpath:/schemas/jaxb_demo.xsd"/>
          </bean>
          
          <bean id="itemReader" class="org.springframework.batch.item.xml.StaxEventItemReader">
            <property name="fragmentRootElementName" value="customer" />
            <property name="resource" ref="fileInputLocator" />
            <property name="unmarshaller" ref="marshaller" />
          </bean>
          
          <bean id="fileInputLocator" class="org.springframework.core.io.FileSystemResource" scope="step">
            <constructor-arg type="java.lang.String" value="#{jobParameters[file.name]}" />
          </bean>


          • #6
            Hi Markymiddleton,

            That looks identical to what I have in the first message, and it was not validating the XML document when I ran it. Am I missing a difference between your code and mine?

            Thanks,

            Sigurd


            • #7
              Hi Dave,

              Thanks for your feedback. I'll take writing my own unmarshaller into consideration.

              Sigurd


              • #8
                Originally posted by markymiddleton
                An example of how to configure the Jaxb Marshaller to apply schema validation:

                Code:
                <bean id="marshaller" class="org.springframework.oxm.jaxb.Jaxb2Marshaller">
                  <property name="classesToBeBound">
                    <value>jaxb.Customer</value>
                  </property>
                  <property name="schema" value="classpath:/schemas/jaxb_demo.xsd"/>
                </bean>
                
                <bean id="itemReader" class="org.springframework.batch.item.xml.StaxEventItemReader">
                  <property name="fragmentRootElementName" value="customer" />
                  <property name="resource" ref="fileInputLocator" />
                  <property name="unmarshaller" ref="marshaller" />
                </bean>
                
                <bean id="fileInputLocator" class="org.springframework.core.io.FileSystemResource" scope="step">
                  <constructor-arg type="java.lang.String" value="#{jobParameters[file.name]}" />
                </bean>

                I'm running this on JDK 1.5 with the following Maven dependencies (groupId - artifactId - version):

                com.sun.xml.bind - jaxb-impl - 2.1.6
                javax.xml.bind - jaxb-api - 2.1
                javax.xml.stream - stax-api - 1.0-2
                stax - stax - 1.2.0
                stax - stax-api - 1.0.1
                xerces - xercesImpl - 2.0.2
                xml-apis - xml-apis - 1.0.b2

                I'm also explicitly telling stax to use the factories:
                com.bea.xml.stream.EventFactory
                com.bea.xml.stream.MXParserFactory
                com.bea.xml.stream.XMLOutputFactoryBase


                • #9
                  Hi,

                  I'm trying to validate the structure of an XML file against an XSD file:

                  <bean id="xmlReader" class="org.springframework.batch.item.xml.StaxEventItemReader" scope="step">
                    <property name="resource" value="file:FileToValidate.xml" />
                    <property name="unmarshaller" ref="marshaller" />
                    <property name="strict" value="true" />
                  </bean>

                  I'm not sure about how to create the marshaller from an existing xsd file. I was trying something like:
                  <bean id="marshaller" class="org.springframework.oxm.jaxb.Jaxb2Marshaller">
                    <property name="schema" value="file:Validator.xsd" />
                  </bean>

                  Does anyone know how to create a marshaller from an existing XSD file?

                  Thanks

                  Luis


                  • #10
                    Having same problem too...

                    I am having the same problem. I set one of my nodes to have a minimum length and I deleted the corresponding data from the xml file. When I run the batch chunk, it gets to the processor with the field being blank. I expected batch to throw a validation exception for that node (before batch gets to the processor I would imagine). Here's my code:

                    Code:
                    	@Bean
                    	public MultiResourceItemReader XMLMultiResourceReader() throws Exception {
                    		ExtendedMultiResourceItemReader multiResourceItemReader = new ExtendedMultiResourceItemReader();
                    
                    		multiResourceItemReader.setStrict(true);
                    		multiResourceItemReader.setResources(new PathMatchingResourcePatternResolver()
                    			.getResources(fileReadPath));
                    		multiResourceItemReader.setDelegate(getLegalZoomXMLReader());
                    
                    		return multiResourceItemReader;
                    	}
                    	
                    	public StaxEventItemReader<PartnerPurchaseOrder> getLegalZoomXMLReader() {
                    		StaxEventItemReader<PartnerPurchaseOrder> staxEventItemReader = new StaxEventItemReader<PartnerPurchaseOrder>();
                    		
                    		staxEventItemReader.setFragmentRootElementName("order");
                    		staxEventItemReader.setUnmarshaller(getXMLUnmarshaller());
                    		
                    		return staxEventItemReader;
                    	}
                    	
                    	@SuppressWarnings("rawtypes")
                    	private Unmarshaller getXMLUnmarshaller() {
                    		Jaxb2Marshaller unmarshaller = new Jaxb2Marshaller();
                    		
                    		Class[] classesToMap = {com.dbcc.ecomm.core.vo.PartnerPurchaseOrder.class,
                    								com.dbcc.ecomm.core.entity.ApplicationUser.class};
                    		unmarshaller.setClassesToBeBound(classesToMap);
                    		Resource classResource = new ClassPathResource("sample-schema.xsd");
                    		unmarshaller.setSchemaLanguage(XMLConstants.W3C_XML_SCHEMA_NS_URI);
                    		unmarshaller.setSchema(classResource);
                    		
                    		return unmarshaller;
                    	}
                    Here is my xsd file:
                    Code:
                    <?xml version="1.0" encoding="UTF-8"?>
                    <xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
                      <xs:element name="orders">
                        <xs:complexType>
                          <xs:sequence>
                            <xs:element name="order">
                              <xs:complexType>
                                <xs:sequence>
                                  <xs:element name="status">
                    	              <xs:simpleType>
                    		              <xs:restriction base="xs:string">
                    			              <xs:minLength value="5"/>
                    		              </xs:restriction>
                    	              </xs:simpleType>
                                  </xs:element>
                                  <xs:element name="user">
                                    <xs:complexType>
                                      <xs:sequence>
                                        <xs:element type="xs:string" name="email"/>
                                        <xs:element type="xs:string" name="first-name"/>
                                        <xs:element type="xs:string" name="last-name"/>
                                      </xs:sequence>
                                    </xs:complexType>
                                  </xs:element>
                                </xs:sequence>
                              </xs:complexType>
                            </xs:element>
                          </xs:sequence>
                        </xs:complexType>
                      </xs:element>
                    </xs:schema>
                    Here is a sample xml file:
                    Code:
                    <?xml version="1.0"?>
                    <orders xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                    		xsi:schemaLocation="sample-schema.xsd">
                    	<order>
                    		<status></status>
                    		<user>
                    			<email>[email protected]</email>
                    			<first-name>sdfsdf</first-name>
                    			<last-name>sdfsdfsdf</last-name>
                    		</user>
                    	</order>
                    </orders>
                    As you can see, the schema requires the status node to have a minimum length of 5, but it has length 0 in the XML file. I do not get any validation exceptions of any kind, and the code moves on to the batch processor with that value as "" in the object that this XML file is mapped to. I checked that the code is looking in the right location for the XSD file (I just printed resource.getUrlPath() right after the resource value is set). What am I doing wrong?
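One way to narrow a problem like this down is to run the document through the JDK's own javax.xml.validation API, outside Spring Batch and JAXB. The following is only a minimal sketch under stated assumptions, not the poster's code: the class name SchemaSanityCheck and its validate helper are hypothetical, and the schema and documents are trimmed versions of the ones above (the user element and the xsi:schemaLocation attribute are left out). If the empty status is rejected here, the schema itself is doing its job and the problem is more likely in how the schema reaches the unmarshaller.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper: validates an XML string against an XSD string using only the JDK.
final class SchemaSanityCheck {

    // Trimmed copy of the schema from the post: status must be at least 5 characters long.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema' elementFormDefault='qualified'>"
        + "<xs:element name='orders'><xs:complexType><xs:sequence>"
        + "<xs:element name='order'><xs:complexType><xs:sequence>"
        + "<xs:element name='status'><xs:simpleType>"
        + "<xs:restriction base='xs:string'><xs:minLength value='5'/></xs:restriction>"
        + "</xs:simpleType></xs:element>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    // Returns the messages of all schema violations; an empty list means the document is valid.
    static List<String> validate(String xsd, String xml) {
        final List<String> problems = new ArrayList<String>();
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            // Collect recoverable errors instead of stopping at the first one.
            validator.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { /* ignored in this sketch */ }
                public void error(SAXParseException e) { problems.add(e.getMessage()); }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (SAXException e) {
            throw new IllegalStateException("Schema or document could not be parsed", e);
        } catch (IOException e) {
            throw new IllegalStateException("Could not read the input", e);
        }
        return problems;
    }

    public static void main(String[] args) {
        String invalid = "<orders><order><status></status></order></orders>";
        String valid = "<orders><order><status>SHIPPED</status></order></orders>";
        System.out.println("empty status:  " + validate(XSD, invalid));
        System.out.println("filled status: " + validate(XSD, valid));
    }
}
```

The error handler collects violations rather than throwing, so one run lists every problem in the file; malformed XML still aborts through fatalError.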


                    • #11
                      bump?

                      I tried a number of things and I still can't get the XML schema to catch an invalid XML file. I read online that I also have to set a ValidationEventHandler on the unmarshaller, which I have now done, but it is still not validating. Can anyone give me some advice on why this is not working? I've looked around and tried numerous suggestions, but none has worked so far. Thanks!


                      • #12
                        Figured it out. Apparently, Spring is calling the afterPropertiesSet method before I have a chance to set the properties of the unmarshaller. This method is what actually sets the Schema in the marshaller. All I did was call the method again after setting all the Jaxb2Marshaller properties, and it now works: validation fails on the invalid XML file.
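The ordering problem described here can be modelled with the JDK alone. The sketch below is a hypothetical stand-in, not Spring's Jaxb2Marshaller: the class LazySchemaChecker, its setter, and its accepts method are invented for illustration. It assumes only what the post states, namely that the schema object is built inside afterPropertiesSet() from whatever was set at that moment, so a callback that runs too early (or never, which may be the case when the object is created with new outside a bean definition) leaves validation switched off.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical model of a component that compiles its schema in an init callback.
final class LazySchemaChecker {

    // Tiny illustrative schema: a single status element of at least 5 characters.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='status'><xs:simpleType>"
        + "<xs:restriction base='xs:string'><xs:minLength value='5'/></xs:restriction>"
        + "</xs:simpleType></xs:element>"
        + "</xs:schema>";

    private String schemaXsd; // plain property, filled in by the setter
    private Schema schema;    // compiled form, built only in afterPropertiesSet()

    void setSchemaXsd(String schemaXsd) {
        this.schemaXsd = schemaXsd;
    }

    // Init callback: compiles whatever schema text is set at the time of the call.
    void afterPropertiesSet() {
        if (schemaXsd == null) {
            return; // nothing configured yet, so no schema is built
        }
        try {
            schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(schemaXsd)));
        } catch (SAXException e) {
            throw new IllegalStateException("Schema could not be compiled", e);
        }
    }

    // Returns true when the document passes, or when no schema was ever compiled.
    boolean accepts(String xml) {
        if (schema == null) {
            return true; // silent pass-through: the symptom seen in this thread
        }
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // schema violation or malformed document
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        LazySchemaChecker checker = new LazySchemaChecker();
        checker.afterPropertiesSet();        // callback fires before the property is set
        checker.setSchemaXsd(XSD);
        System.out.println(checker.accepts("<status></status>")); // schema never compiled
        checker.afterPropertiesSet();        // calling it again picks up the schema
        System.out.println(checker.accepts("<status></status>"));
    }
}
```

In a real configuration the same effect is usually achieved by letting the container manage the marshaller as a bean, or by calling afterPropertiesSet() by hand as the last configuration step, as the post does.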


                        • #13
                          Hi Sigurd,

                          Have you finished writing your own unmarshaller? It is definitely the more resource-intensive approach.
                          It would be better to do the validation before the job starts; once the job has been triggered for reading, it will consume more and more resources.
                          Another point, as I understand it, is that XSD validation will not parse every node; it only checks the minimum constraints. In the XSD you can define whether an element or attribute is mandatory, and once validation finds that, it does not go any further. If you go through the unmarshaller, though, it uses a SAX parser internally, doesn't it? So it will work through the document serially.
                          Otherwise, you can of course do it your own way.
