Missing Records after increasing commitInterval

  • Missing Records after increasing commitInterval

    Hello Everyone,

    I'm just starting to get into Spring Batch. I have to load and process over 10,000,000 records into a database. Because I'm still learning, I've put together a simple job containing one step, which reads the records from a source table and moves them to a target table. I have configured a JdbcCursorItemReader as the reader and a BatchSqlUpdateItemWriter as the writer. This setup works fine when the commitInterval uses the default value, but it is rather slow, so to improve performance I started increasing the commitInterval. Performance improved by a factor of 30, but records start to disappear when I do this: with a commitInterval of 10000 I'm missing 11783 out of 41003 test records.

    Can someone help me?

    Cheers Herman

    Below is the configuration I am using:

    Code:
    <?xml version="1.0" encoding="UTF-8"?>
    <beans xmlns="http://www.springframework.org/schema/beans"
    	xmlns:aop="http://www.springframework.org/schema/aop"
    	xmlns:tx="http://www.springframework.org/schema/tx"
    	xmlns:p="http://www.springframework.org/schema/p"
    	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    	xsi:schemaLocation="
    		http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-2.0.xsd
    		http://www.springframework.org/schema/aop http://www.springframework.org/schema/aop/spring-aop-2.0.xsd
    		http://www.springframework.org/schema/tx http://www.springframework.org/schema/tx/spring-tx-2.0.xsd">
    
    	<bean id="bargeCoordinateStreamingJob" class="org.springframework.batch.core.job.SimpleJob">
    		<property name="steps">
    			<list>
    				<bean id="bargeCoordinateStreamingStep" class="org.springframework.batch.core.step.item.SimpleStepFactoryBean">
    					<property name="itemReader" ref="coordinateReader"/>
    					<property name="itemWriter" ref="coordinateWriter"/>
    					<property name="jobRepository" ref="jobRepository"/>
    					<property name="transactionManager" ref="transactionManager"/>
    					<property name="commitInterval" value="10000"/>
    				</bean>
    			</list>
    		</property>
    		<property name="restartable" value="true"/>
    		<property name="jobRepository" ref="jobRepository"/>
    	</bean>
    	
    	<bean id="coordinateReader" class="org.springframework.batch.item.database.JdbcCursorItemReader">
    		<property name="dataSource" ref="dataSource"/>
    		<property name="mapper" ref="coordinateMapper"/>
    		<property name="sql">
    			<value>
    				SELECT	Date, Longitude, Latitude ,Idx
    				FROM		dbo.source_positions
    				ORDER BY	Idx
    			</value>
    		</property>
    	</bean>
    	
    	<bean id="coordinateWriter" class="org.springframework.batch.item.database.BatchSqlUpdateItemWriter">
    		<property name="itemPreparedStatementSetter" ref="coordinateMapper"/>
    		<property name="sql">
    			<value>
    				INSERT INTO dbo.target_positions (Date ,Longitude, Latitude, Idx)
         				VALUES (?, ?, ?, ?)
    			</value>
    		</property>
    		<property name="jdbcTemplate">
    			<bean class="org.springframework.jdbc.core.JdbcTemplate">
    				<property name="dataSource" ref="dataSource"/>
    			</bean>
    		</property>
    	</bean>
    	
    	<bean id="coordinateMapper" class="xx.yyyy.batch.TemporalCoordinateMapper"/>
    	
    	<!-- Datasource -->
    	<bean id="dataSource" class="com.mchange.v2.c3p0.ComboPooledDataSource" destroy-method="close">
    		<property name="user" value="****"/>
    		<property name="password" value="****"/>
    		<property name="driverClass" value="net.sourceforge.jtds.jdbc.Driver"/>
    		<property name="jdbcUrl" value="jdbc:jtds:sqlserver://LOCALHOST/TEST_DB"/>
    		<property name="initialPoolSize" value="2"/>
    		<property name="maxPoolSize" value="5"/>
    		<property name="minPoolSize" value="2"/>
    		<property name="acquireIncrement" value="1"/>
    		<property name="acquireRetryAttempts" value="0"/>
    	</bean>
    	
    	<!-- Transaction Manager used to ensure consistency of transactions. -->
    	<bean id="transactionManager" class="org.springframework.jdbc.datasource.DataSourceTransactionManager">
    		<property name="dataSource" ref="dataSource"/>
    	</bean>
    	
    	<!-- The Job Launcher, this defines the runtime behavior of the jobs. -->
    	<bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
    		<property name="jobRepository" ref="jobRepository"/>
    	</bean>
    	
    	<!-- The Job Repository, currently the state is stored in memory and won't be available after the job execution. -->
    	<bean id="jobRepository" class="org.springframework.batch.core.repository.support.SimpleJobRepository">
    		<constructor-arg index="0">
    			<bean class="org.springframework.batch.core.repository.dao.MapJobInstanceDao"/>
    		</constructor-arg>
    		<constructor-arg index="1">
    			<bean class="org.springframework.batch.core.repository.dao.MapJobExecutionDao"/>
    		</constructor-arg>
    		<constructor-arg index="2">
    			<bean class="org.springframework.batch.core.repository.dao.MapStepExecutionDao"/>
    		</constructor-arg>
    	</bean>
    	
    	<!-- The Job registry is used for easy job resolution. All jobs are automagically added to the registry by the post processor. -->
    	<bean id="jobRegistry" class="org.springframework.batch.core.configuration.support.MapJobRegistry"/>
    	<bean class="org.springframework.batch.core.configuration.support.JobRegistryBeanPostProcessor">
    		<property name="jobRegistry" ref="jobRegistry"/>
    	</bean>
    </beans>

  • #2
    One problem I see is that you haven't declared any transactions around your JobRepository. The user guide contains directions for this:

    http://static.springframework.org/sp...n.html#d0e3156

    This shouldn't cause any records to be missing, though. Looking at your configuration, I can't see anything that stands out. Usually when records are missing it is related to skipping records, but you're using a setup in which any exception encountered will cause the step to fail. Without being able to debug through it myself, it seems there could potentially be an issue with the BatchSqlUpdateItemWriter. Could you try writing a normal DAO to insert the records to see if that helps? (You could also use the SimpleJdbcInsert class from Spring proper.)



    • #3
      Solved...

      Hello Everyone,

      Well, after digging around in my code and that of the BatchSqlUpdateItemWriter, I found the problem. The coordinate class is based on the java.awt.geom.Point class, which overrides the equals() and hashCode() methods. The BatchSqlUpdateItemWriter uses a java.util.Set to collect the items before flushing. A lot of the coordinates are apparently at the same location, so equals() evaluated to true for them and the Set silently dropped the duplicate coordinates. With the default commitInterval each item is flushed on its own, so there is never a second item in the Set to collide with, which is why records only went missing at larger intervals. After creating my own equals() and hashCode() methods everything works fine.
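The effect described above can be reproduced outside Spring Batch. The following is only a minimal sketch under stated assumptions: `java.awt.geom.Point2D.Double` stands in for the coordinate's base class, a `LinkedHashSet` stands in for whatever `Set` implementation the writer buffers into, and the class and method names (`DuplicateDropDemo`, `bufferedCount`, `sampleItems`) are hypothetical. It is not the writer's actual code.

```java
import java.awt.geom.Point2D;
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical demo class; not part of Spring Batch.
final class DuplicateDropDemo {

    // Adds every item to the given buffer and reports how many it actually holds.
    static int bufferedCount(Collection<Point2D> buffer, List<Point2D> items) {
        buffer.addAll(items);
        return buffer.size();
    }

    // Three records, two of which share the same location.
    static List<Point2D> sampleItems() {
        List<Point2D> items = new ArrayList<>();
        items.add(new Point2D.Double(4.47, 51.92));
        items.add(new Point2D.Double(4.47, 51.92)); // same location, different record
        items.add(new Point2D.Double(4.48, 51.93));
        return items;
    }

    public static void main(String[] args) {
        List<Point2D> items = sampleItems();
        // Point2D.equals() compares x and y only, so the Set keeps one of the two equal points.
        System.out.println("Set buffer:  " + bufferedCount(new LinkedHashSet<>(), items));
        // A List keeps every element regardless of equals().
        System.out.println("List buffer: " + bufferedCount(new ArrayList<>(), items));
    }
}
```

With a buffer of one item per commit the two equal points never meet in the same Set, which matches the observation that nothing was lost at the default commitInterval.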

      My only question is why the BatchSqlUpdateItemWriter uses a Set for collecting items when a List seems more suitable for the job. Anyway, thanks for the quick response!

      Cheers,

      Herman



      • #4
        Originally posted by Westerflyer
        My only question is why the BatchSqlUpdateItemWriter uses a Set for collecting items when a List seems more suitable for the job. Anyway, thanks for the quick response!
        I'm glad you were able to get it to work for you. I'll leave the above question to Dave to answer, since he wrote that particular writer. However, I will say that having a correct equals() and hashCode() is important in Spring Batch (besides being good practice). If you use the skip functionality, the framework will use equals() to identify skipped records as well. The same applies to retry.
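As an illustration of what a "correct" equals() and hashCode() might look like for this data, here is a hedged sketch. The thread does not show Herman's actual implementation, so everything here is an assumption: the class name `TemporalCoordinate`, the field names, the choice to include the `Idx` column in the comparison, and the omission of the `Date` column for brevity.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical item class; the real coordinate class is not shown in the thread.
final class TemporalCoordinate {
    private final long idx;          // assumed to be unique per source row (the Idx column)
    private final double longitude;
    private final double latitude;

    TemporalCoordinate(long idx, double longitude, double latitude) {
        this.idx = idx;
        this.longitude = longitude;
        this.latitude = latitude;
    }

    // Two items are equal only if they are the same record, not merely the same location.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof TemporalCoordinate)) {
            return false;
        }
        TemporalCoordinate that = (TemporalCoordinate) other;
        return idx == that.idx
                && Double.compare(longitude, that.longitude) == 0
                && Double.compare(latitude, that.latitude) == 0;
    }

    // Must use the same fields as equals() so equal objects share a hash code.
    @Override
    public int hashCode() {
        return Objects.hash(idx, longitude, latitude);
    }

    // Helper for the demo: how many of the given items survive in a HashSet.
    static int distinctCount(TemporalCoordinate... items) {
        Set<TemporalCoordinate> set = new HashSet<>();
        for (TemporalCoordinate item : items) {
            set.add(item);
        }
        return set.size();
    }

    public static void main(String[] args) {
        // Same location, different rows: both are kept.
        System.out.println(distinctCount(
                new TemporalCoordinate(1, 4.47, 51.92),
                new TemporalCoordinate(2, 4.47, 51.92)));
    }
}
```

Including the row identifier means that items at the same location remain distinct in any Set, while two objects built from the same row still compare equal, which is the property equality-based matching relies on.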
