
  • How To: Rollback all the previous chunks

    Hello Guys

    I am working with Spring Batch 2.1.9

    I have mostly the following configuration:

    Code:
    <bean id="transactionManager"
          class="org.springframework.jdbc.datasource.DataSourceTransactionManager" >
       <property name="dataSource" ref="dataSource" />	   
    </bean>
    
    <batch:job-repository id="jobRepository"
    			      data-source="dataSource"
            	          transaction-manager="transactionManager"
    			      isolation-level-for-create="SERIALIZABLE"
    				table-prefix="BATCH_"
    		                   />
    And

    Code:
    <bean id="clienteFlatFileItemReader"
    		  class="org.springframework.batch.item.file.FlatFileItemReader">
    			<property name="resource" value="classpath:/inputdata/csv/clientes.csv"/>
    			<property name="linesToSkip" value="1"/>
    			<property name="recordSeparatorPolicy" ref="defaultRecordSeparatorPolicy"/>
    			<property name="lineMapper" ref="clienteLineMapper"/>
    </bean>
    
    <bean id="clienteErrorChunk0101FlatFileItemReader"
    	  parent="clienteFlatFileItemReader" >
    			<property name="resource" value="classpath:/inputdata/csv/clientes-error-chunk01-01.csv"/>			
    </bean>
    The clientes-error-chunk01-01.csv file has 500 records like:

    Code:
    idCliente,nombreCliente,apellidoCliente
    41007001,Manuel01,Jordan01
    41007002,Manuel02,Jordan02
    41007003,Manuel03,Jordan03
    41007004,Manuel04,Jordan04
    41007005,Manuel05,Jordan05
    41007006,Manuel06,Jordan06
    41007007,Manuel07,Jordan07
    41007008,Manuel08,Jordan08
    41007009,Manuel09,Jordan09
    41007010,Manuel10,Jordan10
    41007011,Manuel11,Jordan11
    41007012,Manuel12,Jordan12
    41007013,Manuel13,Jordan13
    .....
    41007497,Manuel497,Jordan497
    41007498,Manuel498,Jordan498
    41007499,Manuel499,Jordan499, EXTRA EXTRA!!!!!!!
    41007500,Manuel500,Jordan500
    See how my record number 499 has an extra field.

    Now I have a simple Job

    Code:
    <import resource="classpath:/jobs/definitions/beans/job-*-beans.xml"/>

    <batch:job id="simpleErrorChunk0101ImportDataJob"
               job-repository="jobRepository">
        <batch:step id="clienteStep">
            <batch:tasklet>
                <batch:chunk reader="clienteErrorChunk0101FlatFileItemReader"
                             writer="clienteJdbcBatchWriter"
                             commit-interval="50"/>
            </batch:tasklet>
        </batch:step>
    </batch:job>
    Note that my commit-interval is 50, so 500/50 = 10 chunks.

    When I execute this code, everything works fine until the last chunk, where it fails as expected.

    Therefore 450 of 500 records are inserted.

    If my memory doesn't fail me, the point of committing each chunk as it completes is restartability: if we re-start/re-run the job later, all the previously committed data is kept and the job moves forward from the failed point.
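
    A rough sketch of what I mean (the launcher wiring and parameter name here are just illustrative, not my real code):

    Code:
    import org.springframework.batch.core.Job;
    import org.springframework.batch.core.JobParameters;
    import org.springframework.batch.core.JobParametersBuilder;
    import org.springframework.batch.core.launch.JobLauncher;

    public class RestartExample {

        // jobLauncher and job would come from the application context;
        // the wiring is omitted in this sketch
        public void runAndRestart(JobLauncher jobLauncher, Job job) throws Exception {
            JobParameters params = new JobParametersBuilder()
                    .addString("input.file", "clientes-error-chunk01-01.csv")
                    .toJobParameters();

            jobLauncher.run(job, params); // first run: fails at the 10th chunk, 450 rows stay committed
            // ... fix record 41007499 in the CSV ...
            jobLauncher.run(job, params); // same parameters => restart, resumes from item 451
        }
    }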

    I have a question: what happens if for some reason I want the following behavior?

    If any chunk fails, all the previously committed chunks must be rolled back too.

    I know it would be painful for performance, but I want to know if I have the option and how I would do it.

    Is it possible?

    Thanks in advance

  • #2
    Hi Manuel,

    Spring Batch will not allow you to do that out-of-the-box, but if for whatever reason you want this behavior you could build it yourself. I assume that your batch is reading the CSV file and inserting the content of each item into a table or something.
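
    For example, a step listener could issue a compensating delete when the step fails. A rough sketch (the CLIENTE table name is an assumption based on your CSV, and it supposes this job is the only writer to that table):

    Code:
    import javax.sql.DataSource;

    import org.springframework.batch.core.BatchStatus;
    import org.springframework.batch.core.ExitStatus;
    import org.springframework.batch.core.StepExecution;
    import org.springframework.batch.core.StepExecutionListener;
    import org.springframework.jdbc.core.JdbcTemplate;

    public class CompensatingDeleteListener implements StepExecutionListener {

        private final JdbcTemplate jdbcTemplate;

        public CompensatingDeleteListener(DataSource dataSource) {
            this.jdbcTemplate = new JdbcTemplate(dataSource);
        }

        public void beforeStep(StepExecution stepExecution) {
            // nothing to do before the step
        }

        public ExitStatus afterStep(StepExecution stepExecution) {
            if (stepExecution.getStatus() == BatchStatus.FAILED) {
                // undo the chunks that already committed
                jdbcTemplate.update("DELETE FROM CLIENTE");
            }
            return stepExecution.getExitStatus();
        }
    }
    You would register it on the step with <batch:listeners> inside your <batch:tasklet>.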

    Chunk processing allows you to process large amounts of data in a transactional way. There is the naive option of increasing the chunk size so that the whole file fits in one transaction, but that's not going to work: it increases the necessary size of your rollback segment, and your job will not scale with large files.

    Another option is to split your job into two steps: one that reads the file and inserts the data into a staging area, and one that reads from the staging area and updates your actual table. Or you could have a pre-processor that reads your whole file first to validate that the content is OK.
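
    The two-step variant could look roughly like this (the staging reader/writer beans are hypothetical; you would define them like your existing ones):

    Code:
    <batch:job id="stagedImportDataJob" job-repository="jobRepository">
        <!-- step 1: load the CSV into a staging table; a bad record
             fails the job before the real table is touched -->
        <batch:step id="loadStagingStep" next="copyToClienteStep">
            <batch:tasklet>
                <batch:chunk reader="clienteErrorChunk0101FlatFileItemReader"
                             writer="clienteStagingJdbcBatchWriter"
                             commit-interval="50"/>
            </batch:tasklet>
        </batch:step>
        <!-- step 2: only reached if step 1 completed; copy the staging
             rows into the actual table -->
        <batch:step id="copyToClienteStep">
            <batch:tasklet>
                <batch:chunk reader="clienteStagingJdbcCursorItemReader"
                             writer="clienteJdbcBatchWriter"
                             commit-interval="50"/>
            </batch:tasklet>
        </batch:step>
    </batch:job>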

    HTH,
    S.



    • #3
      Hi Stéphane,

      Thanks for the reply

      Spring Batch will not allow you to do that out-of-the-box, but if for whatever reason you want this behavior you could build it yourself.
      Makes sense. Some bosses have weird requirements and never listen to the reasons against them.

      I assume that your batch is reading the CSV file and inserting the content of each item in a table or something.
      Yes, you're correct

      Chunk processing allows you to process large amounts of data in a transactional way. There is the naive option of increasing the chunk size so that the whole file fits in one transaction, but that's not going to work: it increases the necessary size of your rollback segment, and your job will not scale with large files.
      I am fine with the chunk's commit-interval.

      Another option is to split your job into two steps: one that reads the file and inserts the data into a staging area, and one that reads from the staging area and updates your actual table.
      Mmm, interesting idea.

      Or you could have a pre-processor that reads your whole file first to validate that the content is OK.
      I think it's the classic approach.
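
      Something like a first step that just reads the whole file and discards the items, I suppose. A rough sketch (Cliente is the domain class from my line mapper; the step wiring is omitted):

      Code:
      import org.springframework.batch.core.StepContribution;
      import org.springframework.batch.core.scope.context.ChunkContext;
      import org.springframework.batch.core.step.tasklet.Tasklet;
      import org.springframework.batch.item.ExecutionContext;
      import org.springframework.batch.item.file.FlatFileItemReader;
      import org.springframework.batch.repeat.RepeatStatus;

      public class CsvValidationTasklet implements Tasklet {

          private final FlatFileItemReader<Cliente> reader;

          public CsvValidationTasklet(FlatFileItemReader<Cliente> reader) {
              this.reader = reader;
          }

          public RepeatStatus execute(StepContribution contribution,
                                      ChunkContext chunkContext) throws Exception {
              reader.open(new ExecutionContext());
              try {
                  // read every record and throw it away; a malformed line makes
                  // the reader fail, so the job stops before any insert happens
                  while (reader.read() != null) {
                  }
              } finally {
                  reader.close();
              }
              return RepeatStatus.FINISHED;
          }
      }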

      Thanks for the ideas.

      Practically speaking, restarting the job is the most obvious solution.
