
  • Partition and grid size

    Hi all,

    I need to read 10,000 records from the database, process them, and write the updates back to the database.

    For this job I am using partitioning, with Spring Batch 2.0.

    It works fine with gridSize = 2. If I increase the grid size to 3, the job gets stuck: three threads are created, each thread executes once, and then all of them hang.

    Can gridSize be any value in 0..N? What rule should I follow when choosing the gridSize?

    With ThreadPoolTaskExecutor as the taskExecutor, I get the same issue as described above.

    With ConcurrentTaskExecutor as the taskExecutor, all 10,000 records are processed, but the threads stay active and the job remains active, so we have to terminate the job manually.

    With SyncTaskExecutor as the taskExecutor, it works for any gridSize in 0..N.


    ******************** my XML ******************************

    <batch:job id="Prices_LoadJob" restartable="true">
        <batch:step id="Prices_LoadStep" parent="Prices_Load:master" />
    </batch:job>

    <bean name="Prices_Load:master"
          class="org.springframework.batch.core.partition.support.PartitionStep">
        <property name="jobRepository" ref="jobRepository" />
        <property name="stepExecutionSplitter">
            <bean class="org.springframework.batch.core.partition.support.SimpleStepExecutionSplitter">
                <constructor-arg ref="jobRepository" />
                <constructor-arg ref="PriceLoad" />
                <constructor-arg>
                    <bean class="org.springframework.batch.core.partition.support.SimplePartitioner">
                    </bean>
                </constructor-arg>
            </bean>
        </property>

        <property name="partitionHandler">
            <bean class="org.springframework.batch.core.partition.support.TaskExecutorPartitionHandler">
                <property name="taskExecutor">

                    <bean class="org.springframework.core.task.SimpleAsyncTaskExecutor" />

                    <!--
                    <bean
                        class="org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor"
                        p:queueCapacity="20" p:corePoolSize="20" p:maxPoolSize="20" />
                    -->
                </property>
                <property name="step" ref="PriceLoad" />
                <property name="gridSize" value="3"></property>
            </bean>
        </property>
    </bean>


    <batch:step id="PriceLoad">
        <batch:tasklet job-repository="jobRepository"
                       transaction-manager="transactionManager">
            <batch:chunk reader="pricesLoadReader" processor="pricesLoadProcessor"
                         writer="pricesLoadWriter" commit-interval="6">
            </batch:chunk>
            <batch:listeners>
                <batch:listener ref="stepExecutionListener" />
            </batch:listeners>
        </batch:tasklet>
    </batch:step>

    <!--
        Prices Load reader: retrieves the records from the DaysPrice table,
        ordered by StartDate
    -->
    <bean id="pricesLoadReader"
          class="priceload.reader.PriceLoadReader"
          scope="prototype" />

    <!--
        Prices Load processor: each DaysPrice record is matched against the
        corresponding data in the Prices table and then validated
    -->
    <bean id="pricesLoadProcessor"
          class="priceload.processor.PriceLoadProcessor"
          scope="prototype" />

    <!--
        Prices Load writer: updates the record in the Prices table and in
        the Price History table
    -->
    <bean id="pricesLoadWriter"
          class="priceload.writer.PriceLoadWriter"
          scope="prototype" />

    *********************************************************

    Thanks in advance for your reply.
    Last edited by arun4; Mar 14th, 2010, 07:50 AM.

  • #2
    What exactly do you mean by "stuck"? Can you trace your application? It may be a deadlock at the database level; increasing the concurrency of a job tends to reveal that sort of thing.

    Regarding your question: if your gridSize is 10, the 10,000 records are split into 10 manageable pieces that you may or may not run concurrently. If you're using a SimpleAsyncTaskExecutor, it will create one thread per partition. You probably want to constrain the number of threads by using a ThreadPoolTaskExecutor.

    So partitioning is about dividing the work. After that, your threading configuration defines the concurrency of the actual processing.
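
    As a rough sketch of that separation (not taken from the original poster's setup), the fragment below keeps gridSize at 10 while a ThreadPoolTaskExecutor caps concurrency at 4 threads. The values 10 and 4 are arbitrary and chosen only for illustration; the `PriceLoad` step reference is the one from the first post. No queueCapacity is set, so the executor's default unbounded queue holds the partitions that are waiting for a free thread.

    ```xml
    <bean class="org.springframework.batch.core.partition.support.TaskExecutorPartitionHandler">
        <!-- Number of partitions (pieces of work) to create; the value is illustrative -->
        <property name="gridSize" value="10" />
        <property name="step" ref="PriceLoad" />
        <property name="taskExecutor">
            <!-- At most 4 partitions run at once; the other 6 wait in the executor's queue -->
            <bean class="org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor">
                <property name="corePoolSize" value="4" />
                <property name="maxPoolSize" value="4" />
            </bean>
        </property>
    </bean>
    ```

    With this shape, raising gridSize changes how finely the work is divided, while the pool size alone decides how many partitions execute at the same moment.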



    • #3
      Hi,
      Thanks for your reply.

      I changed the taskExecutor to:
      *********************** code *****************************
      <bean id="taskExecutor" class="org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor">
          <property name="corePoolSize" value="10" />
          <property name="maxPoolSize" value="20" />
          <property name="queueCapacity" value="30" />
      </bean>
      *********************************************************

      but I am still facing the same problem.


      For example, if I process 1,000 records with commit-interval = 6 and gridSize = 3, only the first 18 records are processed. After that, processing stops abruptly with no errors or exceptions, but the threads are still alive (execution has not stopped).



      • #4
        Of course you're facing the same problem. Configure the pool with 3 threads instead of 10 and you will most likely get exactly the same result, since you only have 3 partitions anyway.

        The problem is most probably in your code.



        • #5
          Hi,
          Thanks for the quick reply.

          1. It is working fine with ThreadPoolTaskExecutor. All records are processed, but the job never exits.

          2. Even though I have 10 threads available to run 10 partitions, at most 3 threads are used to process all the records.

          I have attached the Batch_Step_Execution table details for your reference.



          • #6
            Please read my replies. The grid size determines the number of partitions. If you set the grid size to 3, you can have a ThreadPoolTaskExecutor with a zillion threads and it will still use only 3 of them, since one thread handles one partition.
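
            One related detail, offered as a sketch rather than a diagnosis: SimplePartitioner only creates gridSize empty ExecutionContext instances (labelled partition0, partition1, and so on), so by itself it does not tell each partition which records to read. A common arrangement is a custom Partitioner that stores a key range in each context, plus a step-scoped reader that picks the range up through late binding. In the fragment below, `priceload.partition.IdRangePartitioner`, the `minId`/`maxId` context keys, and the reader's `minId`/`maxId` properties are all hypothetical names, not classes or properties shown in this thread, and whether the poster's reader already divides the records some other way is not known. The expression syntax is the Spring Batch 2.0 form; later versions expect quoted keys such as `#{stepExecutionContext['minId']}`.

            ```xml
            <!-- Hypothetical partitioner: puts a minId/maxId pair into each partition's ExecutionContext -->
            <bean id="idRangePartitioner" class="priceload.partition.IdRangePartitioner" />

            <!-- Step-scoped reader: one instance per partition, bound to that partition's range -->
            <bean id="pricesLoadReader" class="priceload.reader.PriceLoadReader" scope="step">
                <!-- minId/maxId are assumed properties; the thread does not show the reader's real API -->
                <property name="minId" value="#{stepExecutionContext[minId]}" />
                <property name="maxId" value="#{stepExecutionContext[maxId]}" />
            </bean>
            ```

            The partitioner bean would then replace the SimplePartitioner passed as the third constructor argument of SimpleStepExecutionSplitter.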

            I am confused by your various posts. I recommend setting your logs to debug and checking what your code is doing. If it works fully in single-threaded mode but not with concurrency, there's a good chance there's a deadlock somewhere.
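
            For the debug logging mentioned here, a minimal sketch, assuming the application logs through log4j 1.x configured via a log4j.xml file (the thread does not say which logging framework is in use). The `priceload` logger name is taken from the package of the poster's reader, processor, and writer classes.

            ```xml
            <!-- Fragment for log4j.xml, placed inside the <log4j:configuration> element -->
            <logger name="org.springframework.batch">
                <level value="DEBUG" />
            </logger>
            <logger name="priceload">
                <level value="DEBUG" />
            </logger>
            ```

            A thread dump taken while the job hangs (for example with the JDK's jstack tool) would additionally show where each partition thread is blocked.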



            • #7
              Hi,

              We changed the code. There is no ThreadPoolTaskExecutor now; we are using only SimpleAsyncTaskExecutor.

              Even so, we are still facing some issues.

              For gridSize = 2 it works fine (I have attached the batch_step_execution table as a JPG). For a grid size greater than 2 we face the same issues:
              1. Processing of records stops abruptly with no errors or exceptions, but the threads are still alive (execution has not stopped).

              2. No records are updated in the Batch_Step_Execution table.

              Could you please suggest what my grid size should be if I need to process 1,000 records?
              Is there any logic for choosing the grid size?

              My XML code:

              ******************************* code ********************

              <batch:job id="Prices_LoadJob" restartable="true">
                  <batch:step id="Prices_LoadStep" parent="Prices_Load:master" />
              </batch:job>

              <bean name="Prices_Load:master"
                    class="org.springframework.batch.core.partition.support.PartitionStep">
                  <property name="jobRepository" ref="jobRepository" />
                  <property name="stepExecutionSplitter">
                      <bean class="org.springframework.batch.core.partition.support.SimpleStepExecutionSplitter">
                          <constructor-arg ref="jobRepository" />
                          <constructor-arg ref="PriceLoad" />
                          <constructor-arg>
                              <bean class="org.springframework.batch.core.partition.support.SimplePartitioner">
                              </bean>
                          </constructor-arg>
                      </bean>
                  </property>

                  <property name="partitionHandler">
                      <bean class="org.springframework.batch.core.partition.support.TaskExecutorPartitionHandler">
                          <property name="taskExecutor">
                              <bean class="org.springframework.core.task.SimpleAsyncTaskExecutor" />
                          </property>
                          <property name="step" ref="PriceLoad" />
                          <property name="gridSize" value="3"></property>
                      </bean>
                  </property>
              </bean>

              <batch:step id="PriceLoad">
                  <batch:tasklet job-repository="jobRepository"
                                 transaction-manager="transactionManager">
                      <batch:chunk reader="pricesLoadReader" processor="pricesLoadProcessor"
                                   writer="pricesLoadWriter" commit-interval="8">
                      </batch:chunk>
                      <batch:listeners>
                          <batch:listener ref="stepExecutionListener" />
                      </batch:listeners>
                  </batch:tasklet>
              </batch:step>


              ********************************************************
              Last edited by arun4; Mar 15th, 2010, 10:10 AM.

