  • Flow definition and partitioned steps

    Hello!
    I've created a job with partitioned steps, configured like this:
    Code:
            <batch:job id="partitionJob">
                    <batch:step id="step1" next="step2">
                            <batch:partition step="xml_to_tsv_step" partitioner="partitioner">
                                    <batch:handler grid-size="5" task-executor="taskExecutor" />
                            </batch:partition>
                    </batch:step>
                    <batch:step id="step2">
                            <batch:partition step="tsv_to_somethingelse_step"
                                    partitioner="partitioner2">
                                    <batch:handler grid-size="5" task-executor="taskExecutor" />
                            </batch:partition>
                    </batch:step>
            </batch:job>
    
            <bean id="partitioner"
                    class="org.springframework.batch.core.partition.support.MultiResourcePartitioner"
                    scope="step">
                    <property name="resources" value="file:#{jobParameters['input.file.name']}" />
            </bean>
    
            <bean id="partitioner2"
                    class="org.springframework.batch.core.partition.support.MultiResourcePartitioner"
                    scope="step">
                    <property name="resources" value="file:myfiles*.tsv" />
            </bean>
            [...]
    This works, in the sense that:
    1. The first step runs in parallel over all the files I pass in through the input.file.name job parameter, and produces a file named myfileSomething.tsv for each input file.
    2. The second step THEN runs in parallel over all the tsv files generated in step1.

    Instead, I would like to chain step1 and step2 as a partitioned flow, i.e. a single MultiResourcePartitioner would split my input files, and step1 and step2 would be applied in order to each input file. That way I could do some file cleanup in a later step, removing the tsv files as soon as they have been processed...

    I tried to do it as described in the second example at http://static.springsource.org/sprin...external-flows:
    Code:
    <job id="job">
        <step id="job1.flow1" flow="flow1" next="step3"/>
        <step id="step3" parent="s3"/>
    </job>
    ... unfortunately STS complains that the flow attribute is not allowed on a step...
    Apart from that, I couldn't find a way to partition a flow rather than a step. Is there an easy way? Sorry if the question looks obvious... and thanks for your help ;-)

    Best regards,

    Philippe
    Last edited by psoares; Oct 3rd, 2011, 06:47 AM.
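
    For reference, here is a minimal, untested sketch of what "partitioning a flow" might look like. It assumes a Spring Batch 2.1 or later schema, where a step can wrap a flow through a nested flow element rather than a flow attribute, and it assumes that xml_to_tsv_step and tsv_to_somethingelse_step are standalone step definitions from the elided part of the configuration. The ids fileFlow, fileFlowStep, fileFlow.step1, fileFlow.step2 and partitionedFlow are made up for illustration.

    ```xml
    <!-- Hypothetical flow: the two per-file steps chained in order -->
    <batch:flow id="fileFlow">
        <batch:step id="fileFlow.step1" parent="xml_to_tsv_step" next="fileFlow.step2" />
        <batch:step id="fileFlow.step2" parent="tsv_to_somethingelse_step" />
    </batch:flow>

    <!-- Hypothetical standalone step that delegates to the flow
         (nested <flow parent="..."/>, assuming the 2.1+ schema) -->
    <batch:step id="fileFlowStep">
        <batch:flow parent="fileFlow" />
    </batch:step>

    <!-- One partitioned step: each partition (one input file) runs the whole flow -->
    <batch:job id="partitionJob">
        <batch:step id="partitionedFlow">
            <batch:partition step="fileFlowStep" partitioner="partitioner">
                <batch:handler grid-size="5" task-executor="taskExecutor" />
            </batch:partition>
        </batch:step>
    </batch:job>
    ```

    How the per-file name reaches the steps inside the flow is left open here: the partitioner writes it to the execution context of the partitioned step, and the inner steps may not see that context directly.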

  • #2
    Hi, I think I'm doing the same thing you're trying to achieve. I ran into some problems and posted about it in this thread; perhaps the batch definitions there could help you get a bit further?
    In my case I'm partitioning over all files in a directory, then executing a conditional flow for every file (handle the file, then move it to a "done" or "error" folder depending on the outcome).

    -- Ton
    Last edited by tonvanbart; Oct 3rd, 2011, 03:27 PM. Reason: wrong URL
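
    The actual definitions from the linked thread are not reproduced here. As a rough illustration only, a conditional per-file flow of the kind described above could be sketched as follows. All ids (perFileFlow, handleFile, moveToDone, moveToError) and the parent steps they point to are hypothetical.

    ```xml
    <!-- Hypothetical conditional flow: handle the file, then branch on the outcome -->
    <batch:flow id="perFileFlow">
        <batch:step id="handleFile" parent="handleFileStep">
            <!-- Exit status FAILED goes to the "error" branch -->
            <batch:next on="FAILED" to="moveToError" />
            <!-- Any other exit status goes to the "done" branch -->
            <batch:next on="*" to="moveToDone" />
        </batch:step>
        <!-- Neither branch declares a next step, so the flow ends after either one -->
        <batch:step id="moveToDone" parent="moveToDoneStep" />
        <batch:step id="moveToError" parent="moveToErrorStep" />
    </batch:flow>
    ```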

    • #3
      Hi Ton,
      Thank you very much! I'll look into this first thing tomorrow at work and compare it with what I did.
      I managed to solve my problem by defining a job with sequential steps, corresponding to what I want to do with a single file (it's a linear flow... nothing conditional here for now).

      Then I defined a second job with a single partitioned step, which uses the partitioner to process all the files and has my first job as an inner bean.
      I also put a step listener on that step, which takes the current file name from the stepExecutionContext and stores it in the jobExecutionContext. That way, in job #1, I'm able to retrieve the file name at every step.

      I'll post my file here tomorrow to share what I did.

      Best regards and thanks again.

      Philippe
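
      Until the actual file is posted, the setup described above might look roughly like the sketch below. This is not the configuration from the post: it swaps the step listener for a DefaultJobParametersExtractor, which copies the fileName key (the default key written by MultiResourcePartitioner) from the step execution context into the inner job's parameters. The ids perFileJob, perFileJobStep, fileNameExtractor, allFiles and the jobLauncher bean are assumptions, as is a Spring Batch 2.1 or later schema.

      ```xml
      <!-- Hypothetical inner job: what should happen to a single file, in order -->
      <batch:job id="perFileJob">
          <batch:step id="perFile.step1" parent="xml_to_tsv_step" next="perFile.step2" />
          <batch:step id="perFile.step2" parent="tsv_to_somethingelse_step" />
      </batch:job>

      <!-- Hypothetical step that launches the inner job once per partition -->
      <batch:step id="perFileJobStep">
          <batch:job ref="perFileJob" job-launcher="jobLauncher"
              job-parameters-extractor="fileNameExtractor" />
      </batch:step>

      <!-- Passes the partition's fileName to the inner job as a job parameter
           (replaces the listener approach described in the post) -->
      <bean id="fileNameExtractor"
          class="org.springframework.batch.core.step.job.DefaultJobParametersExtractor">
          <property name="keys" value="fileName" />
      </bean>

      <!-- Outer job: a single partitioned step over all input files -->
      <batch:job id="partitionJob">
          <batch:step id="allFiles">
              <batch:partition step="perFileJobStep" partitioner="partitioner">
                  <batch:handler grid-size="5" task-executor="taskExecutor" />
              </batch:partition>
          </batch:step>
      </batch:job>
      ```

      With this variant, the steps of the inner job would read the file name through #{jobParameters['fileName']} rather than from the job execution context.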
