Flow definition and partitioned steps

  • Flow definition and partitioned steps

    Hello !
    I've created a job with partitioned steps like this:
    HTML Code:
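            <batch:job id="partitionJob">
                    <batch:step id="step1" next="step2">
                            <batch:partition step="xml_to_tsv_step" partitioner="partitioner">
                                    <batch:handler grid-size="5" task-executor="taskExecutor" />
                            </batch:partition>
                    </batch:step>
                    <batch:step id="step2">
                            <batch:partition step="tsv_to_somethingelse_step" partitioner="partitioner2">
                                    <batch:handler grid-size="5" task-executor="taskExecutor" />
                            </batch:partition>
                    </batch:step>
            </batch:job>
            <!-- Assumption: both partitioners are MultiResourcePartitioner beans (the class named in the text below);
                 step scope is assumed on the first one so that jobParameters can be late-bound. -->
            <bean id="partitioner" class="org.springframework.batch.core.partition.support.MultiResourcePartitioner" scope="step">
                    <property name="resources" value="file:#{jobParameters['']}" />
            </bean>
            <bean id="partitioner2" class="org.springframework.batch.core.partition.support.MultiResourcePartitioner">
                    <property name="resources" value="file:myfiles*.tsv" />
            </bean>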
    This works fine, in the sense that:
    1. My first step is executed in parallel for all the files I pass in as a job parameter, and produces a file named myfileSomething.tsv for each input file.
    2. My second step is THEN executed in parallel for all the tsv files generated in step1.

    But instead, I would like to chain step1 and step2 as a partitioned flow, i.e. a single MultiResourcePartitioner would split all my input files, and step1 then step2 would be applied in order to each input file. That way I could do some file cleanup in a further step, removing my tsv files on the go as they are processed...

    I tried to do it as described in the second example:
    HTML Code:
    <job id="job">
        <step id="job1.flow1" flow="flow1" next="step3"/>
        <step id="step3" parent="s3"/>
    ... unfortunately STS complains that the flow attribute is not allowed on a step...
    Apart from that, I couldn't find a way to partition a flow rather than a step. Is there an easy way? Sorry if the question looks obvious... and thanks for your help ;-)
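
    A note on the STS complaint above: as far as I know, the Spring Batch 2.1 schema references an external flow from a job with a `flow` element and a `parent` attribute, not with a `flow` attribute on a `step`. A minimal sketch, assuming the `batch` namespace prefix used earlier in this post and parent steps s1, s2 and s3 defined elsewhere (those names are taken from the quoted example, their definitions are not shown):

    ```xml
    <beans xmlns="http://www.springframework.org/schema/beans"
           xmlns:batch="http://www.springframework.org/schema/batch"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="
             http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
             http://www.springframework.org/schema/batch http://www.springframework.org/schema/batch/spring-batch-2.1.xsd">

        <!-- Reusable flow, declared outside the job -->
        <batch:flow id="flow1">
            <batch:step id="step1" parent="s1" next="step2"/>
            <batch:step id="step2" parent="s2"/>
        </batch:flow>

        <batch:job id="job">
            <!-- The flow is pulled in with a flow element, not a step attribute -->
            <batch:flow id="job1.flow1" parent="flow1" next="step3"/>
            <batch:step id="step3" parent="s3"/>
        </batch:job>
    </beans>
    ```

    This only shows how to reference a flow from a job; it does not partition anything by itself.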

    Best regards,

    Last edited by psoares; Oct 3rd, 2011, 05:47 AM.

  • #2
    Hi, I think I'm doing the same thing you're trying to achieve - I ran into some problems and posted about it in this thread, perhaps the batch definitions there could help you get a bit further?
    In my case I'm partitioning over all files in a directory, then executing a conditional flow for every file (handle the file, then move it to a "done" or "error" folder depending on the outcome).
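
    The batch definitions from that thread are not reproduced here, so the following is only a rough, untested sketch of what "partition, then run a conditional flow per file" might look like. It assumes the Spring Batch 2.1 `batch` namespace, in which a step may wrap a flow through a nested `flow` element. All ids (perFileFlowStep, perFileFlow, handleFileStep, moveToDoneStep, moveToErrorStep) are hypothetical, and the partitioner and taskExecutor beans are assumed to exist as in the first post:

    ```xml
    <beans xmlns="http://www.springframework.org/schema/beans"
           xmlns:batch="http://www.springframework.org/schema/batch"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="
             http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
             http://www.springframework.org/schema/batch http://www.springframework.org/schema/batch/spring-batch-2.1.xsd">

        <batch:job id="partitionedFlowJob">
            <batch:step id="masterStep">
                <!-- One partition per file; each partition runs the flow step below -->
                <batch:partition step="perFileFlowStep" partitioner="partitioner">
                    <batch:handler grid-size="5" task-executor="taskExecutor"/>
                </batch:partition>
            </batch:step>
        </batch:job>

        <!-- A step whose whole body is a flow -->
        <batch:step id="perFileFlowStep">
            <batch:flow parent="perFileFlow"/>
        </batch:step>

        <batch:flow id="perFileFlow">
            <batch:step id="handleFile" parent="handleFileStep">
                <!-- Route on the exit status of the file-handling step -->
                <batch:next on="COMPLETED" to="moveToDone"/>
                <batch:next on="FAILED" to="moveToError"/>
            </batch:step>
            <batch:step id="moveToDone" parent="moveToDoneStep"/>
            <batch:step id="moveToError" parent="moveToErrorStep"/>
        </batch:flow>
    </beans>
    ```

    Whether the inner steps of the flow behave well when several partitions run them concurrently is something to verify against your Spring Batch version.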

    -- Ton
    Last edited by tonvanbart; Oct 3rd, 2011, 02:27 PM. Reason: wrong URL


    • #3
      Hi Ton,
      Thank you very much! I'll look into this first thing tomorrow at work and compare it with what I did.
      I managed to solve my problem by defining a job with sequential steps, corresponding to what I want to do with a single file (it's a linear flow, nothing conditional here for now).

      Then I defined a second job with a single partitioned step, which uses the partitioner to process all the files and has my first job as an inner bean.
      I also put a step listener on that step, which takes the current filename from the step execution context and stores it in the job execution context. That way, in job #1, I'm able to retrieve the filename at every step.
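
      Until the actual file is posted, here is a rough, untested sketch of the general shape of such a setup, assuming the Spring Batch 2.1 `batch` namespace and its nested `job` element for running a job as a step. It differs from the description above in two ways that should be read as assumptions: the inner job is referenced by id rather than declared as an inner bean, and the filename is handed over with a `DefaultJobParametersExtractor` instead of a step listener. The ids perFileJob, allFilesJob, perFileJobStep, fileNameExtractor and cleanupStep are hypothetical, and a jobLauncher bean is assumed to exist. `fileName` is the default key under which `MultiResourcePartitioner` stores each resource in the partition's step execution context.

      ```xml
      <beans xmlns="http://www.springframework.org/schema/beans"
             xmlns:batch="http://www.springframework.org/schema/batch"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="
               http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
               http://www.springframework.org/schema/batch http://www.springframework.org/schema/batch/spring-batch-2.1.xsd">

          <!-- Job #1: what happens to a single file, as a linear sequence -->
          <batch:job id="perFileJob">
              <batch:step id="xmlToTsv" parent="xml_to_tsv_step" next="tsvToSomethingElse"/>
              <batch:step id="tsvToSomethingElse" parent="tsv_to_somethingelse_step" next="cleanup"/>
              <batch:step id="cleanup" parent="cleanupStep"/>
          </batch:job>

          <!-- Job #2: a single partitioned step, one partition per input file -->
          <batch:job id="allFilesJob">
              <batch:step id="allFiles">
                  <batch:partition step="perFileJobStep" partitioner="partitioner">
                      <batch:handler grid-size="5" task-executor="taskExecutor"/>
                  </batch:partition>
              </batch:step>
          </batch:job>

          <!-- Each partition launches job #1 as a step -->
          <batch:step id="perFileJobStep">
              <batch:job ref="perFileJob" job-launcher="jobLauncher"
                         job-parameters-extractor="fileNameExtractor"/>
          </batch:step>

          <!-- Copies fileName from the step execution context into job #1's parameters -->
          <bean id="fileNameExtractor"
                class="org.springframework.batch.core.step.job.DefaultJobParametersExtractor">
              <property name="keys" value="fileName"/>
          </bean>
      </beans>
      ```

      With this variant, step-scoped beans in job #1 could read the file through `#{jobParameters['fileName']}` rather than through the job execution context.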

      I'll post my file here tomorrow to share what I did.

      Best regards and thanks again.