  • Iterate/Loop over job?

    hi all,

    I have a graph algorithm for which I need to run a MapReduce job iteratively, each time using the output path of one iteration as the input path of the next. Currently I can run it iteratively using org.apache.hadoop.mapreduce.Job and org.apache.hadoop.util.ToolRunner, launched from the command line.

    How can I do this with Spring Hadoop?
    I found that I can chain different MapReduce jobs using Spring Batch, but I don't know how to iterate over the same MapReduce job using Spring Hadoop/Batch (each time using the output path of one iteration as the input path of the next).

    Any suggestions and references will be greatly appreciated.

    Kind Regards,
    Anthony Mak

  • #2
    Loops are supported by Spring Batch - see the user guide for more info [1].
    As for the dynamic nature, you can use SpEL (Spring Expression Language) or a dedicated FactoryBean for programmatic generation - see the late-binding section.
    In short, rather than passing a 'hard-coded' value to each step, you can pass either a SpEL expression that gets evaluated each time (the easiest route) or a reference to a factory that returns the proper value.
    I recommend the SpEL route, since it provides first-class support for invoking methods on beans or arbitrary classes without forcing you to implement an interface or extend a class.
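
To make the SpEL suggestion concrete, here is a minimal sketch of the kind of plain bean such an expression could call. The class name IterationPaths, its methods, and the "iter-N" directory layout are all hypothetical and do not come from Spring Hadoop or Spring Batch. It only shows the path bookkeeping: each iteration reads the directory that the previous one wrote.

```java
/**
 * Hypothetical helper (not part of Spring Hadoop or Spring Batch) that
 * tracks which directory the current iteration reads and which it writes.
 * The "iter-N" naming scheme is an assumption made for this sketch.
 */
public class IterationPaths {
    private final String baseDir;
    private final int maxIterations;
    private int iteration = 0;

    public IterationPaths(String baseDir, int maxIterations) {
        this.baseDir = baseDir;
        this.maxIterations = maxIterations;
    }

    /** Input of the current iteration, i.e. the previous iteration's output. */
    public String inputPath() {
        return baseDir + "/iter-" + iteration;
    }

    /** Output of the current iteration; it becomes the next iteration's input. */
    public String outputPath() {
        return baseDir + "/iter-" + (iteration + 1);
    }

    /** Call after each completed run; returns true while another pass is wanted. */
    public boolean advance() {
        iteration++;
        return iteration < maxIterations;
    }

    public static void main(String[] args) {
        // Stand-in for the batch loop: print the paths each pass would use.
        IterationPaths paths = new IterationPaths("/data/graph", 3);
        do {
            System.out.println(paths.inputPath() + " -> " + paths.outputPath());
        } while (paths.advance());
    }
}
```

Assuming the class is registered as a bean named iterationPaths (again a hypothetical name), a late-bound expression along the lines of #{iterationPaths.inputPath()} would then be re-evaluated on each pass, while the loop decision in the Spring Batch flow could be driven by advance(). The exact wiring depends on your job configuration and is not shown here.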