Announcement Announcement Module
No announcement yet.
Parallel Job Scaling Question Page Title Module
Move Remove Collapse
Conversation Detail Module
  • Filter
  • Time
  • Show
Clear All
new posts

  • Parallel Job Scaling Question

    I'm currently working on a Spring Batch POC and have got a pretty good handle on most of the actual Spring Batch features. I've currently got a program that uses Spring Integration to receive an HttpRequest and use message channels to eventually send the job executions to the job launcher in a queue. What we'd really like to do is implement some kind of "scheduler/load balancer" (not quite sure what to call it) before the job launcher that will look at the currently running worker nodes and the size of the input file and make a decision on how many worker nodes the job should be allowed. We would probably also want to be able to change the amount of worker nodes a job has while it is running to allow more jobs to run.

    The idea is that we'd have a server running that could accept many job requests at any time, and a large cluster of machines that jobs will be partitioned onto. We'd like to be able to scale horizontally, so whenever the server isn't busy it can make full use of the hardware, as well as being able to make sure that small jobs don't get constantly blocked by larger jobs.

    From my research it seems like we'd have to implement another framework to do this (do GridGain and Hadoop allow this?), but I figured I'd ask to see what people recommended to do something like this, and if there's a way to do it without implementing another large framework.

    Sorry if anything is unclear or confusing, I'm just a lowly intern who started learning Spring and Spring Batch last month and I'm far from completely understanding everything, especially this scaling stuff. Just ask and I'll try to clear things up.

    Thanks for any help!

  • #2
    GridGain certainly does it (and has been for the last 4 years). The best way is too look at examples and our JavaDoc as starting points.


    • #3
      I'm using it without GridGain, just using Remote Chunking from Spring Batch. You can start as many slaves you want.
      But you still need something to deal your messages, using ActiveMQ here. The samples from spring source in github are configured to use it too.