
  • Another style of Batch to support 24 x 7

    Take a financial system as an example: a job normally consists of a number of processes, and each process must complete its transactions on a set of accounts (those that fit the criteria) before the next process in the batch can start. Like this:

    Process A (Account 1,2) -> Process B (Account 1,2) -> Process C (Account 1,2)

    To support a 24 x 7 requirement, my system needs to invoke the processes of each account in sequence, concurrently with other accounts. Like this:

    Account 1 (Process A -> B -> C)
    Account 2 (Process A -> B -> C)


    The batch also needs to support ending and pausing a job.

    Does Spring Batch provide any feature for this need? Or can I simply look into Spring itself (e.g. TaskExecutor)? I also don't know whether the Spring framework can be configured to invoke a sequence of methods.
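
    For reference, the execution order asked for above can be sketched with plain java.util.concurrent. This is only an illustration under stated assumptions, not a Spring Batch feature: the names PerAccountChains, Account, runAll and demoChain are hypothetical, the three processes are stand-in arithmetic, and "end job" is reduced to a cooperative flag checked between processes (pausing is not covered).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

// Hypothetical sketch: all names here are illustrative, not Spring or Spring Batch API.
final class PerAccountChains {

    static final class Account {
        final int id;
        int amount; // written by exactly one worker thread
        Account(int id, int amount) { this.id = id; this.amount = amount; }
    }

    // Stand-ins for Process A, B and C (assumed logic, for the demo only).
    static List<Consumer<Account>> demoChain() {
        List<Consumer<Account>> chain = new ArrayList<>();
        chain.add(a -> a.amount += 10); // Process A
        chain.add(a -> a.amount *= 2);  // Process B
        chain.add(a -> a.amount -= 5);  // Process C
        return chain;
    }

    // Runs the whole chain for each account as one task; accounts run concurrently.
    static void runAll(List<Account> accounts, List<Consumer<Account>> chain,
                       int threads, AtomicBoolean stop) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Future<?>> futures = new ArrayList<>();
        for (Account account : accounts) {
            futures.add(pool.submit(() -> {
                for (Consumer<Account> process : chain) {
                    if (stop.get()) return; // end the job between two processes
                    process.accept(account);
                }
            }));
        }
        pool.shutdown();
        try {
            for (Future<?> f : futures) f.get(); // wait for every account's chain
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    public static void main(String[] args) {
        List<Account> accounts = new ArrayList<>();
        accounts.add(new Account(1, 100));
        accounts.add(new Account(2, 200));
        runAll(accounts, demoChain(), 2, new AtomicBoolean(false));
        for (Account a : accounts) System.out.println(a.id + " -> " + a.amount);
    }
}
```

    Because each account is handled by a single task, A, B and C stay ordered per account while different accounts overlap. Setting the flag to true before or during the run makes the remaining processes of each chain get skipped.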

  • #2
    I think what you are describing is a very common pattern which we call a pipeline batch execution. Spring Batch 1.0 will not have explicit support for such a processing pattern as an execution service, but we already have early adopters using Spring Batch infrastructure components to implement their own, and it is a very good fit overall. The plan is to provide a pipeline model as soon as possible out of the box, and get as much input as possible from the community in the meantime.

    Have you read Matt Welsh's thesis on Staged Event Driven Architectures (google SEDA)? There is also a lot of overlap with the ESB space (e.g. Mule), which takes things more into multi-process territory, and that seems natural for Spring Batch.



    • #3
      I'm currently working on different kinds of parallelism functionality in Prometheus and pipelined parallelism is one of them.

      The goal is to extract all plumbing (threading, blocking, message dispatching) from the process itself so it only contains the 'business logic' (generating messages, transforming messages, consuming messages). This makes the processes a lot easier to use in different contexts, easier to test, and easier to understand.

      Code:
      class Account{
      	int amount;
      }
      
      class ProcessA{
      	
      	void receive(Account account){
      		//do some logic:
      		account.amount+=10;	
      	}
      }
      The same goes for process B and C.

      The processes can be connected in different ways:

      You could let them execute synchronously (so a single thread executes the complete chain).

      Code:
      InputChannel in = ...
      OutputChannel out = ...
      Processor processor = new StandardProcessor(in,new Object[]{processA,processB,processC},out);
      If the processor is run by a single thread, you have the classic single-threaded solution. If the processor is run by multiple threads, complete processing chains run in parallel. But you can also decide to create a 'real' pipeline: each step runs asynchronously, so it doesn't depend on the completion of the next step in the chain.

      Example of a pipeline:

      Code:
      InputChannel processA_in = ...
      OutputChannel processA_out = ...
      InputChannel processB_in = ...
      ....
      
      Processor processorA = new StandardProcessor(processA_in, processA, processA_out);
      Processor processorB = new StandardProcessor(processB_in, processB, processB_out);
      Processor processorC = new StandardProcessor(processC_in, processC, processC_out);
      If you let every processor run on its own thread, you have the 'classic' pipeline. But of course you can also let a single processor be run by multiple threads, which gives you parallelism in a different direction.
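
      The 'classic' pipeline described above can also be sketched without Prometheus, using only java.util.concurrent.BlockingQueue as the channel between stages. This is a minimal illustration under assumptions, not the Prometheus API: QueuePipeline, stage, run and the POISON end-of-stream marker are hypothetical names, and the three processes are stand-in arithmetic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Hypothetical sketch: not Prometheus or Spring Batch API.
final class QueuePipeline {

    static final class Account {
        int amount;
        Account(int amount) { this.amount = amount; }
    }

    // End-of-stream marker, compared by identity; each stage forwards it and exits.
    static final Account POISON = new Account(-1);

    // Starts one thread that applies `process` to every account taken from `in`
    // and hands the account over to `out`.
    static Thread stage(String name, BlockingQueue<Account> in,
                        Consumer<Account> process, BlockingQueue<Account> out) {
        Thread t = new Thread(() -> {
            try {
                while (true) {
                    Account account = in.take();
                    if (account == POISON) { out.put(POISON); return; }
                    process.accept(account); // business logic only, no plumbing
                    out.put(account);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, name);
        t.setDaemon(true);
        t.start();
        return t;
    }

    // Pushes accounts through A -> B -> C, one thread per stage, and collects the results.
    static List<Integer> run(List<Integer> startingAmounts) {
        BlockingQueue<Account> aIn = new LinkedBlockingQueue<>();
        BlockingQueue<Account> bIn = new LinkedBlockingQueue<>();
        BlockingQueue<Account> cIn = new LinkedBlockingQueue<>();
        BlockingQueue<Account> done = new LinkedBlockingQueue<>();

        stage("processA", aIn, a -> a.amount += 10, bIn);
        stage("processB", bIn, a -> a.amount *= 2, cIn);
        stage("processC", cIn, a -> a.amount -= 5, done);

        List<Integer> results = new ArrayList<>();
        try {
            for (int amount : startingAmounts) aIn.put(new Account(amount));
            aIn.put(POISON);
            for (Account a = done.take(); a != POISON; a = done.take()) {
                results.add(a.amount);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return results;
    }

    public static void main(String[] args) {
        List<Integer> amounts = new ArrayList<>();
        amounts.add(100);
        amounts.add(200);
        System.out.println(run(amounts));
    }
}
```

      With one thread per stage and FIFO queues, accounts leave the pipeline in the order they entered, while process A can already work on the second account as process B handles the first.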

      As you can see, the same process can be used in different contexts, and this gives a lot of freedom in finding the correct configuration for a specific situation.

      At the moment my main focus is providing 'low' level concurrency structures and there is no direct support for batches etc (you have to build it on top). Maybe it could be useful in Spring Batch?
      Last edited by Alarmnummer; Jul 9th, 2007, 07:32 AM.



      • #4
        Method sequence

        Instead of defining the sequence in the constructor like this:

        processor = new StandardProcessor(in,new Object[]{processA,processB,processC},out);

        Is there some existing support in Spring that can configure such a sequence in an XML file?



        • #5
          Originally posted by ballsuen
          Instead of defining the sequence in the constructor like this:

          processor = new StandardProcessor(in,new Object[]{processA,processB,processC},out);

          Is there some existing support in Spring that can configure such a sequence in an XML file?
          If you can construct it in Java, you can also construct it in Spring.

          This is part of the configuration of a demo application that uses the functionality. It calculates pi/Fibonacci (to simulate load) on a grid distributed by Terracotta, although I'm not too happy with the Terracotta integration for Spring.

          Code:
          <bean id="fileWritingProcessor"
                class="org.codehaus.prometheus.processors.standardprocessor.StandardProcessor">
              <constructor-arg index="0" ref="fileWritingProcessor-input"/>
              <constructor-arg index="1">
                  <list>
                      <ref bean="resequenceProcess"/>
                      <ref bean="fileWritingProcess"/>
                  </list>
              </constructor-arg>
          </bean>
          You can find more detailed information here:
          http://pveentjer.wordpress.com/2007/...to-the-resque/
          Last edited by Alarmnummer; Jul 9th, 2007, 07:22 AM.



          • #6
            Originally posted by Alarmnummer
            If you can construct it in Java, you can also construct it in Spring

            This is a part of configuration of a demo application that uses the functionality. It calculates pi/Fibonacci (to simulate load) on a grid distributed by Terracotta, although I'm not too happy with the terracotta integration for Spring.
            Hi Alarmnummer,

            I'm interested to know what you're missing in the Terracotta integration for Spring. Are there some use-cases for you that are tricky to set up or that you have difficulties getting to scale out as you want? Please give some more information, so that we can see if there's not something we've overlooked in our Spring support. A better place for this would of course be our support forums: http://forums.terracotta.org. It would be nice if you could start a thread there about this.

            Terracotta is very well suited for distributed task execution, and there's even a demo of that in our download package. Granted, that one doesn't integrate with Spring and doesn't do any of the more complex task scheduling, but it does show how a master-worker pattern can easily be brought to a cluster.

            Anyway, I'd like to get your input about the issues that are troubling you. Real-world user stories like this are very important to continue to drive making Terracotta a better product.

            Thanks
            Last edited by gbevin; Jul 19th, 2007, 02:20 AM.



            • #7
              You *could* create different Jobs altogether, each one with a single step. The "chaining" would then be provided by the availability of data (i.e. JobA only picks up data available for process A, JobB only picks up data available for process B, and so on).
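
              A rough sketch of that idea, assuming each record carries a marker saying which process it is waiting for. Everything here is hypothetical and only illustrates the data-driven chaining: DataDrivenJobs, Stage, Record and SingleStepJob are made-up names rather than Spring Batch classes, and an in-memory list stands in for whatever shared store (e.g. a database table) the jobs would really read from.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: not Spring Batch API.
final class DataDrivenJobs {

    enum Stage { READY_FOR_A, READY_FOR_B, READY_FOR_C, DONE }

    static final class Record {
        final int accountId;
        Stage stage = Stage.READY_FOR_A;
        final List<String> history = new ArrayList<>(); // which jobs touched this record
        Record(int accountId) { this.accountId = accountId; }
    }

    // A job with a single step: it only picks up records waiting at its own stage.
    static final class SingleStepJob {
        final String name;
        final Stage picksUp;
        final Stage handsOver;

        SingleStepJob(String name, Stage picksUp, Stage handsOver) {
            this.name = name;
            this.picksUp = picksUp;
            this.handsOver = handsOver;
        }

        // Processes every record currently available to this job; returns how many it handled.
        int runOnce(List<Record> store) {
            int processed = 0;
            for (Record r : store) {
                if (r.stage == picksUp) {
                    r.history.add(name);  // the real business logic would go here
                    r.stage = handsOver;  // makes the record visible to the next job
                    processed++;
                }
            }
            return processed;
        }
    }

    public static void main(String[] args) {
        List<Record> store = new ArrayList<>();
        store.add(new Record(1));
        store.add(new Record(2));

        SingleStepJob jobA = new SingleStepJob("A", Stage.READY_FOR_A, Stage.READY_FOR_B);
        SingleStepJob jobB = new SingleStepJob("B", Stage.READY_FOR_B, Stage.READY_FOR_C);
        SingleStepJob jobC = new SingleStepJob("C", Stage.READY_FOR_C, Stage.DONE);

        System.out.println("B before A ran: " + jobB.runOnce(store)); // nothing available yet
        jobA.runOnce(store);
        jobB.runOnce(store);
        jobC.runOnce(store);
        for (Record r : store) System.out.println(r.accountId + " " + r.stage + " " + r.history);
    }
}
```

              In this single-threaded demo the jobs are simply called in turn. The point is that no job needs to know about the others: a job that runs too early finds no data and does nothing.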

