How To: Read Once and Write 3 times in parallel

  • How To: Read Once and Write 3 times in parallel

    Hello guys

    I have the following situation, working with these tables for a presentation.

    Code:
    Customer 
    id      firstname lastname
    000001  manuel01  jordan01
    ....
    000500  manuel500  jordan500
    and

    Code:
    Currency
    id    description
    1     sol  
    2     dolar  
    3     euro
    I want to generate, in batch, three CustomerAccount records for each Customer.
    I mean:

    Code:
    CustomerAccount
    id           ammount
    000001-1     500
    000001-2     500
    000001-3     500
    ....
    000500-1     500
    000500-2     500
    000500-3     500
    That is a total of 1500 records for the CustomerAccount table.

    BTW
    idCustomerAccount = idCustomer - idCurrency

    In practice there are three independent groups of 500 (one per currency), for a total of 1500.
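    The id rule above can be sketched in plain Java. This is only an illustration of the 500 x 3 combination: the class and method names are hypothetical, and the zero-padded six-digit customer ids are assumed from the sample table.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: builds every CustomerAccount id as idCustomer-idCurrency.
class AccountIdSketch {

    // Zero-padded customer ids 000001..count, mirroring the sample Customer table.
    static List<String> customerIds(int count) {
        List<String> ids = new ArrayList<>();
        for (int i = 1; i <= count; i++) {
            ids.add(String.format("%06d", i));
        }
        return ids;
    }

    // Pairs each customer id with each currency id, customers in the outer loop.
    static List<String> accountIds(List<String> customerIds, List<Integer> currencyIds) {
        List<String> ids = new ArrayList<>();
        for (String customerId : customerIds) {
            for (int currencyId : currencyIds) {
                ids.add(customerId + "-" + currencyId);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        List<String> ids = accountIds(customerIds(500), List.of(1, 2, 3));
        System.out.println(ids.size() + " ids, from " + ids.get(0) + " to " + ids.get(ids.size() - 1));
    }
}
```

    The customers are iterated exactly once, and each one yields its three ids in the same pass.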

    Therefore

    I need to read the 500 Customers just once and write the three possible combinations in parallel:

    000500-1 500
    000500-2 500
    000500-3 500

    I was thinking of doing it in parallel, but the writer is mandatory while the processor is optional. I mean:

    Code:
    <job id="">
    	<step id="alfa"
        	  read="500 customers"
          	  write <---------------- mandatory, but I don't need it
          	  next="parallel"
    	/>
    	<split id="parallel" next="...">
    		<flow>	
    			<step id="A1"
    			      reader="" <---  mandatory, but I don't need it; I need to work with each Customer retrieved in "alfa"   
    			      processor="" <-- I need it to create the CustomerAccount related to Customer and Currency(1) Sol
    			      writer=""<--- I need it to write the CustomerAccount
    			/>			
    		</flow>
    		<flow>	
    			<step id="A2"
    			      reader="" <---  mandatory, but I don't need it; I need to work with each Customer retrieved in "alfa"   
    			      processor="" <-- I need it to create the CustomerAccount related to Customer and Currency(2) Dolar
    			      writer=""<--- I need it to write the CustomerAccount
    			/>
    		</flow>	
    		<flow>	
    			<step id="A3"
    			      reader="" <---  mandatory, but I don't need it; I need to work with each Customer retrieved in "alfa"   
    			      processor="" <-- I need it to create the CustomerAccount related to Customer and Currency(3) Euro
    			      writer=""<--- I need it to write the CustomerAccount
    			/>
    		</flow>
    	
    	</split>
    	....
    </job>
    In practice, A1, A2 and A3 together produce the 1500 rows for CustomerAccount (500 per A#). I hope you see my point.

    Code:
    000001-1 to 000500-1  Sol
    000001-2 to 000500-2  Dolar
    000001-3 to 000500-3  Euro
    I haven't tried it yet, because the read and write attributes are mandatory and I don't know how to pass the
    Customer from step alfa to each A#.


    How can I handle this situation with Spring Batch?
    Is it possible?

    Thanks in advance
    Last edited by dr_pompeii; Oct 29th, 2012, 03:02 PM.

  • #2
    What you are describing is called pipelining and is not supported in Spring Batch currently. Within Spring Batch, each step is responsible for obtaining its own input which is why the reader is required.

    It sounds to me like you can do all of the above in a single multithreaded step (not knowing anything about your processing to confirm) using the reader from alpha, some form of composite of the processors in A1, A2 and A3 and the writer that you are using across A1, A2 and A3.
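    One way to picture "some form of composite of the processors" is a fan-out: each per-currency processor is applied to the same Customer and the results are collected. The sketch below is plain Java with hypothetical names (`forCurrency`, `fanOut`) and a fixed amount of 500 taken from the sample data. It is not Spring Batch's own CompositeItemProcessor, which chains its delegates so that each one receives the previous one's output.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch of a fan-out composite: one input, one output per delegate.
class FanOutProcessorSketch {
    record Customer(String id) {}
    record CustomerAccount(String id, int amount) {}

    // Stand-in for the processors of A1, A2 and A3: one function per currency id.
    static Function<Customer, CustomerAccount> forCurrency(int currencyId) {
        return customer -> new CustomerAccount(customer.id() + "-" + currencyId, 500);
    }

    // Applies every delegate to the same customer and collects the results in order.
    static Function<Customer, List<CustomerAccount>> fanOut(
            List<Function<Customer, CustomerAccount>> delegates) {
        return customer -> {
            List<CustomerAccount> accounts = new ArrayList<>();
            for (Function<Customer, CustomerAccount> delegate : delegates) {
                accounts.add(delegate.apply(customer));
            }
            return accounts;
        };
    }

    public static void main(String[] args) {
        Function<Customer, List<CustomerAccount>> processor =
                fanOut(List.of(forCurrency(1), forCurrency(2), forCurrency(3)));
        System.out.println(processor.apply(new Customer("000001")));
    }
}
```

    In a real step, the same logic would presumably live in an ItemProcessor whose output type is a list of CustomerAccount; that wiring is an assumption and is not shown here.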



    • #3
      Hello Michael

      Thanks for the reply

      What you are describing is called pipelining and is not supported in Spring Batch currently.
      I see.

      Obvious question: are there any plans to implement it in the future?

      Within Spring Batch, each step is responsible for obtaining its own input which is why the reader is required.
      I understand, and it makes sense, but my original case also makes sense: it is about not reading the data more than once unnecessarily.

      It sounds to me like you can do all of the above in a single multithreaded step (not knowing anything about your processing to confirm) using the reader from alpha, some form of composite of the processors in A1, A2 and A3 and the writer that you are using across A1, A2 and A3.
      I have considered a composite of the processors too.

      But my problem was how to write the different results produced for each item (the composite processor approach works like a chain for each item) when the writer is only called at the end of each step. That is why I thought of a list.

      Therefore my solution was:
      How To: Write a List of Items with a ItemWriter

      Kind Regards



      • #4
        With regards to how to write each different result of the composite item, would each processor return a different type that needs to be handled differently? I thought they all generated the same type of output. If they have different types of output, you still could do it in a single step with a ClassifierCompositeItemWriter. That would allow you to choose what writer to use for each of the items generated.
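        The routing idea behind a classifier-based composite writer can be sketched in plain Java: a classifier picks a writer for each item, items are grouped per writer, and each writer is called once with its group. This is only an analogy and not the ClassifierCompositeItemWriter API; `RejectedCustomer` is a hypothetical second output type invented for the example, since the thread only has CustomerAccount.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical sketch: route each item of a chunk to the writer chosen for it.
class ClassifierRoutingSketch {
    record CustomerAccount(String id, int amount) {}
    record RejectedCustomer(String id) {} // hypothetical second output type

    // Groups items by the writer the classifier returns, then calls each writer once.
    static void write(List<?> items, Function<Object, Consumer<List<Object>>> classifier) {
        Map<Consumer<List<Object>>, List<Object>> groups = new LinkedHashMap<>();
        for (Object item : items) {
            groups.computeIfAbsent(classifier.apply(item), writer -> new ArrayList<>()).add(item);
        }
        groups.forEach((writer, group) -> writer.accept(group));
    }

    public static void main(String[] args) {
        List<Object> accountsWritten = new ArrayList<>();
        List<Object> rejectsWritten = new ArrayList<>();
        Consumer<List<Object>> accountWriter = accountsWritten::addAll;
        Consumer<List<Object>> rejectWriter = rejectsWritten::addAll;

        write(List.of(new CustomerAccount("000001-1", 500),
                      new RejectedCustomer("000002"),
                      new CustomerAccount("000001-2", 500)),
              item -> item instanceof CustomerAccount ? accountWriter : rejectWriter);

        System.out.println(accountsWritten.size() + " accounts, " + rejectsWritten.size() + " rejects");
    }
}
```

        The classifier must return the same writer instance for items of the same kind, because the grouping relies on writer identity.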



        • #5
          Thanks for the reply

          would each processor return a different type that needs to be handled differently?
          I haven't worked with the composite item approach yet.

          The point is that the processor must create three different instances of the same class (CustomerAccount), and I am saving these 3 objects in a collection.

          I thought they all generated the same type of output
          Yes, the same type CustomerAccount, but three different instances.

          For now I am using jdbcTemplate.batchUpdate for this situation.

          Since I am able to work with ItemWriterAdapter, I don't think I am breaking any rule.

          I will of course consider your approach, but my time is running out.



          • #6
            Originally posted by dr_pompeii
            The point is that the processor must create three different instances of the same Object (CustomerAccount), and I am saving these 3 objects in a collection
            I haven't entirely followed what you're trying to do but if you want to do this, just return a Collection<CustomerAccount> with your three instances in your processor.

            Your writer, therefore, will take a Collection<Collection<CustomerAccount>>

            S.
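            A minimal plain-Java sketch of the writer side described above, assuming the processor hands over one list of accounts per customer: the writer receives a list of lists and flattens it before persisting. The names are hypothetical, and the persistence call itself is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a writer whose items are themselves lists of accounts.
class NestedListWriterSketch {
    record CustomerAccount(String id, int amount) {}

    // Each element of the chunk holds the accounts generated for one customer;
    // the result is the flat list of rows that would be persisted.
    static List<CustomerAccount> flatten(List<? extends List<CustomerAccount>> chunk) {
        List<CustomerAccount> rows = new ArrayList<>();
        for (List<CustomerAccount> accountsOfOneCustomer : chunk) {
            rows.addAll(accountsOfOneCustomer);
        }
        return rows;
    }

    public static void main(String[] args) {
        List<List<CustomerAccount>> chunk = List.of(
                List.of(new CustomerAccount("000001-1", 500),
                        new CustomerAccount("000001-2", 500),
                        new CustomerAccount("000001-3", 500)),
                List.of(new CustomerAccount("000002-1", 500),
                        new CustomerAccount("000002-2", 500),
                        new CustomerAccount("000002-3", 500)));
        System.out.println(flatten(chunk).size() + " rows from " + chunk.size() + " items");
    }
}
```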



            • #7
              Hello snicoll

              I haven't entirely followed what you're trying to do but if you want to do this, just return a Collection<CustomerAccount> with your three instances in your processor.

              Your writer, therefore, will take a Collection<Collection<CustomerAccount>>
              Yes, I have already done that; read here:
              How To: Write a List of Items with a ItemWriter

              Seems I was right.

              Let me know your thoughts



              • #8
                Originally posted by dr_pompeii
                Yes, I have already done that; read here:
                How To: Write a List of Items with a ItemWriter

                Seems I was right.

                Let me know your thoughts
                Well, Michael is right, your write count will not match the exact number of items you have written. If you read 10 items and write 30 customer accounts, you have to consider your list of 3 accounts as one item. If you're fine with that, then I don't see what the problem will be.

                That being said, I would try to materialize this list of 3 customer accounts as a first class object so that it's clear that when you wrote 10 items, you wrote 10 items of that type and not "30 customer accounts".

                HTH,
                S.
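                The count mismatch described above can be illustrated with a small plain-Java sketch. It assumes a counter that counts each element handed to the writer as one item, which is how the write count is described in this post; the names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: items handed to the writer versus rows actually inserted.
class WriteCountSketch {

    // What a per-item counter reports: one per list handed to the writer.
    static int itemsWritten(List<? extends List<String>> chunk) {
        return chunk.size();
    }

    // What reaches the table: one row per account id inside those lists.
    static int rowsWritten(List<? extends List<String>> chunk) {
        int rows = 0;
        for (List<String> accountIds : chunk) {
            rows += accountIds.size();
        }
        return rows;
    }

    // Builds one list of three account ids per customer, as in the thread.
    static List<List<String>> sampleChunk(int customers) {
        List<List<String>> chunk = new ArrayList<>();
        for (int i = 1; i <= customers; i++) {
            String customerId = String.format("%06d", i);
            chunk.add(List.of(customerId + "-1", customerId + "-2", customerId + "-3"));
        }
        return chunk;
    }

    public static void main(String[] args) {
        List<List<String>> chunk = sampleChunk(10);
        System.out.println("items=" + itemsWritten(chunk) + " rows=" + rowsWritten(chunk));
    }
}
```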



                • #9
                  Hi snicoll

                  If you read 10 items and write 30 customer accounts, you have to consider your list of 3 accounts as one item. If you're fine with that, then I don't see what the problem will be.
                  Yes, I am fine with this.

                  That being said, I would try to materialize this list of 3 customer accounts as a first class object so that it's clear that when you wrote 10 items, you wrote 10 items of that type and not "30 customer accounts".
                  Could you explain the bold part?

                  Remember, these three objects are of the same type, but they are three different instances.

                  Thank You



                  • #10
                    Originally posted by dr_pompeii
                    Hi snicoll


                    Yes, I am fine with this.


                    Could you explain the bold part?

                    Remember, these three objects are of the same type, but they are three different instances.

                    Thank You
                    I don't know your use case but something like:

                    Code:
                    public class MyConcept {
                    
                      private CustomerAccount firstAccount;
                      private CustomerAccount secondAccount;
                      private CustomerAccount thirdAccount;
                    
                      ...
                    }
                    And you write a "MyConcept" and not a list of 3 accounts. That way your metadata are correct. But if you don't care, well, it's up to you.



                    • #11
                      Hello snicoll

                      Thanks again for the reply

                      Code:
                      public class MyConcept {
                      
                        private CustomerAccount firstAccount;
                        private CustomerAccount secondAccount;
                        private CustomerAccount thirdAccount;
                      
                        ...
                      }
                      I thought of the same approach before, but it raises an obvious question:

                      How can I save/insert each CustomerAccount in a batch approach?
                      SB's "writer" expects a single object that represents a single record to be inserted.

                      That's why I chose the List<CustomerAccount> approach: it lets me reuse jdbcTemplate.batchUpdate, which expects a list.

                      Moreover, if my customer creates a new Currency (Real, from Brazil), I must update the class or add:

                      Code:
                        private CustomerAccount fourthAccount;
                      whereas a List<CustomerAccount> fits better.

                      And you write a "MyConcept" and not a list of 3 accounts
                      It seems I am missing something obvious about how to "write" with your approach.

                      That way your metadata are correct.
                      Agree
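                      One possible way to reconcile the two positions, offered only as a sketch: the first-class item wraps a List<CustomerAccount> instead of fixed fields, so a new currency needs no new field, and the writer flattens a chunk of such items into one argument list for a single batch insert. `CustomerAccounts`, `bundleFor` and `toBatchArgs` are hypothetical names, and the amount of 500 comes from the sample data. The resulting List<Object[]> has the shape accepted by Spring's JdbcTemplate.batchUpdate(String sql, List<Object[]> batchArgs), but the call itself and the SQL are not shown.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a first-class item that still holds a list of accounts.
class AccountBundleSketch {
    record CustomerAccount(String id, int amount) {}

    // One item per customer; the number of accounts follows the number of currencies.
    record CustomerAccounts(String customerId, List<CustomerAccount> accounts) {}

    // Builds the bundle for one customer from whatever currencies currently exist.
    static CustomerAccounts bundleFor(String customerId, List<Integer> currencyIds) {
        List<CustomerAccount> accounts = new ArrayList<>();
        for (int currencyId : currencyIds) {
            accounts.add(new CustomerAccount(customerId + "-" + currencyId, 500));
        }
        return new CustomerAccounts(customerId, accounts);
    }

    // Flattens a chunk of bundles into (id, amount) rows for one batch insert.
    static List<Object[]> toBatchArgs(List<? extends CustomerAccounts> chunk) {
        List<Object[]> batchArgs = new ArrayList<>();
        for (CustomerAccounts bundle : chunk) {
            for (CustomerAccount account : bundle.accounts()) {
                batchArgs.add(new Object[] {account.id(), account.amount()});
            }
        }
        return batchArgs;
    }

    public static void main(String[] args) {
        List<CustomerAccounts> chunk = List.of(
                bundleFor("000001", List.of(1, 2, 3)),
                bundleFor("000002", List.of(1, 2, 3, 4))); // 4 = a hypothetical new currency
        System.out.println(chunk.size() + " items, " + toBatchArgs(chunk).size() + " rows");
    }
}
```

                      With this shape the step would count one written item per customer, while the insert still happens in a single batch per chunk.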


                      Thanks for your time

