How bad is it to use a correlation strategy this way

  • How bad is it to use a correlation strategy this way

    Our ERP will send messages to our e-commerce system via JMS, each of which may contain elements for 1 to n products. There are 7 different kinds of messages (add/update, inventory level, categorization, etc.). The ERP side will send them in order, and the e-commerce system MUST process the product information in the order the ERP sends it.

    I have designed a dispatching process very similar to the one presented by Gary Russell and David Turanski at SpringOne2GX in 2011. The biggest addition is that, since messages from the ERP can contain more than one product message, I have introduced a Message Splitter in front of the Dispatcher. The Splitter splits the messages into individual product messages using XPath. We are running a single consumer on a message-driven-channel-adapter because we have to maintain order and multiple consumers would break that. It works beautifully, but it doesn't perform very well: I'm only seeing 40-50 messages processed per second. If I ratchet the consumers up I can easily increase that to 200 messages per second, which starts to get closer to tolerable (note this is all local workstation stuff with JBoss and ActiveMQ), but then the messages get out of sequence.

    So, I need to get some parallel processing going at the splitter level so that I can have lots of little workers breaking up the messages from the ERP. But how do I keep the sequence intact (or restore it) by the time the messages reach the dispatcher? My current thought is to add a header enricher in front of the splitter that 'enhances' each message with 'the next' global sequence number as a header, then pipe that into the splitter with some sort of multithreaded processing. After the splitter I'd add a resequencer with a custom correlation strategy that returns a static correlation id, so that to the resequencer all messages would share the same correlation id. The resequencer would then use only the sequence number to decide when to release messages (I would set release-partial-sequences to true).
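To make the proposal concrete, here is a minimal plain-Java sketch of what a resequencer with a single static correlation id and partial-sequence release amounts to: one shared buffer that releases the next consecutive run of sequence numbers as soon as it is available. `StreamingResequencer` is a hypothetical illustration class, not a Spring Integration component, and the starting sequence number is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical illustration only; not a Spring Integration class.
final class StreamingResequencer<T> {
    // Out-of-order arrivals wait here, keyed by their global sequence number.
    private final TreeMap<Long, T> pending = new TreeMap<>();
    private long nextExpected;

    StreamingResequencer(long firstSequence) {
        this.nextExpected = firstSequence;
    }

    // Buffers the payload, then returns every payload that can now be
    // released in order (possibly none).
    synchronized List<T> offer(long sequence, T payload) {
        pending.put(sequence, payload);
        List<T> released = new ArrayList<>();
        while (!pending.isEmpty() && pending.firstKey() == nextExpected) {
            released.add(pending.pollFirstEntry().getValue());
            nextExpected++;
        }
        return released;
    }

    // Number of messages still held back waiting for a gap to be filled.
    synchronized int pendingCount() {
        return pending.size();
    }

    public static void main(String[] args) {
        StreamingResequencer<String> r = new StreamingResequencer<>(1);
        System.out.println(r.offer(2, "ProductMsg2"));
        System.out.println(r.offer(1, "ProductMsg1"));
    }
}
```

In this sketch, one likely downside of a single never-ending group shows up directly: if a sequence number never arrives, nothing after it is released and `pending` keeps growing.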

    This feels wrong, but I can't think of another way. How bad is this? What negative things could happen if I did this?

    Any other options that you can think of?

    Thanks.

    Edit: On further consideration, I don't think my current idea would even work, since the 'global' sequence number would be applied to the unsplit message.
    Last edited by carbonMike; May 15th, 2012, 01:03 PM.

  • #2
    The splitter automatically assigns sequenceNumber, sequenceSize, and correlationId headers, even if only one message comes out of the splitter. That means a downstream resequencer with the default strategies will simply work.
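A rough plain-Java model of that behaviour, for illustration only: each split part carries a correlation id shared by its siblings, a 1-based sequence number, and the group size, and a resequencer keyed on those three values can restore order within each group. `Part`, `split`, and `resequence` are hypothetical names, not Spring Integration APIs, and this sketch releases a group only once all of its parts have arrived.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SplitHeadersSketch {
    // Hypothetical stand-in for a message carrying the three splitter headers.
    static final class Part {
        final String correlationId;
        final int sequenceNumber;
        final int sequenceSize;
        final String payload;

        Part(String correlationId, int sequenceNumber, int sequenceSize, String payload) {
            this.correlationId = correlationId;
            this.sequenceNumber = sequenceNumber;
            this.sequenceSize = sequenceSize;
            this.payload = payload;
        }
    }

    // Tags every item with the shared correlation id, its 1-based position,
    // and the total number of items in the group.
    static List<Part> split(String correlationId, List<String> items) {
        List<Part> parts = new ArrayList<>();
        for (int i = 0; i < items.size(); i++) {
            parts.add(new Part(correlationId, i + 1, items.size(), items.get(i)));
        }
        return parts;
    }

    // Groups arrivals by correlation id and returns the payloads of every
    // complete group in sequenceNumber order; incomplete groups are held back.
    static Map<String, List<String>> resequence(List<Part> arrivals) {
        Map<String, List<Part>> groups = new LinkedHashMap<>();
        for (Part p : arrivals) {
            groups.computeIfAbsent(p.correlationId, k -> new ArrayList<>()).add(p);
        }
        Map<String, List<String>> released = new LinkedHashMap<>();
        for (Map.Entry<String, List<Part>> e : groups.entrySet()) {
            List<Part> group = e.getValue();
            if (group.size() != group.get(0).sequenceSize) {
                continue; // still waiting for missing parts
            }
            group.sort(Comparator.comparingInt(p -> p.sequenceNumber));
            List<String> payloads = new ArrayList<>();
            for (Part p : group) {
                payloads.add(p.payload);
            }
            released.put(e.getKey(), payloads);
        }
        return released;
    }
}
```

Note that in this model the ordering is only restored within one correlation group, that is, within the parts of one original message.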


    • #3
      In other words...

      assign global sequence (to unsplit message)
      ->splitter
      ->parallel process
      ->minor resequencer
      ->global resequencer
      ->dispatcher


      • #4
        I don't think I did a very good job of explaining the issue I'm having. It's a hard one to explain, but let me try a 'do over'.

        Say the ERP sends the following 3 messages with the messages to be split inside of them:

        Big Message a
        - ProductMsg1
        - ProductMsg2
        - ProductMsg3
        - ProductMsg4

        Big Message b
        - ProductMsg5
        - ProductMsg6
        - ProductMsg7
        - ProductMsg8

        Big Message c
        - ProductMsg9
        - ProductMsg10
        - ProductMsg11
        - ProductMsg12

        I need the messages to arrive at the dispatcher downstream in this sequence: ProductMsg1, ProductMsg2, ProductMsg3, ProductMsg4, ProductMsg5... and so on. When the splitter has only one consumer it works fine but is too slow. So say I want 3 concurrent consumers in order to support the traffic from the ERP. If consumer 2 gets 'Big Message b' and is faster than consumer 1, which gets 'Big Message a', then downstream we might see ProductMsg5, ProductMsg6, ProductMsg7, ProductMsg8, ProductMsg1, ProductMsg2... which would break our data integrity.

        Does that make the issue clearer? I need to figure out a way to get the sequence into the split messages as if they had come in from the ERP already split and in sequence.

        Thanks for taking the time to give it some thought.

        Mike


        • #5
          How about

          Code:
          inbound adapter
          ->assign global sequence
            ->split
               ->multi thread work
                   ->resequencer
                      ->aggregator (builds a List payload)
                         ->global resequencer
                            ->splitter
                               ->single-thread work
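The shape of that flow can be sketched in plain Java, outside Spring Integration, under some simplifying assumptions: the position of each big message in the input list stands in for the global sequence number, each worker returns its split result as one `List` (the aggregated payload), and reading the futures back in submission order plays the role of the global resequencer before the single-threaded stage. `OrderedParallelSplitSketch` and its comma-based `split` are hypothetical stand-ins for the real XPath splitter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class OrderedParallelSplitSketch {
    // Hypothetical stand-in for the expensive XPath split of one ERP message.
    static List<String> split(String bigMessage) {
        return Arrays.asList(bigMessage.split(","));
    }

    // Splits the big messages on a worker pool, then hands the product
    // messages to the caller in the original ERP order.
    static List<String> process(List<String> bigMessages, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            // Submission order is the global sequence: future i belongs to message i.
            List<Future<List<String>>> inFlight = new ArrayList<>();
            for (String big : bigMessages) {
                Callable<List<String>> task = () -> split(big);
                inFlight.add(pool.submit(task));
            }
            // Reading the futures in submission order re-establishes the global
            // order, however fast or slow each individual worker was.
            List<String> ordered = new ArrayList<>();
            for (Future<List<String>> f : inFlight) {
                ordered.addAll(f.get()); // second "split": flatten the list
            }
            return ordered;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a worker", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a worker failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> erp = Arrays.asList(
                "ProductMsg1,ProductMsg2", "ProductMsg3,ProductMsg4", "ProductMsg5,ProductMsg6");
        System.out.println(process(erp, 3));
    }
}
```

The point of aggregating to a list before the global reordering step is that the expensive work runs in parallel, while only one cheap, ordered hand-off per big message remains single-threaded.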


          • #6
            Originally posted by Gary Russell
            How about

            Code:
            inbound adapter
            ->assign global sequence
              ->split
                 ->multi thread work
                     ->resequencer
                        ->aggregator (builds a List payload)
                           ->global resequencer
                              ->splitter
                                 ->single-thread work
            OK. I'll have to ponder that for a bit. Are you thinking this is all configuration based with 'out of the box' SI components?

            Thanks,
            Mike
