
  • Spring Integration and MDB

    I'm just taking an initial look at the Spring Integration project. I really love what I see. This could shave weeks off of our development time.

    I'm working on an MDB-based messaging application and I need functionality like splitting/aggregating. I was wondering if I could use SI in an MDB context and reuse the classes and interfaces built for splitting/aggregation.

    I was wondering if anyone had any opinions on this.

    It might be as simple as loading up the application context on the MDB's initialization, fetching the aggregating/splitting bean in onMessage and passing it the message. Or maybe this doesn't work. Or maybe there is a way of doing this that is even simpler.

    Any and all thoughts and comments would be appreciated.

  • #2
    You don't really need to use MDBs since the Spring Framework provides a Message-Driven POJO alternative. Spring Integration actually builds on top of that support in the core to provide a message-driven-channel-adapter:
    <jms:message-driven-channel-adapter destination="someDestination" channel="someChannel"/>
    That way you will have JMS Messages at the entry point, but then you can follow that with splitters, aggregators, or any of the other Spring Integration components.

    Hope that helps.


    • #3

      Yes, that does help. Thank you. The reason we are looking at MDBs is that we are considering deploying to a Glassfish cluster, and if we can wrap this aggregation logic inside EJBs, the usual old suspects of management and monitoring tools become available to us.

      But actually as I'm thinking things through, I realize that an MDB is stateless and therefore inappropriate for aggregation. For aggregation, maybe I can have my MDB look up and pass on the message to a singleton Stateful session bean that would wrap the SI aggregation logic. Ugh.

      My goal here is to deploy SI in a Glassfish cluster to get some of the failover and high availability benefits that a cluster would provide. Many on the forums have said that SI is really mostly for messaging within a JVM, but the implementations of EIP are just so useful and intuitive that I'd hate to go to something like Mule or Camel just for ease of managing the container.


      • #4
        In my opinion, it would be unfortunate to add complexity to the application (especially the stateful EJB) if it does not need that functionality otherwise.

        As far as failover and load-balancing, you do get that for free with JMS as long as you have multiple consumers sharing JMS Destinations. For management of a cluster, there are other lightweight options as well that work great with stateless Spring applications, such as tc Server:


        • #5

          I agree on the SSB. It's a kludge.

          As far as failover in JMS for free, it does seem that competing consumers give you as much horizontal scalability as your broker can handle.

          But take the example of an aggregator. Let's say the messages it is aggregating are notifications that the items of an order have shipped. When all the messages for the items of that order have come through the aggregator, it fires off a message to a JMS destination to charge the account. It seems to me that one would need to ensure that there can be only one instance of that aggregator across an enterprise cluster, or else one runs the risk of firing off two messages and double charging the account. But, of course, a single listener on a single node runs the risk of node failure.

          Are these assumptions faulty? Is that bad design? Is there no point to talking about availability once we've assumed that a service can only reside on a single node at once? Should even aggregators be designed for the n+1 case? How? The aggregator just seems to me like a thorny and yet critical EIP that causes a lot of problems.


          • #6
            The Aggregator is indeed an interesting case, because it is maintaining state.

            Are the inbound JMS Messages actually that fine-grained (i.e. representing individual line-items of a given order)? If each Message corresponded to a single order, then you could still split into line-items, process those, and then aggregate the results back into a single order confirmation - and you could do all of that within the scope of a JMS transaction so that no other node would receive a Message from the same order unless this node were to fail and rollback the JMS Message.
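
            A minimal sketch of that split/process/aggregate flow in plain Java, assuming one inbound message carries a whole order. It uses no JMS or Spring Integration APIs; `Order`, `OrderFlow`, and the `split`, `process`, `aggregate`, and `handle` methods are hypothetical names for illustration only. The point is that the whole flow is one unit of work: if any line item fails, the exception propagates out of `handle`, which in a transacted JMS listener would roll the message back for redelivery.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical order message: a single inbound message carries the whole order. */
final class Order {
    final String id;
    final List<String> lineItems;

    Order(String id, List<String> lineItems) {
        this.id = id;
        this.lineItems = lineItems;
    }
}

/** Hypothetical handler: split, process, and aggregate inside one unit of work. */
class OrderFlow {
    /** Splitter: one order becomes its individual line items. */
    static List<String> split(Order order) {
        return new ArrayList<>(order.lineItems);
    }

    /** Per-item processing; throws on bad input so the whole unit of work fails. */
    static String process(String lineItem) {
        if (lineItem.isEmpty()) {
            throw new IllegalArgumentException("empty line item");
        }
        return lineItem + ":OK";
    }

    /** Aggregator: the per-item results become a single order confirmation. */
    static String aggregate(String orderId, List<String> results) {
        return "order " + orderId + " confirmed " + results;
    }

    /** Entry point a transacted listener would call once per inbound message. */
    static String handle(Order order) {
        List<String> results = new ArrayList<>();
        for (String item : split(order)) {
            results.add(process(item));
        }
        return aggregate(order.id, results);
    }

    public static void main(String[] args) {
        // prints "order 42 confirmed [widget:OK, gadget:OK]"
        System.out.println(handle(new Order("42", Arrays.asList("widget", "gadget"))));
    }
}
```

            Because the aggregation state lives only for the duration of one `handle` call, no state has to be shared between nodes in this variant.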


            • #7

              Thank you for the suggestion. It is an interesting idea. If I am understanding you correctly, a node listens on a queue for a message to process the order items, processes them all and sends a confirmation to another destination, all within the same JMS transaction, thereby ensuring that no other node processes the line items on that order unless the node fails.

              For our particular application, the problem is this (and I think this is a common use case for split/aggregate): let's say each line item had to be processed by a different vendor, so that one node could not process all the line items. All those vendors know what JMS destination to post their responses to, the idea being that the aggregator listens on that destination. Which leaves me where I started: figuring out how to ensure reliability for that aggregator, given that having more than one such aggregator would be disastrous.


              • #8
                I see. Ultimately, it is the *state* of the Aggregator that needs to be available across the cluster, right? If it were, then any node's Aggregator instance could contribute to the unit-of-work. This is something we will definitely keep in mind when designing the MessageStore strategy in Spring Integration 2.0. See: (and feel free to add comments).

                If all of the state for Aggregator were managed by the MessageStore, and we were to provide a cluster-aware MessageStore implementation, then do you think that would solve your problems? This is a very important and interesting use-case, so I hope you are able to provide feedback while we implement that in 2.0.

                By the way, have you considered something like Terracotta for this role?



                • #9
                  Yes, Mark. You hit the nail right on the head. It is the state of the aggregator that needs to be available across the cluster. And yes, if the MessageStore managed all of the state of the aggregator and if the message store were cluster-aware, that would solve this problem. It would go a long way towards making the Aggregator pattern scalable. I have to say that, of all the EIPs, the aggregator is one of the trickiest to get right if you have to code your own, let alone if you have to code a cluster-aware version. If SI does this and does it in a scalable way, I think it will be a major cause for adoption of SI in message-driven architectures.

                  I at least don't know how aggregation can be successfully done without synchronization at some level of the application infrastructure. Take the simplest example of an aggregator that has to count to 10 before it aggregates the last 10 messages. If multiple aggregators were reading and then writing the counter, then they could read the same value and write the same value unless there was some kind of lock on the counter.
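
                  A minimal sketch of that count-to-10 example in plain Java, within a single JVM. The `CountingAggregator` and `CountingAggregatorDemo` classes are hypothetical, and the `synchronized` block stands in for whatever lock a cluster would actually need (a database row lock, for instance). The read of the counter, the write, and the release of the batch all happen under the same lock, so two threads can never see the same count.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical count-based aggregator: releases a batch every `batchSize` messages. */
class CountingAggregator {
    private final int batchSize;
    private final List<String> pending = new ArrayList<>();
    private final Object lock = new Object();

    CountingAggregator(int batchSize) {
        this.batchSize = batchSize;
    }

    /** Returns the completed batch if this message filled it, otherwise null. */
    List<String> add(String message) {
        synchronized (lock) { // read-check-write must be atomic
            pending.add(message);
            if (pending.size() < batchSize) {
                return null;
            }
            List<String> batch = new ArrayList<>(pending);
            pending.clear();
            return batch;
        }
    }
}

class CountingAggregatorDemo {
    /** Feeds the aggregator from several threads and counts the released batches. */
    static int run(int threads, int messagesPerThread, int batchSize) {
        CountingAggregator aggregator = new CountingAggregator(batchSize);
        AtomicInteger batches = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int id = t;
            pool.submit(() -> {
                for (int i = 0; i < messagesPerThread; i++) {
                    if (aggregator.add("msg-" + id + "-" + i) != null) {
                        batches.incrementAndGet();
                    }
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return batches.get();
    }

    public static void main(String[] args) {
        // 5 threads x 20 messages = 100 messages, so exactly 10 batches of 10
        System.out.println(run(5, 20, 10));
    }
}
```

                  Without the lock, two threads could both observe nine pending messages and neither (or both) would release the batch, which is the lost-update problem described above.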

                  Most of the other components of EIP are easily horizontally scalable, as you pointed out. Competing workers, splitters, service locators and so forth. Even if aggregation imposes some kind of scalability burden (by requiring synchronization somewhere), I would rather for the sake of ease of deployment push that off into some other layer outside of Spring Integration. In other words, if aggregation is going to be a problem, rather than deploying just one SI without failover, I would rather deploy 5 SIs and let a database handle row locking on the aggregator counter. If aggregation is going to be a bottleneck, so be it. Better off that way than hamstringing the entire deployment.

                  I'll take a look at Terracotta. I've only read articles in passing, so I'm not immediately sure of what it provides. I suppose if I can create a distributed and consistent hash map, I can roll my own logic to store the list of messages to be aggregated, keyed by their correlation id. But then I have to consider timeouts. I might have to start from scratch with the EIP book at my side. Maybe taking a peek into the SI source code will help.
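
                  A rough sketch of what that roll-your-own logic might look like, in plain Java. An in-memory `ConcurrentHashMap` stands in for the distributed, consistent map; this is not Terracotta's or Spring Integration's API, and `CorrelatingStore`, `add`, and `expire` are hypothetical names. Groups are keyed by correlation id, released once they reach an assumed fixed size, and dropped when they outlive a timeout. Time is passed in explicitly to keep the example deterministic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical store: groups messages by correlation id, with completion and timeout. */
class CorrelatingStore {
    private static final class Group {
        final long createdAtMillis;
        final List<String> messages = new ArrayList<>();

        Group(long createdAtMillis) {
            this.createdAtMillis = createdAtMillis;
        }
    }

    // Stand-in for a distributed map; only safe within one JVM as written.
    private final Map<String, Group> groups = new ConcurrentHashMap<>();
    private final int expectedSize;
    private final long timeoutMillis;

    CorrelatingStore(int expectedSize, long timeoutMillis) {
        this.expectedSize = expectedSize;
        this.timeoutMillis = timeoutMillis;
    }

    /** Adds a message; returns the whole group once it is complete, otherwise null. */
    List<String> add(String correlationId, String message, long nowMillis) {
        List<String> completed = new ArrayList<>();
        // compute() runs atomically per key, so a group is released exactly once.
        groups.compute(correlationId, (id, group) -> {
            if (group == null) {
                group = new Group(nowMillis);
            }
            group.messages.add(message);
            if (group.messages.size() < expectedSize) {
                return group;
            }
            completed.addAll(group.messages);
            return null; // returning null removes the entry
        });
        return completed.isEmpty() ? null : completed;
    }

    /** Removes groups older than the timeout and returns their correlation ids. */
    List<String> expire(long nowMillis) {
        List<String> expired = new ArrayList<>();
        groups.forEach((id, group) -> {
            boolean tooOld = nowMillis - group.createdAtMillis >= timeoutMillis;
            if (tooOld && groups.remove(id, group)) {
                expired.add(id);
            }
        });
        return expired;
    }

    public static void main(String[] args) {
        CorrelatingStore store = new CorrelatingStore(3, 1000);
        store.add("order-1", "a", 0);
        store.add("order-2", "x", 0);
        store.add("order-1", "b", 10);
        System.out.println(store.add("order-1", "c", 20)); // prints [a, b, c]
        System.out.println(store.expire(1000));            // prints [order-2]
    }
}
```

                  What to do with an expired group (discard it, or release it as a partial aggregate) is a policy decision this sketch leaves open.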