Http inbound gateway failing

  • Http inbound gateway failing

    I have an SI project with a flow like this:

    <http:inbound-gateway /> --> service activator (mainprocessor), which parses the XML in the request --> splitter --> router --> aggregator --> responseprocessor endpoint, which pushes the response to a rendezvous queue.

    mainprocessor waits for a message on the rendezvous queue, polls it, and sends it to the gateway reply channel.

    Under load (or after load testing with a few hundred concurrent requests), it fails and returns just the 200 HTTP code. Once the failure starts, the system never recovers.

    We are running this on WAS 7 app server.

    The initial POCs we did for testing SI did not have this issue. The POC was done on the same set of libs.

    Any help is greatly appreciated.

  • #2
    returns just the 200 http code
    HTTP 200 is a success status code. Since you say it starts failing, what exactly fails? Can you share any log evidence with exceptions?

    The initial POCs we did for testing SI did not have this issue. The POC was done on the same set of libs.
    What has changed since your initial POC where things were working fine?


    • #3
      I should've been more specific about the payload that goes into the polled channel. The payload that the HTTP client is expecting is XML plus the 200 HTTP code. When I say it is failing, I mean it fails to return the XML. The only thing the client gets is the HTTP 200 code.

      If I start the server and send 1 request at a time for some time, I have no issues. When I do a load test with 100 concurrent requests, things get messy and I see the above issue.

      In the logs I can see that the webcontainer thread that is polling to get the response actually gets the XML message. It just never makes it out of the container when the failure occurs.


      • #4
        And what has changed since the last POC that worked fine?


        • #5
          We added all our business logic and backend calls. Do keep in mind that I can see the processed result in the logs.

          In our system configuration, I have a web server in front of the WAS server. The web server writes a log entry after it receives a reply from the WAS server. I have verified that the web server log is written after our response is compiled by SI and before the client receives the response.

          I have verified that the issue is not with the web server by sending requests directly to the WAS server.
          I have verified that the thread that received the request has received the response XML in the mainprocessor, which then sends the message to the gateway reply channel.


          • #6
            1. Is your custom code (mainProcessor, responseProcessor) thread-safe?
            2. How are you correlating responses arriving on the rendezvous channel to specific requests?
            3. Why are you even using a rendezvous channel? You should be able to simply have your responseProcessorEndpoint have no output-channel, and the framework will take care of routing its reply directly to the http gateway.
            Last edited by Gary Russell; Jan 30th, 2012, 01:09 PM.


            • #7
              1. Yes, it is thread-safe.
              2. We are not doing any manual correlation. When we did our POC we sent thread-specific request data to see if we would get the same back in the response, and the framework seemed to manage the correlation on its own. We did a lot of benchmarks along with the validation of the thread-specific data, and we had no issues.
              3. We have some default request validation failure replies we have to handle before we do anything else with the request. We also need to send a custom response in case of timeout, so we did not want to do that. Also, during the initial trials we noticed that if we did not have the webcontainer request thread polling for the response, that thread would immediately return a 200, as mentioned in the documentation.
              Last edited by ejkp; Jan 30th, 2012, 01:47 PM.
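The custom-timeout requirement in point 3 can be sketched with plain `java.util.concurrent`, without committing to any Spring Integration API. This is only an illustration under stated assumptions: `TimeoutReplySketch`, `TIMEOUT_REPLY`, and the XML strings are hypothetical names and values, and the reply queue is private to one request rather than shared.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.TimeUnit;

final class TimeoutReplySketch {

    // Hypothetical custom response returned when no reply arrives in time.
    static final String TIMEOUT_REPLY = "<error>timeout</error>";

    /** Waits up to timeoutMillis for a reply; falls back to the custom timeout response. */
    static String awaitReply(BlockingQueue<String> replyQueue, long timeoutMillis) {
        try {
            // poll(timeout, unit) returns null if nothing arrived within the timeout.
            String reply = replyQueue.poll(timeoutMillis, TimeUnit.MILLISECONDS);
            return reply != null ? reply : TIMEOUT_REPLY;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return TIMEOUT_REPLY;
        }
    }

    /** Simulates a downstream flow that answers after delayMillis on its own thread. */
    static String requestWithDelay(long delayMillis, long timeoutMillis) {
        // Assumption: one reply queue per request, never shared between requests.
        BlockingQueue<String> replyQueue = new SynchronousQueue<>();
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(delayMillis);
                // Bounded offer: the worker gives up if the caller has already timed out.
                replyQueue.offer("<result>ok</result>", 1, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.setDaemon(true);
        worker.start();
        return awaitReply(replyQueue, timeoutMillis);
    }
}
```

For example, `TimeoutReplySketch.requestWithDelay(500, 50)` gives up after roughly 50 ms and returns the custom timeout response, while a reply that arrives within the timeout is returned unchanged.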


              • #8
                1, we have not done anything specific to verify that it is threadsafe
                Should be easy to check. Do the classes have any mutable instance variables? If so, are they protected against concurrent use?

                You must be doing some correlation; otherwise, if you are using the same rendezvous channel for all requests, there is no way to guarantee a specific reply belongs to a specific request.

                Or, are you instantiating a new Rendezvous channel per request and passing it downstream in a header?

                I think you need to show us your actual configuration and how this rendezvous stuff is being used. I am sure you can avoid all this complexity.

                But your classes still need to be thread-safe - it is a common mistake where code works fine in single-threaded tests and falls over fast when put under load in a multi-threaded environment such as a web container.
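The instance-variable check described above can be illustrated in plain Java. This is a minimal sketch, not the poster's code: `UnsafeProcessor` and `StatelessProcessor` are hypothetical classes, and the unsafe case replays on a single thread the interleaving that two concurrent requests could produce.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ThreadSafetySketch {

    /** Unsafe: one shared instance keeps per-request state in a mutable field. */
    static final class UnsafeProcessor {
        private String currentRequest; // shared by every thread using this instance

        void accept(String request) { currentRequest = request; }

        String reply() { return "reply-for-" + currentRequest; }
    }

    /** Safe: per-request state lives only in parameters and local variables. */
    static final class StatelessProcessor {
        String process(String request) {
            String parsed = request.trim();
            return "reply-for-" + parsed;
        }
    }

    /** Replays the interleaving of two concurrent requests against the unsafe class. */
    static String interleavedUnsafeReplyForA() {
        UnsafeProcessor p = new UnsafeProcessor();
        p.accept("A");    // request A stores its data
        p.accept("B");    // request B overwrites it before A has replied
        return p.reply(); // A's reply is now built from B's data
    }

    /** Calls one stateless instance from many threads and counts wrong replies. */
    static int mismatchesUnderLoad(int requests) {
        StatelessProcessor p = new StatelessProcessor();
        ExecutorService pool = Executors.newFixedThreadPool(16);
        try {
            List<Future<String>> results = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                final String request = "req-" + i;
                Callable<String> task = () -> p.process(request);
                results.add(pool.submit(task));
            }
            int mismatches = 0;
            for (int i = 0; i < requests; i++) {
                if (!results.get(i).get().equals("reply-for-req-" + i)) {
                    mismatches++;
                }
            }
            return mismatches;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

The unsafe variant passes any test that sends one request at a time, which is why this kind of bug tends to surface only under concurrent load.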


                • #9
                  Everything is thread-safe based on our code. The only time I change data in a mutable object is in the postprocessor (after aggregation). By that time there is only one thread processing the request.

                  The only correlation we have is the correlation we do in the aggregator.

                  I am not passing any rendezvous channel in the header.

                  Unfortunately the configuration file won't be of any help, because most of our message routing happens in the code.

                  Let's say I am wrong about all of the above. When the system has reached a state where it fails 100% of the time and is not processing anything, and I send a single request, I see exactly the same failure as I described before. That does not make sense to me.


                  • #10
                    Well, the way you've described it, I am not surprised you are having difficulty - it should not be necessary for your application code to have any tight coupling to the framework. Your mainprocessor and responseprocessor can (and should) probably be POJOs, with zero dependencies on SI, unless you have some very unusual requirements.
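As an illustration of the POJO suggestion: Spring Integration can invoke a plain method from a service activator and use its return value as the reply payload, so the class itself needs no framework imports. The sketch below is hypothetical (the class name, method name, and XML element names are assumptions, not the poster's code).

```java
import java.util.List;

/** Hypothetical response processor with no dependency on Spring Integration. */
final class ResponseProcessor {

    /**
     * Builds the reply XML from the aggregated parts. The method has no
     * instance state, so concurrent calls cannot interfere with each other.
     * Note: parts are appended as-is; no XML escaping is done in this sketch.
     */
    String buildResponse(List<String> aggregatedParts) {
        StringBuilder xml = new StringBuilder("<response>");
        for (String part : aggregatedParts) {
            xml.append("<part>").append(part).append("</part>");
        }
        return xml.append("</response>").toString();
    }
}
```

A class like this can be unit-tested without any messaging infrastructure, which also makes thread-safety problems easier to isolate.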

                    If you only have a single RV channel, it just can't work unless you only have one request at a time. You must have just been "lucky" in your PoC. Rendezvous channels are not designed for the stuff you are doing.
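The single-channel problem can be reproduced with a plain `SynchronousQueue`, which is used here only as a stand-in for a rendezvous channel. In this sketch (all names are hypothetical) the downstream flow finishes request B before request A. With one shared queue, A's thread takes whatever arrives first. With one reply queue per request, carried along with the request, each thread can only receive its own reply.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.SynchronousQueue;

final class ReplyCorrelationSketch {

    /** A body that may block on a queue operation. */
    interface Blocking { void run() throws InterruptedException; }

    /** A request that carries its own reply queue (stands in for a reply header). */
    static final class Request {
        final String id;
        final BlockingQueue<String> replyQueue = new SynchronousQueue<>();
        Request(String id) { this.id = id; }
    }

    static void startDaemon(Blocking body) {
        Thread t = new Thread(() -> {
            try {
                body.run();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        t.setDaemon(true);
        t.start();
    }

    static String take(BlockingQueue<String> queue) {
        try {
            return queue.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** One shared queue: request A's thread receives the first reply, whoever it is for. */
    static String sharedQueueReplySeenByA() {
        BlockingQueue<String> shared = new SynchronousQueue<>();
        startDaemon(() -> {
            shared.put("reply-for-B"); // B happens to finish first
            shared.put("reply-for-A");
        });
        String seenByA = take(shared); // A's thread takes the first hand-off
        take(shared);                  // drain the second reply so the flow thread ends
        return seenByA;
    }

    /** One queue per request: the same completion order cannot cross replies. */
    static String perRequestReplySeenByA() {
        Request a = new Request("A");
        Request b = new Request("B");
        startDaemon(() -> {
            b.replyQueue.put("reply-for-" + b.id); // B still finishes first
            a.replyQueue.put("reply-for-" + a.id);
        });
        startDaemon(() -> b.replyQueue.take());    // B's own thread waits on B's queue
        return take(a.replyQueue);                 // A's thread waits only on A's queue
    }
}
```

Sending one request at a time never exposes the shared-queue case, because there is only ever one reply in flight.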

                    If you can't provide more details (config and code), it is unlikely we can help further.