A bug in StatefulRetryStepFactory?

  • A bug in StatefulRetryStepFactory?

    After reading the documentation for the "StatefulRetryStepFactoryBean", I was under the impression that by using that step it would retry failed items as long as:

    1) the number of retries did not exceed the "retryLimit" and
    2) the thrown exceptions were an instance of "retryableExceptionClasses"

    However, as long as the retry limit is set to a value greater than 0, it appears this class will always retry. For instance, using the test class "StatefulRetryStepFactoryBeanTests" and modifying the ItemReader to unconditionally throw an Exception causes the test to infinitely loop and never complete.

    Am I misunderstanding how this class is used? Thanks for your help!
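The two conditions listed in this first post can be written down as a small predicate. This is only a sketch of the poster's expectation, not Spring Batch's implementation; the class `RetryExpectationSketch` and the method `shouldRetry` are hypothetical names introduced for illustration.

```java
import java.util.List;

class RetryExpectationSketch {

    /**
     * Hypothetical helper expressing the expected rule: retry only while
     * the retry limit has not been reached AND the thrown exception is an
     * instance of one of the configured retryable classes.
     */
    static boolean shouldRetry(int retriesSoFar,
                               int retryLimit,
                               Throwable thrown,
                               List<Class<? extends Throwable>> retryableExceptionClasses) {
        if (retriesSoFar >= retryLimit) {
            return false; // condition 1: limit exhausted
        }
        for (Class<? extends Throwable> type : retryableExceptionClasses) {
            if (type.isInstance(thrown)) {
                return true; // condition 2: exception is retryable
            }
        }
        return false;
    }
}
```

Under this reading, a writer that always throws should stop being retried once `retriesSoFar` reaches `retryLimit`, which is why the endless loop looked like a bug.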

  • #2
    If you are in an infinite loop, you probably are still using 1.0.0, right? 1.0.1 will exit the loop with an error that tries to explain what probably went wrong. Read the documentation section on ItemKeyGenerator in relation to the stateful retry, and see if it applies to your case.

    You might be the exception, but 9 times out of 10, when these symptoms show up, it turns out that the key generation strategy is faulty and/or the items do not have a stable identity.


    • #3
      Using release 1.0.1, I can still trigger the infinite loop by doing two things in the test class:

      1) Make the writer unconditionally throw an exception.
      2) Comment out the "setSkipLimit()" call.

      In this situation, I would expect the class to stop retrying after 10 iterations (the retry limit). Is that the wrong expectation? Thanks for your help!


      • #4
        What Dave is saying is that there is likely an issue with your items and how they are identified. Have you overridden equals so that an item is guaranteed to be recognized as the same item even if it is regenerated by the reader (which it generally will be in a rollback scenario)? If not, that's probably the issue.


        • #5
          Well, I recreated the infinite loop using your own JUnit test. And unless I'm misunderstanding, I don't think that configuration should act in that way - it's not a strange configuration and one I would expect the framework to handle.

          With my own item reader I'm using the stock HibernateCursorItemReader, but the issue is that I want to handle the Writer possibly throwing an exception.


          • #6
            Right, how you created the loop is irrelevant. The real issue is how your items are identified. The framework uses object comparison to determine whether it is seeing the same item that should be retried. In the case of a write failure, the whole chunk is rolled back and the ItemReader is 'reset'. The items are then recreated. If your domain objects don't override equals, the comparison will fail, because the default Object.equals compares references (effectively the memory location). Unless the reader buffers items itself (the HibernateCursorItemReader doesn't), there's no way around this.
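A minimal sketch of what "overriding equals" could look like for an item read from a database. The class `Trade` and its fields are hypothetical; the assumption is that each item carries a stable key (for example a primary key) that survives being re-read after a rollback.

```java
import java.util.Objects;

/** Hypothetical item type; the name and fields are illustrative only. */
final class Trade {
    private final long id;        // assumed stable key, e.g. a database primary key
    private final String payload; // other data, deliberately not part of identity

    Trade(long id, String payload) {
        this.id = id;
        this.payload = payload;
    }

    long getId() { return id; }
    String getPayload() { return payload; }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof Trade)) {
            return false;
        }
        // Two instances describe the same item when their keys match,
        // even if the reader created them in different transactions.
        return id == ((Trade) other).id;
    }

    @Override
    public int hashCode() {
        // Must agree with equals: equal items produce equal hash codes.
        return Objects.hash(id);
    }
}
```

With this in place, `new Trade(7, "x").equals(new Trade(7, "x"))` is true, so a regenerated copy of a failed item can be matched against the original.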


            • #7
              Originally posted by lucasward
              Right, how you created the loop is irrelevant. The real issue is how your items are identified.
              How I created the loop is irrelevant? This behavior is not a bug?


              • #8
                Sorry, poor wording (I'll blame it on late night posting). The loop is generated because the framework can't figure out which item was the problem. If you have a chunk of items, say 5: {A,B,C,D,E}, and you get an exception while trying to write out C, the whole transaction is rolled back and the ItemReader regenerates the items. Once the retry limit has been exhausted, the framework says 'C can't be retried anymore, so skip it'. It does this by asking 'does the item I'm about to write out equal C?' (i.e. item.equals(C)).

                Now, what happens if those items are regenerated but don't have an appropriate equals? The check 'does the item I'm about to write equal C?' will be false, even though it *should* be true. The framework will try to write the item out anyway, get the same exception it always gets, roll back, regenerate, and continue on, ad infinitum.

                There isn't any way around this: you have to create an appropriate equals for the items you're attempting to write out. We're still looking at ways to buffer more so that a comparison by reference will work, but there are some challenges in doing this generically that we'll have to overcome in future releases.
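The {A,B,C,D,E} scenario can be reproduced with a toy model. This is not the framework's code: `RetryLoopSketch`, `processChunk`, and the `safetyCap` parameter are hypothetical, and the per-item failure count is kept in a plain `HashMap` as a stand-in for whatever the framework uses to recognize an item.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class RetryLoopSketch {

    interface Named { String name(); }

    /** No equals/hashCode: instances are compared by reference. */
    static final class IdentityItem implements Named {
        private final String name;
        IdentityItem(String name) { this.name = name; }
        public String name() { return name; }
    }

    /** equals/hashCode based on the name, so regenerated copies match. */
    static final class ValueItem implements Named {
        private final String name;
        ValueItem(String name) { this.name = name; }
        public String name() { return name; }
        @Override public boolean equals(Object o) {
            return o instanceof ValueItem && ((ValueItem) o).name.equals(name);
        }
        @Override public int hashCode() { return name.hashCode(); }
    }

    /**
     * Returns the attempt on which the chunk finally commits, or -1 if
     * safetyCap attempts pass without a commit (the "infinite loop").
     */
    static int processChunk(Function<String, Named> reader, int retryLimit, int safetyCap) {
        String[] names = {"A", "B", "C", "D", "E"};
        Map<Object, Integer> failures = new HashMap<>();
        for (int attempt = 1; attempt <= safetyCap; attempt++) {
            boolean rolledBack = false;
            for (String n : names) {
                Named item = reader.apply(n); // items are re-created after every rollback
                if (failures.getOrDefault(item, 0) >= retryLimit) {
                    continue; // recognized as exhausted: skip it
                }
                if (item.name().equals("C")) { // writing C always fails
                    failures.merge(item, 1, Integer::sum);
                    rolledBack = true;
                    break;
                }
            }
            if (!rolledBack) {
                return attempt;
            }
        }
        return -1;
    }
}
```

With `retryLimit = 3`, `processChunk(ValueItem::new, 3, 100)` commits on the fourth attempt because the regenerated C is recognized and skipped, while `processChunk(IdentityItem::new, 3, 100)` never commits and returns -1 once the cap is hit.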


                • #9
                  Perhaps it might be useful then to have a totalRetryLimit property such that if:

                  retryLimit = 3 (per item)
                  totalRetryLimit = 5 (per step)

                  then it will stop after 3 retries of a single item it can identify, or 5 retries total for the entire step. This way, you can address the infinite-loop and the fact you may not be able to re-identify the item.
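A rough sketch of how such a two-level limit might behave. `totalRetryLimit` is only the property proposed in this post, and `TwoLevelRetryBudget` and `tryAcquireRetry` are hypothetical names; none of this is an existing Spring Batch API.

```java
import java.util.HashMap;
import java.util.Map;

class TwoLevelRetryBudget {
    private final int retryLimit;      // per identifiable item
    private final int totalRetryLimit; // for the whole step
    private final Map<Object, Integer> usedPerItem = new HashMap<>();
    private int usedTotal;

    TwoLevelRetryBudget(int retryLimit, int totalRetryLimit) {
        this.retryLimit = retryLimit;
        this.totalRetryLimit = totalRetryLimit;
    }

    /** Grants one more retry for the item unless either budget is spent. */
    boolean tryAcquireRetry(Object item) {
        int used = usedPerItem.getOrDefault(item, 0);
        if (used >= retryLimit || usedTotal >= totalRetryLimit) {
            return false;
        }
        usedPerItem.put(item, used + 1);
        usedTotal++;
        return true;
    }
}
```

With limits of 3 and 5, an item that can be re-identified (equal on every attempt) is granted 3 retries; items that look new every time are stopped by the total budget after 5.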


                  • #10
                    That's actually sort of the way it is already implemented (except the total limit is per commit interval, not per step, and it is not configurable right now). So you shouldn't be seeing any infinite loop in 1.0.1 or trunk: it should bomb out after the limit is reached with a message that suggests you check your ItemKeyGenerator and/or equals/hashCode implementations. I think that applies to both skip and retry (although the fixes for each went in at different times).