
  • Limit on MAP_INPUT_BYTES when using CascadingTasklet


    I am currently working on a Spring Batch job that runs a Cascading job as one of its steps. To run the Cascading job, I use the CascadingTasklet class.

    The problem I am facing is as follows: while my Cascading job completes successfully, the step in my Spring Batch job that calls the Cascading job fails with an IllegalArgumentException.

    The message displayed is:

    4148563648 cannot be cast to int without changing its value.
    I took a look at the source code for CascadingTasklet on GitHub and saw that this exception is thrown whenever the value of the MAP_INPUT_BYTES counter exceeds the maximum value of an Integer.

    Now, Integer.MAX_VALUE (2,147,483,647) corresponds to roughly 2 GB when counting bytes. My input data is currently around 3 GB and likely to grow, so while the Cascading job itself completes, my Spring Batch job fails.
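    The arithmetic behind the failure can be shown directly. The counter value from the error message, 4148563648, is larger than Integer.MAX_VALUE, so a plain (int) cast would silently wrap to a negative number; a checked conversion (which is what the tasklet effectively performs) fails instead:

    ```java
    public class OverflowDemo {
        public static void main(String[] args) {
            long mapInputBytes = 4_148_563_648L; // the value from the error message

            System.out.println(Integer.MAX_VALUE);                  // 2147483647
            System.out.println(mapInputBytes > Integer.MAX_VALUE);  // true

            // A plain (int) cast silently wraps around to a negative value:
            System.out.println((int) mapInputBytes);                // -146403648

            // A checked conversion refuses to change the value and throws instead,
            // which is the kind of check that fails the Spring Batch step:
            try {
                Math.toIntExact(mapInputBytes);
            } catch (ArithmeticException e) {
                System.out.println("overflow detected");
            }
        }
    }
    ```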

    Currently, I am using a workaround in which I extend CascadingTasklet. My ExtendedCascadingTasklet class has the same code as CascadingTasklet, except that instead of throwing an IllegalArgumentException it just logs a warning when MAP_INPUT_BYTES exceeds the upper limit.

    I was wondering whether there is a better, more elegant workaround I could use. I am currently using spring-data-hadoop 1.0.0.RELEASE.
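    For reference, the core of the workaround described above can be sketched as a standalone helper. This is only an illustration: the method name toIntLenient is made up, and how the counter value actually reaches this check depends on the CascadingTasklet internals in your spring-data-hadoop version. The idea is simply to log and saturate instead of throwing:

    ```java
    import java.util.logging.Logger;

    // Hypothetical version of the check inside CascadingTasklet: rather than
    // failing the step when MAP_INPUT_BYTES no longer fits in an int, log a
    // warning and report a clamped value.
    public class LenientCounterCast {
        private static final Logger LOG =
                Logger.getLogger(LenientCounterCast.class.getName());

        static int toIntLenient(long counterValue) {
            if (counterValue > Integer.MAX_VALUE || counterValue < Integer.MIN_VALUE) {
                LOG.warning("Counter value " + counterValue
                        + " exceeds Integer range; clamping instead of throwing");
                return counterValue > 0 ? Integer.MAX_VALUE : Integer.MIN_VALUE;
            }
            return (int) counterValue;
        }

        public static void main(String[] args) {
            System.out.println(toIntLenient(4_148_563_648L)); // 2147483647
            System.out.println(toIntLenient(1024L));          // 1024
        }
    }
    ```

    The trade-off is that the reported step statistics become inaccurate above 2 GB, but the step no longer fails for an otherwise successful job.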


  • #2
    I'm having the same issue. Were you able to find a solution to this problem other than extending the CascadingTasklet class? Thanks!


    • #3

      Sorry, but I couldn't find any way around this issue apart from extending CascadingTasklet. It's not the best solution, but it's the only one I could find.

      Hope it helps