  • Limit on MAP_INPUT_BYTES when using CascadingTasklet

    Hi,

    I am currently working on a Spring Batch job that runs a Cascading job as one of its steps. To run the Cascading job, I have used the CascadingTasklet class.
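
    To give a bit of context, the wiring looks roughly like the sketch below. This is simplified rather than my exact configuration: the step name and variable names are made up, and I am assuming the tasklet hands over the Cascading Flow through a setter along these lines.

    Code:
    import cascading.flow.Flow;
    import org.springframework.batch.core.repository.JobRepository;
    import org.springframework.batch.core.step.tasklet.TaskletStep;
    import org.springframework.data.hadoop.cascading.CascadingTasklet;
    import org.springframework.transaction.PlatformTransactionManager;

    // Simplified sketch: wrap the Cascading Flow in a CascadingTasklet
    // and run it as a plain tasklet step.
    public class CascadingStepFactory {

        public static TaskletStep cascadingStep(Flow flow, JobRepository repository,
                PlatformTransactionManager txManager) throws Exception {
            CascadingTasklet tasklet = new CascadingTasklet();
            tasklet.setUnitOfWork(flow); // assumed setter name for the unit of work

            TaskletStep step = new TaskletStep("runCascadingJob");
            step.setTasklet(tasklet);
            step.setJobRepository(repository);
            step.setTransactionManager(txManager);
            return step;
        }
    }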

    The problem I am facing is as follows: while my Cascading job completes successfully, the step in my Spring Batch job that calls the Cascading job fails with an IllegalArgumentException.

    The message displayed is:

    Code:
    4148563648 cannot be cast to int without changing its value.
    I took a look at the source code for CascadingTasklet on GitHub and saw that if the value of the MAP_INPUT_BYTES counter exceeds the maximum value of an int, the above error is thrown.
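
    In other words, the Hadoop counter is a long, and the tasklet narrows it to an int with a checked cast. A paraphrased sketch of the pattern (from memory, not the exact source):

    Code:
    // Paraphrased sketch of the check, not the exact source. Hadoop
    // counters are longs; narrowing one to an int is guarded by a test
    // that throws once the value no longer fits.
    public class CheckedCastDemo {
        public static void main(String[] args) {
            long mapInputBytes = 4148563648L; // the value from my job's counters

            if ((int) mapInputBytes != mapInputBytes) {
                // Produces exactly the message I am seeing.
                throw new IllegalArgumentException(mapInputBytes
                        + " cannot be cast to int without changing its value.");
            }
        }
    }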

    Now, the maximum value of an int (Integer.MAX_VALUE) is 2,147,483,647, which comes to roughly 2 GB when counting bytes. My input data is currently around 3 GB, with the potential to grow. So while the Cascading job completes, my Spring Batch job fails.

    Currently, I am using a workaround in which I extend CascadingTasklet. My ExtendedCascadingTasklet class has exactly the same code as CascadingTasklet, except that instead of throwing an IllegalArgumentException it just logs a warning when MAP_INPUT_BYTES exceeds the limit.
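
    The changed fragment looks roughly like the following. This is a simplified sketch: the helper name is made up, and everything else in my class is copied verbatim from CascadingTasklet.

    Code:
    import org.apache.commons.logging.Log;
    import org.apache.commons.logging.LogFactory;

    // Simplified sketch of my change; the helper name is made up and the
    // rest of the class is copied verbatim from CascadingTasklet.
    public class ExtendedCascadingTasklet {

        private static final Log log = LogFactory.getLog(ExtendedCascadingTasklet.class);

        // Where CascadingTasklet throws an IllegalArgumentException, I
        // only log a warning; the value is then truncated by the cast.
        static int safeIntValue(long mapInputBytes) {
            if ((int) mapInputBytes != mapInputBytes) {
                log.warn("MAP_INPUT_BYTES = " + mapInputBytes
                        + " exceeds Integer.MAX_VALUE; the reported count is truncated");
            }
            return (int) mapInputBytes;
        }
    }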

    I was wondering whether there is a better, more elegant workaround that I could use. I am currently using spring-data-hadoop 1.0.0.RELEASE.

    Thanks!