MultiResourceItemReader init problems

  • MultiResourceItemReader init problems

    I'm having two problems with MultiResourceItemReader in Batch 1.1.2. Perhaps someone can suggest workarounds.

    Our job is a 3-step job. The first step fetches 0-n files to be processed (saving them as mydata*.csv). The second step processes the files (this is the step that uses MultiResourceItemReader). The third step archives the processed files to an archive directory.

    Problem 1: The MultiResourceItemReader takes a Resource[] as a configured property (value=mydata*.csv). The problem is that the property is set by the Spring property editor before step one has fetched the files (indeed, before the job has launched), so there are no files in the directory. Is there any way to ask MultiResourceItemReader to reinitialize its properties after step one has run, so it can find the files?

    Problem 2: Step three, our file archiving tasklet, needs the list of files processed by the MultiResourceItemReader. However, MRIR does not have a getResources() method. Is there some way to get the list of files, perhaps in the ExecutionContext or some other manner?

    Also, as a suggestion, I think the "blow up if no files found" behavior should be configurable. Sometimes not having files to process is a normal occurrence and shouldn't result in a job failure.

    Thanks for any help!

  • #2
    I think the first problem was fixed and will be in the next point release:

    For the second problem, can't you wire in the same Resource[]?


    • #3 appears to fix the "blow up on no files found", but our problem is even deeper than that: the first step of the job fetches the files from remote locations and drops them in the work directory. So even if MRIR doesn't blow up, it still finds no work (since on context initialization mydata*.csv resolved to an empty Resource[]).

      For the second problem, setting "mydata*.csv" on the different tasklets works OK, but Spring resolves that Resource[] twice. There's a finite (but tiny) chance that the two arrays are not the same (someone or something manually dumping files in the directory as the job starts). I was just hoping to bulletproof the job. We actually ended up creating a third bean with a Resource[] property, setting that to mydata*.csv, and then using a PropertyPathFactoryBean to set the Resource[] into the other two beans.
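
      A minimal sketch of that shared-array wiring, for anyone trying the same thing. The classes com.example.ResourceHolder and com.example.ArchiveFilesTasklet and the flatFileReader bean are hypothetical stand-ins (not from this thread); only PropertyPathFactoryBean and MultiResourceItemReader are real classes. It assumes the holder's getter returns the array it was given.

      ```xml
      <!-- Hypothetical holder bean: the pattern is resolved once, when this bean is created. -->
      <bean id="inputFiles" class="com.example.ResourceHolder">
          <property name="resources" value="mydata*.csv" />
      </bean>

      <!-- Exposes the holder's "resources" property so other beans can reference it. -->
      <bean id="inputResources"
            class="org.springframework.beans.factory.config.PropertyPathFactoryBean">
          <property name="targetBeanName" value="inputFiles" />
          <property name="propertyPath" value="resources" />
      </bean>

      <!-- Both consumers receive the array held by inputFiles instead of resolving the pattern themselves. -->
      <bean id="multiResourceReader"
            class="org.springframework.batch.item.file.MultiResourceItemReader">
          <property name="delegate" ref="flatFileReader" />
          <property name="resources" ref="inputResources" />
      </bean>

      <!-- Hypothetical archiving tasklet with a Resource[] "resources" property. -->
      <bean id="archiveTasklet" class="com.example.ArchiveFilesTasklet">
          <property name="resources" ref="inputResources" />
      </bean>
      ```

      This only guarantees that the reader and the archiver see the same list. It does not change when the pattern is resolved, so the array is still empty if the files do not exist yet at context startup.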


      • #4
        The MRIR only checks for resources on open(), which is only called when the step starts. I don't see any reason why it would blow up sooner, unless Spring Core has an issue with passing through the empty array?

        If the files in your directory change between executions in a restart scenario, you're going to have issues, because the reader expects the set of files matching the pattern to stay the same. I would recommend keeping your upload directories separate from your input directories.


        • #5
          It's not really a restart scenario.

          When the job is created by Spring, there are no files in the working directory. The first Tasklet Step in the job actually fetches the files and downloads them into the working directory.

          But it's too late: MultiResourceItemReader has already been initialized with an empty Resource[], so it finds no work, and the job exits.

          What I'm wondering is if I can somehow refresh the properties on MRIR between step 1 and 2? I guess the answer is no. In that case, what else can we do?

          We've got a hokey workaround where we remove Step 1 from the job, make it a simple POJO with an init() method, and make the MRIR-using step depends-on the POJO. We don't like this, since we are removing the file download from the job and doing significant work (downloading a bunch of big files) during context initialization, which is a bit goofy.

          However, we are stuck on what other workarounds might be available.
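
          For reference, the depends-on workaround described above might be wired roughly like this. It is only a sketch: com.example.FileFetcher and the flatFileReader bean are hypothetical names, and depends-on is placed on the reader bean here (the post puts it on the step), on the assumption that the reader's Resource[] is resolved when the reader bean itself is created.

          ```xml
          <!-- Hypothetical POJO that downloads the files in init(), during context startup. -->
          <bean id="fileFetcher" class="com.example.FileFetcher" init-method="init" />

          <!-- depends-on forces fileFetcher to be initialised before this bean is created,
               so mydata*.csv is resolved after the download has finished. -->
          <bean id="multiResourceReader"
                class="org.springframework.batch.item.file.MultiResourceItemReader"
                depends-on="fileFetcher">
              <property name="delegate" ref="flatFileReader" />
              <property name="resources" value="mydata*.csv" />
          </bean>
          ```

          The drawback is the one already mentioned: the download happens while the application context starts rather than as part of the job.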


          • #6
            Ah, I think I see... because there are no resources when the array is created, it will always be empty, since your first step is what creates them. Now I'm following you. We'll probably have to proxy it in some way, similar to what's done with StepExecutionResourceProxy, so that it doesn't look for the resources until the step starts. Can you create a JIRA issue for this?


            • #7


              • #8
                I have just encountered the same issue with 2.0.2.
                I got around it by setting scope="step" on the MRIR:

                <bean id="multiResourceReader" class="org.springframework.batch.item.file.MultiResourceItemReader" scope="step" >
                	<property name="delegate" ref="passThroughFlatFileReader" />
                	<property name="resources" value="file:///path/file*.csv" />
                	<property name="saveState" value="true" />
                </bean>

                The MRIR will now get initialised at the beginning of the step, so the file pattern is resolved after the earlier steps have run.
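
                One thing that may trip people up: "step" is not a scope the Spring container knows about by default. If the context complains about an unknown scope, a declaration along these lines should register it (a sketch; depending on the Spring Batch version and whether the batch XML namespace is used, this may already be done for you):

                ```xml
                <!-- Registers the "step" scope so that scope="step" beans are created when a step starts. -->
                <bean class="org.springframework.batch.core.scope.StepScope" />
                ```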

                I know it has been a while since the last post, but I figured it would be handy to get it onto the forums.