Need help in designing for my requirement

  • Need help in designing for my requirement

    Hello All,

    I am new to the Spring Framework. I have a requirement to process a huge file, one row at a time, but I am not sure whether Spring Batch will help me achieve this.

    Here is the requirement:
    1. I have a folder, say input-folder, containing one or more CSV files. I need to pick up the oldest file from that folder and move it to another folder, say process-folder.

    2. Pick up that CSV file from process-folder and convert it to an XML file (one XML record per CSV row).

    CSV file data:


    XML file data:


    3. Validate the XML records, one at a time, and put each one onto a JMS queue.

    4. Once all the records from the XML file have been processed successfully, move the CSV file to another folder, say output-folder, and delete the temporary XML file from process-folder.

    5. Repeat steps 1 to 4 indefinitely. If no CSV file is present in input-folder, wait for some time before looking for a file again.

    This program should never stop until it is stopped manually.
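
    To make steps 1 and 5 concrete, here is a minimal plain-Java sketch of what I mean, independent of Spring Batch. The class and method names (OldestFileMover, moveOldest, pollUntilStopped) are made up for illustration, "oldest" is assumed to mean the earliest last-modified time, and the stop condition is passed in as a simple flag. It is only a sketch of the requirement, not a finished design.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Comparator;
import java.util.Optional;
import java.util.function.BooleanSupplier;
import java.util.stream.Stream;

// Hypothetical helper: names and structure are assumptions for illustration only.
final class OldestFileMover {

    /** Returns the .csv file with the earliest last-modified time, if any. */
    static Optional<Path> findOldestCsv(Path inputFolder) throws IOException {
        try (Stream<Path> files = Files.list(inputFolder)) {
            return files
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().toLowerCase().endsWith(".csv"))
                    .min(Comparator.comparingLong(OldestFileMover::lastModified));
        }
    }

    private static long lastModified(Path p) {
        try {
            return Files.getLastModifiedTime(p).toMillis();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Step 1: moves the oldest CSV file to processFolder and returns its new path. */
    static Optional<Path> moveOldest(Path inputFolder, Path processFolder) throws IOException {
        Optional<Path> oldest = findOldestCsv(inputFolder);
        if (!oldest.isPresent()) {
            return Optional.empty();
        }
        Path target = processFolder.resolve(oldest.get().getFileName());
        return Optional.of(Files.move(oldest.get(), target, StandardCopyOption.REPLACE_EXISTING));
    }

    /** Step 5: keeps picking up files until stopRequested returns true; returns the number moved. */
    static int pollUntilStopped(Path inputFolder, Path processFolder,
                                long waitMillis, BooleanSupplier stopRequested)
            throws IOException, InterruptedException {
        int moved = 0;
        while (!stopRequested.getAsBoolean()) {
            if (moveOldest(inputFolder, processFolder).isPresent()) {
                moved++; // steps 2 to 4 would run here for the moved file
            } else {
                Thread.sleep(waitMillis); // no file found, wait before looking again
            }
        }
        return moved;
    }
}
```

    In this sketch the loop lives outside any batch job, so each file could be handled by a fresh job run rather than by one job instance that never completes.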

    I have most of the code for steps 1 to 4 available in bits and pieces. I was wondering whether I can use Spring Batch to integrate the code into one component.

    Here are my doubts:
    1. Can I use Spring Batch for the above scenario? If yes, how should I design the job?

    2. I was planning to create one job for points 1 to 4 with four steps, one for each point. If that is correct, how can I keep the job running forever? I tried running a few sample programs indefinitely, but the Java program terminated every time the job completed. And if I return RepeatStatus.CONTINUABLE, wouldn't that cause memory issues, since the same job instance would be used for multiple files?

    3. How can I pass the output of the 1st step to the 2nd step? One solution I found was "save it in the jobExecutionContext and pass this as parameter to 2nd job". Please correct me if this will not work, or suggest a better option.

    4. I have code for reading the CSV file, parsing it, and converting it to XML using the Super CSV API. But I saw that Spring Batch also supports reading CSV files. Would it be worthwhile to change the code to use Spring Batch for CSV processing instead of Super CSV?
    Note: The CSV file may contain thousands of records with around 20 to 25 columns.

    5. I have one validation for duplicate records. For this I need to compare the current record with the previous and next records (assume the XML file is sorted). But as far as I understand from the Spring Batch documentation, an ItemReader reads one line at a time and processes it before reading the next line. Does Spring Batch support keeping multiple lines of a file in memory? If yes, could you please share some examples?
    Note: I still want to process one record at a time.

    6. Should I use a flat file reader or an XML reader to read records from the XML file? I don't have any requirement to parse the XML records.
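
    To illustrate the duplicate check from doubt 5, here is a small plain-Java sketch of the behaviour I am after. It does not use any Spring Batch classes. The class and method names (AdjacentDuplicateCheck, flagDuplicates) are hypothetical, and I am assuming that "duplicate" means a record equal to its immediate neighbour in the sorted input. Only three records (previous, current, next) are held in memory at any time.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Objects;

// Hypothetical helper: illustrates the requirement only, not a Spring Batch API.
final class AdjacentDuplicateCheck {

    /**
     * Walks sorted records one at a time and returns one flag per record:
     * true if the record equals its previous or next neighbour.
     */
    static <T> List<Boolean> flagDuplicates(Iterator<T> sortedRecords) {
        List<Boolean> flags = new ArrayList<>();
        if (!sortedRecords.hasNext()) {
            return flags;
        }
        T previous = null;
        boolean hasPrevious = false;
        T current = sortedRecords.next();
        while (true) {
            // Read one record ahead so the current one can be compared with it.
            boolean hasNext = sortedRecords.hasNext();
            T next = hasNext ? sortedRecords.next() : null;

            boolean duplicate = (hasPrevious && Objects.equals(previous, current))
                    || (hasNext && Objects.equals(current, next));
            flags.add(duplicate);

            if (!hasNext) {
                break;
            }
            // Slide the three-record window forward by one.
            previous = current;
            hasPrevious = true;
            current = next;
        }
        return flags;
    }
}
```

    For example, calling flagDuplicates on the sorted records a, b, b, c would flag only the two b records. What I do not know is how to get this kind of one-record look-ahead from an ItemReader.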

    I would welcome any further suggestions, as I am still going through the Spring Batch reference documentation to understand all of its features.

    Thanks in advance