Merge multiple input source as one

  • Merge multiple input source as one

    Hi,
    Two batch features that I am interested in are Merge Multiple Inputs and Sort Input. I saw that both have been identified by Spring Batch, but I cannot find out when they will be implemented or released.
    Could you please let me know when these features will be implemented? Thanks.
    • Merge - A program that reads records from multiple input files and produces one output file with combined data from the input files. Merges can be tailored or performed by parameter-driven standard system utilities.
    • Sort - A program that reads an input file and produces an output file where records have been re-sequenced according to a sort key field in the records. Sorts are usually performed by standard system utilities.

  • #2
    It would probably be more efficient to use the "standard system utilities" you mentioned, rather than try to code merge or sort in Java. You can still launch them from Spring Batch if you want to. Or do you need something in Java for a particular reason?



    • #3
      Hi Dave,
      Thank you for your reply.
      I am thinking of this situation: we have two different input sources (one is data in a table, the other is a file). I want to merge both into one input in Spring Batch and then sort the items by PK. Next, I will process all items. In this case, I would need to extract all the data from the table, write it to a file, and then use *system utilities* (cat on Unix) to merge the two into one file.
      Is this the solution you prefer? Is there a Java solution? And what about Windows? As far as I know, Windows does not have this kind of command.
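
For a sense of what a plain Java answer to the question above might look like, here is a minimal sketch that combines records from two sources and sorts them by primary key in memory. It assumes each record is a line of the form `pk,rest` with a numeric key first; the class name `MergeAndSort` and that line format are hypothetical, not part of Spring Batch. It runs on Windows as well, but it only works while both inputs fit in memory.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not a Spring Batch class.
class MergeAndSort {

    // Assumption: every line looks like "pk,rest" with a numeric primary key.
    static long primaryKey(String line) {
        return Long.parseLong(line.substring(0, line.indexOf(',')));
    }

    // Combines the two inputs into one list and sorts it by primary key.
    static List<String> mergeAndSort(List<String> fromTable, List<String> fromFile) {
        List<String> merged = new ArrayList<>(fromTable.size() + fromFile.size());
        merged.addAll(fromTable);
        merged.addAll(fromFile);
        merged.sort(Comparator.comparingLong(MergeAndSort::primaryKey));
        return merged;
    }
}
```

With millions of records this approach would probably need an external (file-based) sort instead.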



      • #4
        Why don't you just load the data into the database and use SQL to sort it then?



        • #5
          If your sorting and merging are going to be fairly straightforward, you can use a shell script and the sort utility (I have never tried it, but the documentation says you can call your script from a Tasklet):
          http://unixhelp.ed.ac.uk/utilities2/sort.html
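
As a rough illustration of launching the sort utility from Java, the sketch below builds a `sort` command line and runs it with `ProcessBuilder`. It is plain Java rather than Spring Batch's Tasklet API, and it assumes a POSIX `sort` is on the PATH; the class and method names (`ExternalSort`, `sortCommand`, `run`) are made up for this example.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not a Spring Batch class.
class ExternalSort {

    // Builds: sort -t <sep> -k <n>,<n>n -o <output> <input>
    // -t sets the field separator, -k N,Nn sorts numerically on field N,
    // and -o names the output file.
    static List<String> sortCommand(Path input, Path output, char separator, int keyField) {
        return Arrays.asList(
            "sort",
            "-t", String.valueOf(separator),
            "-k", keyField + "," + keyField + "n",
            "-o", output.toString(),
            input.toString());
    }

    // Runs the command and returns its exit code (0 normally means success).
    static int run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).inheritIO().start();
        return process.waitFor();
    }
}
```

The same two calls could be made from inside a Tasklet, with a non-zero exit code treated as a failed step.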

          There is also a more powerful tool recommended by another member of this forum, but it has a per-license cost:
          http://forum.springframework.org/sho...ight=sort+file

          File I/O should be faster than Database I/O if you are talking about millions of records.



          • #6
            Thank you very much for your reply.
            Yes, I can load the data into the database and use SQL to sort it, but what about performance? Operating on millions of records in files is much faster than doing so in a database. As far as I know, many ETL tools suggest a job flow like this:
            a) load data from the database -> b) write it to a file -> c) merge or join the files -> d) sort the resulting file
            This improves performance considerably, since there are no database locks and other database processes are not affected.
            What do you think?
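
If each extract is already sorted by key (for example, because the database export used ORDER BY), steps c) and d) can be done in a single streaming pass that never holds more than one line per input in memory. The sketch below shows such a two-way merge; the `pk,rest` line format and the name `StreamingMerge` are assumptions made for illustration only.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;

// Hypothetical helper, not a Spring Batch class.
class StreamingMerge {

    // Assumption: every line looks like "pk,rest" with a numeric primary key.
    static long key(String line) {
        return Long.parseLong(line.substring(0, line.indexOf(',')));
    }

    // Merges two inputs that are each already sorted by key into one sorted output.
    static void merge(Reader left, Reader right, Writer out) throws IOException {
        BufferedReader a = new BufferedReader(left);
        BufferedReader b = new BufferedReader(right);
        String x = a.readLine();
        String y = b.readLine();
        while (x != null || y != null) {
            // Take from the left input when the right is exhausted or the left key is not larger.
            boolean takeLeft = y == null || (x != null && key(x) <= key(y));
            if (takeLeft) {
                out.write(x);
                x = a.readLine();
            } else {
                out.write(y);
                y = b.readLine();
            }
            out.write('\n');
        }
        out.flush();
    }
}
```

In practice the readers would wrap the exported table file and the original input file, and the writer would wrap the merged output file.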



            • #7
              I think you are correct. So it comes back to the original suggestion that I made - if you want it to be as fast as possible, and you want the Spring Batch meta data, you could use a job to launch your ETL tool. If you don't care about the meta data, use the ETL tool directly.



              • #8
                Sorry, what do you mean by *Spring Batch meta data*? Is it JobExecutionContext, StepExecutionContext?

                And I agree with your suggestion: we should use the ETL tool directly if we want it to be as fast as possible.



                • #9
                  The "Spring Batch meta data" here would be JobInstance, JobExecution and StepExecution - basically a log in the database (start & end time, outcome completed/failed etc.)



                  • #10
                    thanks a lot.

