Massive Batch Fetch - 20 mn rows.

  • Massive Batch Fetch - 20 mn rows.

    I have a scenario where we need to fetch 20 million rows from Oracle and write them to a data file after processing.
    The application is deployed in a clustered WebLogic environment (2-4 instances).
    What would be the best approach: one process with multiple threads, or parallel processes on multiple managed server instances? Also, since the number of records can vary between a mere 100 and 20 million, the number of threads/processes has to be determined dynamically.
    I do not know much about Spring Batch, but I guess I should be looking at Partitioning.
    Any suggestions?

    Last edited by springinstride; Jul 8th, 2009, 11:44 PM.

  • #2
    Partitioning sounds about right. It is certainly possible (indeed recommended) to dynamically alter the number or size of partitions according to the size of the data set.
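
To illustrate sizing partitions from the data set, here is a minimal sketch in plain Java. It assumes the rows have a numeric, reasonably dense key so that contiguous key ranges hold roughly equal numbers of rows. The names `PartitionPlanner`, `KeyRange`, `targetRowsPerPartition` and `maxPartitions` are hypothetical and not part of Spring Batch; in Spring Batch the resulting ranges would typically be handed out through a `Partitioner` implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: decides how many partitions to use and which key range each gets.
final class PartitionPlanner {

    /** Inclusive range of row keys assigned to one partition. */
    static final class KeyRange {
        final long minId;
        final long maxId;

        KeyRange(long minId, long maxId) {
            this.minId = minId;
            this.maxId = maxId;
        }
    }

    /** Ceiling of totalRows / targetRowsPerPartition, capped at maxPartitions. */
    static int partitionCount(long totalRows, long targetRowsPerPartition, int maxPartitions) {
        if (targetRowsPerPartition <= 0 || maxPartitions <= 0) {
            throw new IllegalArgumentException("target size and max partitions must be positive");
        }
        if (totalRows <= 0) {
            return 0;
        }
        long needed = (totalRows + targetRowsPerPartition - 1) / targetRowsPerPartition;
        return (int) Math.min(needed, maxPartitions);
    }

    /** Splits [minId, maxId] into contiguous ranges whose sizes differ by at most one. */
    static List<KeyRange> split(long minId, long maxId, int partitions) {
        List<KeyRange> ranges = new ArrayList<>();
        if (partitions <= 0 || maxId < minId) {
            return ranges;
        }
        long total = maxId - minId + 1;
        long base = total / partitions;
        long remainder = total % partitions;
        long start = minId;
        for (int i = 0; i < partitions; i++) {
            // The first 'remainder' partitions take one extra key.
            long size = base + (i < remainder ? 1 : 0);
            if (size == 0) {
                continue; // more partitions than keys: skip empty ranges
            }
            ranges.add(new KeyRange(start, start + size - 1));
            start += size;
        }
        return ranges;
    }
}
```

With a target of 500,000 rows per partition and a cap of 16, a 100-row run gets a single partition and a 20 million row run gets 16. Each range can then drive a `WHERE id BETWEEN ? AND ?` query on whichever node or thread picks it up.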

    If you are writing to a single file, isn't that going to be a bottleneck? If you distribute the RDBMS queries, each node/thread will have to produce a separate file. Then, if the desired output is a single file, the partitioned files have to be fetched and concatenated.
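
The concatenation step could be sketched as follows, assuming the part files have already been copied to a location the merging process can read and are passed in the order they should appear in the output. `PartFileMerger` and `concatenate` are hypothetical names; the code uses only `java.nio.file`.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

// Hypothetical helper: appends each partition's output file to one target file.
final class PartFileMerger {

    /** Writes the bytes of every part, in list order, into target (replacing any existing file). */
    static void concatenate(List<Path> parts, Path target) throws IOException {
        try (OutputStream out = Files.newOutputStream(target,
                StandardOpenOption.CREATE,
                StandardOpenOption.TRUNCATE_EXISTING,
                StandardOpenOption.WRITE)) {
            for (Path part : parts) {
                // Streams the part file into the open output without loading it fully into memory.
                Files.copy(part, out);
            }
        }
    }
}
```

This merge is sequential, so it only makes sense when a single output file is a hard requirement; otherwise leaving the output as one file per partition avoids the extra pass.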


    • #3
      Oh, absolutely it will be a bottleneck! The idea is not one physical file, but one or a few per node, depending on the number of nodes and the amount of data.

      Thanks for your suggestion.