Transaction Management batch metadata newbie question

  • Transaction Management batch metadata newbie question

    Hello,

    (Database: MySQL)

    In Spring Batch in Action (3rd Edition), under "Considerations on Job Repository Implementations", there is a question: "Can I use a different database for the persistent job repository and my business data?" Part of the reply is:

    'If transactions don't span the two databases, batch execution metadata and business data can get unsynchronized on failure. Data such as skipped items could then become inaccurate, or you could see problems on restart. To make your life easier (and your jobs faster and reliable), store the batch metadata in the same database as the business data'.

    [I don't have the ATM-style scenario where two different databases are involved in a single CUD (create/update/delete) transaction. Multiple tables are updated within one database, but never across databases at the same time.]

    I have a database (db1) on one machine (mc1). I create the BATCH_ tables in db1. I run my batch job, of which one step inserts a file in a blob field in another database (db2) on another machine (mc2). In this scenario do I need JTA? The section above suggests yes, as I run the risk of the BATCH_ tables (mc1) getting out of sync with the transactions on mc2 after a power failure or similar.
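
    To make the risk in this scenario concrete, here is a minimal sketch in plain Java (no Spring, no real databases). The `LocalResource` class, the `processChunk` method, and the order of the two commits are all assumptions made for illustration; they are not how Spring Batch is implemented. The sketch only shows that two independent local commits can diverge when a failure lands between them.

```java
import java.util.ArrayList;
import java.util.List;

public class UnsyncedLocalTransactions {

    /** Hypothetical stand-in for one database with its own local transaction. */
    static class LocalResource {
        final String name;
        final List<String> committed = new ArrayList<>();
        private final List<String> pending = new ArrayList<>();

        LocalResource(String name) {
            this.name = name;
        }

        void write(String row) {
            pending.add(row);
        }

        void commit() {
            committed.addAll(pending);
            pending.clear();
        }

        void rollback() {
            pending.clear();
        }
    }

    /**
     * Writes the business row to db2 and the metadata row to db1, then commits
     * each resource separately. When crashBetweenCommits is true, a failure is
     * simulated after db2 has committed but before db1 has.
     * The commit order is an assumption chosen for the illustration.
     */
    static void processChunk(LocalResource db1Metadata, LocalResource db2Business,
                             String file, boolean crashBetweenCommits) {
        db2Business.write("blob:" + file);
        db1Metadata.write("step metadata for " + file);

        db2Business.commit();          // local transaction on db2
        if (crashBetweenCommits) {
            db1Metadata.rollback();    // uncommitted work on db1 is lost
            throw new IllegalStateException("simulated power failure");
        }
        db1Metadata.commit();          // local transaction on db1
    }

    public static void main(String[] args) {
        LocalResource db1 = new LocalResource("db1 (BATCH_ tables)");
        LocalResource db2 = new LocalResource("db2 (business data)");

        processChunk(db1, db2, "fileA", false);
        try {
            processChunk(db1, db2, "fileB", true);
        } catch (IllegalStateException e) {
            System.out.println("Failure: " + e.getMessage());
        }

        // The two stores now disagree about how much work was done.
        System.out.println(db2.name + " committed rows: " + db2.committed.size());
        System.out.println(db1.name + " committed rows: " + db1.committed.size());
    }
}
```

    After the simulated failure, db2 holds two committed rows while db1 holds only one, so a restart driven by the db1 metadata would attempt fileB again even though it is already in db2.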

    In order to NOT use JTA, I think I would need two batch jobs. The first, using db1, would move the file over to a directory on mc2 (no transaction management needed in this job, as it only transfers files). The second, using db2, would do the insert and would need its own BATCH_ tables in db2 (as transaction management is needed).

    Since, for file inserts alone, I have about 35 Manufacturers and each month 35 new Manufacturer file databases are created, would each one need BATCH_ tables created in it in the local scenario? (I don't like that.)

    If I were using Batch Admin, then instead of having one central place where I could see the state of the jobs that ran (and re-run them, etc.), which JTA would give me, I would need to look at each individual database's Batch Admin web interface if I had no alert system, correct?

    If I continue the logic, though: once the file is in db2, another job is going to process its contents and insert them into yet another database, which may or may not be on the same machine. So I would have to break up the job again to keep it 'local' and repeat the BATCH_ table creation scenario... (it gets messy quickly).

    So I need to use JTA, yes? If so, I have 5 databases that will be used in the course of a job. Do I need 5 XA MySQL drivers, even though 2 of the databases are used only for lookups in the full job?

    [Yet 9.4 Transaction Management Patterns describes the difference between global and local transactions as follows: 'Global transactions are different from local transactions, where only one resource is involved and the application directly communicates with the resource to demarcate transactions'. That seems to suggest my jobs are local and so don't need JTA?]

    Thank you,
    Peter
    Last edited by pgtips; Dec 18th, 2012, 07:04 PM.

  • #2
    Originally posted by pgtips:
    "Hello,
    I have a database (db1) on one machine (mc1). I create the BATCH_ tables in db1. I run my batch job, of which one step inserts a file in a blob field in another database (db2) on another machine (mc2). In this scenario do I need JTA?"

    Yes. To keep two transactional resources in sync, you need to use JTA. The only way around this would be to abandon restartability etc. altogether by using the Map-based repository.

    Originally posted by pgtips:
    "Since for file inserts alone I have about 35 Manufacturers and each month 35 new Manufacturer file dbs are created - each one would need BATCH_ tables created in it in the local scenario? (don't like)"

    If you need restartability, you need to either keep your batch metadata tables in the same database as the business schema or use JTA.
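
    As a rough illustration of what a global transaction adds, here is a minimal sketch of the two-phase commit idea in plain Java. The `Participant` interface, the `InMemoryDatabase` class, and `commitAll` are hypothetical names invented for this sketch; they are not the JTA or XA API, and a real transaction manager also keeps a recovery log to handle a crash between the two phases, which is omitted here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TwoPhaseCommitSketch {

    /** Hypothetical participant; a real XA resource has a richer contract. */
    interface Participant {
        boolean prepare();   // phase 1: vote yes (true) or no (false)
        void commit();       // phase 2, when every participant voted yes
        void rollback();     // phase 2, when any participant voted no
    }

    /** In-memory stand-in for a database taking part in the transaction. */
    static class InMemoryDatabase implements Participant {
        final String name;
        boolean failOnPrepare = false;
        final List<String> committed = new ArrayList<>();
        private final List<String> pending = new ArrayList<>();

        InMemoryDatabase(String name) {
            this.name = name;
        }

        void write(String row) {
            pending.add(row);
        }

        @Override
        public boolean prepare() {
            return !failOnPrepare;
        }

        @Override
        public void commit() {
            committed.addAll(pending);
            pending.clear();
        }

        @Override
        public void rollback() {
            pending.clear();
        }
    }

    /** Commits on every participant or on none. Returns true if committed. */
    static boolean commitAll(List<? extends Participant> participants) {
        for (Participant p : participants) {
            if (!p.prepare()) {
                // One "no" vote aborts the whole transaction.
                for (Participant q : participants) {
                    q.rollback();
                }
                return false;
            }
        }
        for (Participant p : participants) {
            p.commit();
        }
        return true;
    }

    public static void main(String[] args) {
        InMemoryDatabase db1 = new InMemoryDatabase("db1 (BATCH_ tables)");
        InMemoryDatabase db2 = new InMemoryDatabase("db2 (business data)");

        db1.write("step metadata for fileA");
        db2.write("blob:fileA");
        System.out.println("fileA committed: " + commitAll(Arrays.asList(db1, db2)));

        db2.failOnPrepare = true;   // simulate db2 being unable to commit
        db1.write("step metadata for fileB");
        db2.write("blob:fileB");
        System.out.println("fileB committed: " + commitAll(Arrays.asList(db1, db2)));

        System.out.println("db1 rows: " + db1.committed.size());
        System.out.println("db2 rows: " + db2.committed.size());
    }
}
```

    In this sketch both stores end with one committed row each, because the failed second attempt is rolled back on both sides. The extra prepare round is also one source of the overhead usually attributed to XA.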

    My question would be...why is JTA an issue?


    • #3
      Thank you for clarifying the transaction options.

      I have no experience with Spring or Spring Batch, so I am relying heavily on Spring Batch in Action and the Spring Batch Reference Documentation for both understanding and 'experience'. In the book, the writers state:

      Make no mistake: global transactions are tricky. First, the configuration can be difficult. Second, some implementations (transaction managers and XA drivers) remain buggy. Third, XA is inherently slower than local transactions because the strong transactional guarantees it provides imply some overhead.

      They then shortly afterward say:

      We're not saying that using JTA for global transactions is a bad solution. It provides strong guarantees, but they come at a price. JTA has the advantage of working in all cases, as long as you meet its requirements: a transaction manager and XA drivers.

      So the first paragraph (tricky, difficult, buggy) persuaded me, as a newbie, to avoid JTA. I see from your question that I was mistaken.

      Thank you,
