
  • Spring Batch DAOs with MongoDB

    Hi, folks!

    I am using Spring Batch with a NoSQL database, so I couldn't use the standard JDBC DAOs for Spring Batch metadata.

    Here's a MongoDB implementation - http://jbaruch.wordpress.com/2010/04...-spring-batch/
    The new implementation passes all the tests from the dao package.

    I think it can be beneficial in some use-cases, and will be glad to contribute the new implementation to the codebase. Just tell me if you want it and what to do (JIRA issue, probably?)
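For readers new to the idea, here is a minimal sketch of what replacing a JDBC-backed metadata DAO with a document-style one can look like. Everything in it is an assumption made for illustration: the `JobInstanceStore` interface is a hypothetical, simplified contract (not the real Spring Batch DAO interface), and an in-memory map stands in for a MongoDB collection.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class DocumentJobInstanceDaoSketch {

    /** Hypothetical, simplified contract; NOT the real Spring Batch DAO interface. */
    interface JobInstanceStore {
        long create(String jobName, String jobKey);

        /** Returns the id of the matching instance, or null when there is none. */
        Long findId(String jobName, String jobKey);
    }

    /** Keeps each job instance as a key/value "document"; the map stands in for a collection. */
    static class InMemoryDocumentStore implements JobInstanceStore {
        private final Map<Long, Map<String, Object>> collection = new LinkedHashMap<>();
        private long nextId = 1;

        @Override
        public long create(String jobName, String jobKey) {
            // A job instance is identified by its name plus a key derived from its parameters.
            if (findId(jobName, jobKey) != null) {
                throw new IllegalStateException("job instance already exists");
            }
            Map<String, Object> doc = new HashMap<>();
            doc.put("jobName", jobName);
            doc.put("jobKey", jobKey);
            long id = nextId++;
            collection.put(id, doc);
            return id;
        }

        @Override
        public Long findId(String jobName, String jobKey) {
            // Linear scan; a real document store would run a query here instead.
            for (Map.Entry<Long, Map<String, Object>> e : collection.entrySet()) {
                Map<String, Object> doc = e.getValue();
                if (jobName.equals(doc.get("jobName")) && jobKey.equals(doc.get("jobKey"))) {
                    return e.getKey();
                }
            }
            return null;
        }
    }

    public static void main(String[] args) {
        JobInstanceStore store = new InMemoryDocumentStore();
        long id = store.create("importJob", "run=1");
        System.out.println("created instance " + id);
    }
}
```

The point of the sketch is only that callers depend on the interface, so the storage behind it can change without touching them.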

  • #2
    Thanks for the link. NoSql is definitely a future direction for Spring and Spring Batch, but we aren't sure yet how to organize the code since there are so many options out there. Mongo is a nice product but it isn't the only one.

    People are going to want to use noSql for their business data, so we want to try and accommodate that, but it's not clear yet that JobRepository is a good fit for noSql, and there is also the JobExplorer and JobService (from Spring Batch Admin) to consider. The JobRepository in particular will probably get a major overhaul in Spring Batch 3.0, so there is a good opportunity to feed in requirements from noSql there.

    Not all noSql products can support all the use cases from Spring Batch and Spring Batch Admin. Mongo is pretty good in this regard: it probably permits implementation of the Searchable*Dao interfaces from Admin. Some other products would be tougher.

    Transactions are the big sticking point. Spring Batch users basically have to give up some restartability guarantees when using non-transactional databases. The Map implementation we have is transaction synchronized, but not really 100% reliable since there is no transactional resource. I recommend not to use it in production, but that doesn't stop some people. NoSql repositories might follow the same pattern, and maybe if we see some traction we can be more explicit about what the tradeoffs are.

    Does your implementation pass all the tests in the project (including samples)?
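On the restartability point above, one way the guarantee can weaken is when the business write and the metadata (checkpoint) write are not atomic. The following is a toy simulation under stated assumptions, not Spring Batch code: `RestartSketch`, its fields, and `crashThenRestart` are hypothetical names, and plain in-memory fields stand in for a non-transactional store.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RestartSketch {

    /** Stand-in for business data kept in a store without transactions. */
    private final List<String> written = new ArrayList<>();

    /** Stand-in for job metadata: index of the next item to process. */
    private int checkpoint = 0;

    /**
     * Processes items chunk by chunk, resuming from the saved checkpoint.
     * When crashAfterFirstWrite is true, the run dies after the first
     * business write but before the checkpoint is saved.
     */
    void run(List<String> items, int chunkSize, boolean crashAfterFirstWrite) {
        for (int start = checkpoint; start < items.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, items.size());
            written.addAll(items.subList(start, end)); // step 1: business write
            if (crashAfterFirstWrite) {
                throw new IllegalStateException("simulated crash");
            }
            checkpoint = end;                          // step 2: metadata write
        }
    }

    /** Runs once with a crash, then restarts, and returns everything written. */
    static List<String> crashThenRestart(List<String> items, int chunkSize) {
        RestartSketch job = new RestartSketch();
        try {
            job.run(items, chunkSize, true);
        } catch (IllegalStateException expected) {
            // The process died between the two writes; the checkpoint was never saved.
        }
        job.run(items, chunkSize, false); // restart from the stale checkpoint
        return job.written;
    }

    public static void main(String[] args) {
        // The first chunk appears twice because the two writes were not atomic.
        System.out.println(crashThenRestart(Arrays.asList("a", "b", "c", "d"), 2));
    }
}
```

With a transactional resource, both writes would commit or roll back together, so the restart would not repeat the first chunk.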


    • #3
      Indeed, it passes all the tests from the relevant org.springframework.batch.core.repository.dao package.


      • #4
        That wasn't what I asked. Can you switch the repository implementation and still pass all the tests in all the packages and all the modules? Think of it as an integration test. No sweat if you don't have time to try it now - it doesn't seem urgent unless a lot of people are asking for it.


        • #5
          Of course I can't pass all the tests; many of them assert on the state of the metadata tables in the database.
          E.g. org.springframework.batch.core.launch.JobLauncherIntegrationTests checks the BATCH_JOB_INSTANCE table for a new record after launch. Naturally, there won't be a new record.
          I am not sure I will be able to check all 4000 tests for database dependencies. If you could scope the mission to real integration tests, I'll gladly invest the time to make them pass.
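To illustrate the problem with table-level assertions, here is a hedged sketch of a check phrased against an abstraction instead of a SQL table, so the same assertion could hold for any backing store. The names `JobInstanceView`, `FakeRepository`, and `launchCreatesOneInstance` are hypothetical and exist only for this example; they are not Spring Batch APIs.

```java
import java.util.ArrayList;
import java.util.List;

public class StoreAgnosticAssertionSketch {

    /** Hypothetical read-side view of job metadata, independent of the storage used. */
    interface JobInstanceView {
        int countInstances(String jobName);
    }

    /** Hypothetical repository; the list stands in for a table or a collection. */
    static class FakeRepository implements JobInstanceView {
        private final List<String> instances = new ArrayList<>();

        /** Pretends to launch a job by recording one new instance. */
        void launch(String jobName) {
            instances.add(jobName);
        }

        @Override
        public int countInstances(String jobName) {
            int count = 0;
            for (String name : instances) {
                if (name.equals(jobName)) {
                    count++;
                }
            }
            return count;
        }
    }

    /** The check itself: a launch should add exactly one instance, whatever the store. */
    static boolean launchCreatesOneInstance(FakeRepository repo, String jobName) {
        int before = repo.countInstances(jobName);
        repo.launch(jobName);
        return repo.countInstances(jobName) == before + 1;
    }

    public static void main(String[] args) {
        System.out.println(launchCreatesOneInstance(new FakeRepository(), "sampleJob"));
    }
}
```

A test written this way never names BATCH_JOB_INSTANCE, so it does not need rewriting when the repository implementation is swapped.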


          • #6
            Hmm, that does make it difficult. A good starting point would be all the tests in spring-batch-samples/src/test/java/org/springframework/sample. Some of them might make assertions about the business data in a SQL database, but probably not the meta-data.

            What are your restartability requirements in your real system?


            • #7
              Looks good!

              There are 2 tests that assert metadata state in db:
              CustomerFilterJobFunctionalTests.testFilterJob and SkipSampleFunctionalTests.testJobIncrementing.
              After changing those, all 134 tests in spring-batch-samples pass.
              Here is the patch file with all the changes I had to make.
              The springbatch-over-mongodb dependency is the build artifact of the sources on github, and it is deployed to Artifactory at http://repo.jfrog.org

              P.S. Dave, the version update you made in spring-batch-infrastructure contains an extra space (line 5 in pom, line 323 in the patch), and it breaks it.


              • #8
                Originally posted by Dave Syer
                Thanks for the link. NoSql is definitely a future direction for Spring and Spring Batch, but we aren't sure yet how to organize the code since there are so many options out there. Mongo is a nice product but it isn't the only one.

                People are going to want to use noSql for their business data, so we want to try and accommodate that, but it's not clear yet that JobRepository is a good fit for noSql, and there is also the JobExplorer and JobService (from Spring Batch Admin) to consider. The JobRepository in particular will probably get a major overhaul in Spring Batch 3.0, so there is a good opportunity to feed in requirements from noSql there.

                Not all noSql products can support all the use cases from Spring Batch and Spring Batch Admin. Mongo is pretty good in this regard: it probably permits implementation of the Searchable*Dao interfaces from Admin. Some other products would be tougher.

                Transactions are the big sticking point. Spring Batch users basically have to give up some restartability guarantees when using non-transactional databases. The Map implementation we have is transaction synchronized, but not really 100% reliable since there is no transactional resource. I recommend not to use it in production, but that doesn't stop some people. NoSql repositories might follow the same pattern, and maybe if we see some traction we can be more explicit about what the tradeoffs are.

                Does your implementation pass all the tests in the project (including samples)?
                Following up on the above discussion: can someone please provide a status update on this implementation, and say whether it's being considered for NoSQL databases in general? Thanks.
