Announcement Announcement Module
Collapse
No announcement yet.
distributed transactions vs database replication Page Title Module
Move Remove Collapse
X
Conversation Detail Module
Collapse
  • Filter
  • Time
  • Show
Clear All
new posts

  • distributed transactions vs database replication

    hi I have a couple of data access questions - if you feel these questions are inappropriate for this forum please suggest another more appropriate one - thanks.

    i'm developing a website (using spring) that sells services on behalf of independent businesses (tickets to local events,gift coupons for services (beauty treatments etc))
    my first question is this : would it be better to have one large database with all the businesses information in or would it be better to have a seperate database for each business. One large database would give better performance but you would have to restrict access to tables(to ensure one business does not update anothers details). Many smaller databases would be easier to set up but the performance would be worse as you would have to use distributed transactions.

    second question : i would like to keep some information locally at each business so they can (at least) verify the purchase (speed of verification is important), and perhaps enhance the service using the customer's profile (language preference etc). i am unsure of the strategy i should use to populate/maintain these 'local' databases (distributed across a WAN). should this be performed in the data layer by the database (using features such as replication) or should this be performed in the application layer (resulting in a distributed transaction)?my gut would say if there is any business logic involved it should be in the application layer, but i dont want to create distributed transactions 'just because i can'.

    if you have experience in this area your two cents would be like gold to me ;-)

  • #2
    Re your first question...

    When you say multiple businesses, how many are you talking about? 20 or 30, or 20,000 or 30,000? Will these businesses be self-provisioned (ie they setup themselves) or will your system administrator be manually configuring each? Will you need to provide separate webapps for each, or will you be able to get away with a single webapp for all (perhaps with individualised styling/branding)? Will the total data storage exceed the capacity of one database server (favouring separate databases) or is it only going to be comfortably storable on one database server? Or do you have budget for a database like Oracle RAC? Will customers be able to write their own SQL queries (or otherwise have access to the raw database tables) or will they be using pre-written reports etc?

    At different stages I've used each option (shared DB vs separate DBs). Presently I prefer shared DBs, with a new DB for each cluster (each cluster comprising one database server combined with several web servers running the same single webapp). If you're going to deploy separate webapps for each customer, probably you're best off deploying separate DBs as well. In a shared database situation you can achieve good application-level data partitioning using Acegi Security's ACL capabilities. Obviously if customers can directly access the database for reporting etc this will not help, and you'll need to resort to views or database-level security.

    Re question two...

    There is a book called Enterprise Integration Patterns which discusses at length approaches to sharing data, in particular using messaging. Of course you can also use database replication but in my experience it's prone to problems over relatively unreliable Internet connections as subscriptions expire etc. This of course is all configurable, but there are better solutions such as messaging. You're only taking about a small amount of data after all. Indeed you might even be able to write a client application using Spring Rich Client which polls a web service to download new information, if you wanted to avoid the messaging infrastructure and you had relatively few topics. It all depends on your data, how often it's updated, whether you control your customer's database choices or want a self-contained application etc.

    Alternatively, there might be other ways to communicate the information via the token your webapp issues to the user which is presented to the customer's site. For example, you could issue a signed message which the customer site could easily validate using PKI. If the certificate is only redeemable at the customer site, they could maintain a local list of certificates they have redeemed.

    I hope this helps in some way. Specific advice is difficult without really understanding your use cases and technical requirements. You might also find more advice in the Architecture forum.

    Comment


    • #3
      What database are you using?

      If Oracle - you can just use Oracle schema's to separate data and have each business use a different login. You can still access data across schemas if this becomes necessary. You can share a single global connection pool by using programmatic login if this is acceptable.

      Comment


      • #4
        thanks for the replies ben and katentim.
        firstly in response to ben's questions:
        When you say multiple businesses, how many are you talking about?
        the number of businesses would be in the hundreds at best, not thousands.Initially between 60 -100.

        Will these businesses be self-provisioned (ie they setup themselves) or will your system administrator be manually configuring each?
        at the start our system administrator will be required to do most of the setup work this may change as the service matures.

        Will you need to provide separate webapps for each, or will you be able to get away with a single webapp for all (perhaps with individualised styling/branding)?
        the later, 'like' businesses (for example all cinemas) will use the same webapp with individualised styling.

        Will the total data storage exceed the capacity of one database server (favouring separate databases) or is it only going to be comfortably storable on one database server?
        I would like to warehouse data soon after the performance of the service (lets say one month) so the database(s) should stay at a 'managable' size.

        Or do you have budget for a database like Oracle RAC? We intend to provide different levels of service. Businesses with higher enterprise demands will be serviced with better quality enterprise tools (where Oracle RAC might be an option), smaller businesses (such as cinemas) with less demanding needs may opt for a lower (and cheaper) level of service (where open source tools maybe used )

        Will customers be able to write their own SQL queries (or otherwise have access to the raw database tables) or will they be using pre-written reports etc?
        We would want to limit access initially, a client would allow queries within defined parameters.

        Thanks for the book reference - sounds good.i was considering messaging but i though asynchronous messaging wouldnt be possible given real-time restrictions (for example if you buy a ticket to the cinema online those details MUST be at the cinema immediately for verification purposes). the other options you mentioned sounded interesting and i will investigate.
        great reply - thanks

        katentim - thanks for the oracle tip, we'll definately investigate that if we settle on housing multiple businesses in a single database.

        Comment


        • #5
          Sounds like you could live with a single webapp, single database, with messaging to send ticket data to the cinemas that gets absorbed into their local database. Indeed you could write a Spring Rich Client app which polls your webapp for new tickets, writing results into the cinema's local database via JDBC. No messaging at all. Just fine-tune your poll interval so you get the tickets being delivered quickly enough. It's not too hard to put a "messagesWaiting" column in your cinena database so such polls can be answered very quickly. A bonus of this architecture is there is no administrative provisioning required. Each extra client is merely a database row (probably with some config data as well). They can launch the Rich Client app via Java Web Start, login, configure their JDBC connection (which is stored via the Preference API) and it's done. Of course, you'll need to review whether your long-term requirements would be met, but KISS is a time-tested strategy.

          Comment

          Working...
          X