Announcement Announcement Module
Collapse
No announcement yet.
Spring 3.1 + Spring Data JPA 1.0.2 + Hibernate 3.6.9 Slow Transaction Page Title Module
Move Remove Collapse
X
Conversation Detail Module
Collapse
  • Filter
  • Time
  • Show
Clear All
new posts

  • Spring 3.1 + Spring Data JPA 1.0.2 + Hibernate 3.6.9 Slow Transaction

    I'm facing a problem with a new project. I have a scheduled task (@Scheduled) that keep synchronized two tables of different datasources.

    The source table is on a mssql server and the destination table on a mysql server. I have about 30.000 entities in the source table. In a precedent project I do the same task with quartz and it takes about 80secs.

    In current project it takes about 1500secs, but I cannot find a solution. I'm using same libraries and same libraries version. The difference is that in new project I'm using Spring Data JPA, but in order to exclude that it could be the cause I inject the EntityManager directly in my Service class. Here is the code:

    Code:
    @PersistenceContext(unitName="persistenceUnit")
    EntityManager em;
    
    @PersistenceContext(unitName="persistenceUnitI24")
    EntityManager emI24;
    
    @Transactional(readOnly=false)
    @Scheduled(cron="0 20 * * * ?")
    public void synk() throws DataAccessException {
        Session session = (Session) emI24.unwrap(Session.class);
        ScrollableResults customersI24 = session.createQuery("from CustomerI24")
            .setCacheMode(CacheMode.IGNORE)
            .scroll(ScrollMode.FORWARD_ONLY);;
    
        LOG.info("Begin.");
        long startTime = System.currentTimeMillis();
        int count = 0;
        int added = 0;
        int updated = 0;
        boolean isNew;
    
        while (customersI24.next()){
            isNew = false;      
            CustomerI24 customerI24 = (CustomerI24)customersI24.get(0);
    
            Customer customer = null;
            try{ 
                em.createQuery("from Customer c where c.idCustomer = :idCustomer",Customer.class)
                    .setParameter("idCustomer", customerI24.getId())
                    .getSingleResult();
            }catch(NoResultException e){}
    
            if (customer == null){
                isNew = true;
                customer = new Customer();
            }
    
            copyProperties(customerI24, customer);
    
            if(isNew){
                added ++;
                em.persist(customer);
            }else{
                updated ++;
                em.merge(customer);
            }   
    
            if ( ++count % 20 == 0 ) {
                //flush a batch of updates and release memory:
                session.flush();
                session.clear();
            }
        }
        session.flush();
        session.clear();
        LOG.info("Added: "+ added + " - updated: "+ updated);
        LOG.info("Elsapsed time:"+((System.currentTimeMillis() - startTime)/1000));
    }
    This is my persistence.xml file:

    Code:
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <persistence xmlns="http://java.sun.com/xml/ns/persistence" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="2.0" xsi:schemaLocation="http://java.sun.com/xml/ns/persistence http://java.sun.com/xml/ns/persistence/persistence_2_0.xsd">
        <persistence-unit name="persistenceUnit" transaction-type="RESOURCE_LOCAL">
            <provider>org.hibernate.ejb.HibernatePersistence</provider>        
            <class>...</class>
            <exclude-unlisted-classes />
            <properties>
                <property name="hibernate.dialect" value="org.hibernate.dialect.MySQL5InnoDBDialect"/>
                <property name="hibernate.hbm2ddl.auto" value="update"/> 
                <property name="hibernate.ejb.naming_strategy" value="org.hibernate.cfg.ImprovedNamingStrategy"/>
                <property name="hibernate.connection.charSet" value="UTF-8"/>
                <property name="hibernate.jdbc.batch_size" value="20"/>
                <property name="hibernate.show.sql" value="true" />
    
                <property name="hibernate.ejb.event.post-insert" value="org.hibernate.ejb.event.EJB3PostInsertEventListener,org.hibernate.envers.event.AuditEventListener" />
                <property name="hibernate.ejb.event.post-update" value="org.hibernate.ejb.event.EJB3PostUpdateEventListener,org.hibernate.envers.event.AuditEventListener" />
                <property name="hibernate.ejb.event.post-delete" value="org.hibernate.ejb.event.EJB3PostDeleteEventListener,org.hibernate.envers.event.AuditEventListener" />
                <property name="hibernate.ejb.event.pre-collection-update" value="org.hibernate.envers.event.AuditEventListener" />
                <property name="hibernate.ejb.event.pre-collection-remove" value="org.hibernate.envers.event.AuditEventListener" />
                <property name="hibernate.ejb.event.post-collection-recreate" value="org.hibernate.envers.event.AuditEventListener" />
    
                <property name="org.hibernate.envers.audit_table_suffix" value="_H" />
                <property name="org.hibernate.envers.revision_field_name" value="AUDIT_REVISION" />
                <property name="org.hibernate.envers.revision_type_field_name" value="ACTION_TYPE" />
                <property name="org.hibernate.envers.audit_strategy" value="org.hibernate.envers.strategy.ValidityAuditStrategy" />
                <property name="org.hibernate.envers.audit_strategy_validity_end_rev_field_name" value="AUDIT_REVISION_END" />
                <property name="org.hibernate.envers.audit_strategy_validity_store_revend_timestamp" value="true" />
                <property name="org.hibernate.envers.audit_strategy_validity_revend_timestamp_field_name" value="AUDIT_REVISION_END_TS" />
    
            </properties>         
        </persistence-unit>    
        <persistence-unit name="persistenceUnitI24" transaction-type="RESOURCE_LOCAL">
            <provider>org.hibernate.ejb.HibernatePersistence</provider>
            <class>...</class>
            <exclude-unlisted-classes />
            <properties>
                <property name="hibernate.dialect" value="org.hibernate.dialect.SQLServerDialect"/>
                <property name="hibernate.hbm2ddl.auto" value="validate"/>
                <property name="hibernate.connection.charSet" value="UTF-8"/>
                <property name="hibernate.showsql" value="true"/>           
            </properties>
        </persistence-unit> 
    </persistence>
    Here definition of mysql datasource:
    
    <bean id="dataSource" class="org.apache.commons.dbcp.BasicDataSource" destroy-method="close">
                <property name="driverClassName" value="com.mysql.jdbc.Driver" />
                <property name="url" value="jdbc:mysql://xxx:3306/xxx?autoReconnect=true" />
                <property name="username" value="..." />
                <property name="password" value="..." />
                <property name="validationQuery" value="select 1" />
                <property name="testWhileIdle" value="true" />
                <property name="testOnBorrow" value="true" />
                <property name="testOnReturn" value="false" />
                <property name="initialSize" value="5" />
                <property name="maxActive" value="10" />
                <property name="maxIdle" value="2" />
            </bean>
    And the jpa tx config:

    Code:
    <bean id="transactionManager" class="org.springframework.orm.jpa.JpaTransactionManager">
            <property name="entityManagerFactory" ref="emf"/>
        </bean>
    
        <tx:annotation-driven transaction-manager="transactionManager" />
    
        <bean id="emf" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
            <property name="dataSource" ref="dataSource" />
            <property name="persistenceUnitName" value="persistenceUnit"/>
        </bean>
    
        <jpa:repositories base-package="it.cpmapave.frigoristi.repository"
                          entity-manager-factory-ref="emf"
                          transaction-manager-ref="transactionManager"/>
    
        <jpa:auditing auditor-aware-ref="auditorAwareBean"/>
    
        <bean id="auditorAwareBean" class="it.cpmapave.frigoristi.auditor.AuditorAwareBean"/>
    Have any of you ever had this problem? Do you have any idea on how to solve it?

    thanks Marco
    Last edited by mserioli; Jan 17th, 2013, 05:30 AM.

  • #2
    Ehrm you have an entitymanager but revert to using hibernate ?! WHy? What is wrong with directly using the entitymanager (i ran into more issues that it is worth when using hibernate over an entitymanager). I would start with changing that.

    Also you have hibernate Envers which adds overhead the same for the Auditing stuff from spring data.

    Your flushing and clearing the wrong stuff... You should be flushing/clearing the writing side because now everytime you write something each object in the cache is checked and this is taking longer and longer depending on the amount of objects in the first level cache.

    Comment


    • #3
      Originally posted by Marten Deinum View Post
      Ehrm you have an entitymanager but revert to using hibernate ?! WHy? What is wrong with directly using the entitymanager (i ran into more issues that it is worth when using hibernate over an entitymanager). I would start with changing that.

      Also you have hibernate Envers which adds overhead the same for the Auditing stuff from spring data.

      Your flushing and clearing the wrong stuff... You should be flushing/clearing the writing side because now everytime you write something each object in the cache is checked and this is taking longer and longer depending on the amount of objects in the first level cache.
      Thank you for your answer!

      I'm using Hibernate in order to get an iterator that allows moving around results. I don't know how to do it with entity manager! But in the original attempt I didn't using hibernate directly. I had wildly calling findAll on my jpa repository to get the entire list (and it takes about 40~50seconds to populate the list!)


      Envers is disabled on the involved entity.. But in other project (where same kind of transaction is more fast), with envers enabled it takes about 50 ~ 60 seconds for the same number of entity (not 1500!!)!

      I don't understand what you mean about flushing/clearing the writing side. Could you tell me in other words what should I do?

      thanks again for the answer!

      Comment


      • #4
        I don't understand what you mean about flushing/clearing the writing
        Just as I say..

        You are reading using hibernate (the session) and this is what you flush/clear whereas you should flush/clear the entitymanager. As mentioned each time you call persist on the entitymanager each object already persisted and available is checked if it needs persisting. With 10 objects this isn't problematic however with 20000 this check adds up and the time expands on each persist/merge operation...

        Comment


        • #5
          oooh now I've understand (session.flush() -> em.flush(). I change my code into:

          Code:
          @PersistenceContext(unitName="persistenceUnit")
          EntityManager em;
          
          @PersistenceContext(unitName="persistenceUnitI24")
          EntityManager emI24;
          
          @Transactional(readOnly=false)
          @Scheduled(cron="0 20 * * * ?")
          public void synk() throws DataAccessException {
              Session session = (Session) emI24.unwrap(Session.class);
              ScrollableResults customersI24 = session.createQuery("from CustomerI24")
                  .setCacheMode(CacheMode.IGNORE)
                  .scroll(ScrollMode.FORWARD_ONLY);;
          
              LOG.info("Begin.");
              long startTime = System.currentTimeMillis();
              int count = 0;
              int added = 0;
              int updated = 0;
              boolean isNew;
          
              while (customersI24.next()){
                  isNew = false;      
                  CustomerI24 customerI24 = (CustomerI24)customersI24.get(0);
          
                  Customer customer = null;
                  try{ 
                      em.createQuery("from Customer c where c.idCustomer = :idCustomer",Customer.class)
                          .setParameter("idCustomer", customerI24.getId())
                          .getSingleResult();
                  }catch(NoResultException e){}
          
                  if (customer == null){
                      isNew = true;
                      customer = new Customer();
                  }
          
                  copyProperties(customerI24, customer);
          
                  if(isNew){
                      added ++;
                      em.persist(customer);
                  }else{
                      updated ++;
                      em.merge(customer);
                  }   
          
                  if ( ++count % 20 == 0 ) {
                      //flush a batch of updates and release memory:
                      em.flush();
                      em.clear();
                  }
              }
              em.flush();
              em.clear();
              LOG.info("Added: "+ added + " - updated: "+ updated);
              LOG.info("Elsapsed time:"+((System.currentTimeMillis() - startTime)/1000));
          }
          I think I found the problem.

          The query:
          Code:
          from Customer c where c.idCustomer = :idCustomer
          idCustomer is not Customer table's primary key. The table primary key is id.

          I have this layout because Customers imported from external database and Customers created by webapp need to coexist! And during scheduled task I try to hook customer defined in webapp to customer defined in external db searching it by a unique and common field.

          The solution is easy.. yust create an @Index on column

          sorry for the banality of the problem...

          Marco

          Comment


          • #6
            Both changes should improve the speed. Also you might consider creating the query only once and reuse. Also if you don't really do anything with the entities you can use merge always also already loaded entities are automatically managed so a merge wouldn't be needed.

            Code:
            EntityManager emI24;
            
            @Transactional(readOnly=false)
            @Scheduled(cron="0 20 * * * ?")
            public void synk() throws DataAccessException {
                Session session = (Session) emI24.unwrap(Session.class);
                ScrollableResults customersI24 = session.createQuery("from CustomerI24")
                    .setCacheMode(CacheMode.IGNORE)
                    .scroll(ScrollMode.FORWARD_ONLY);;
            
                LOG.info("Begin.");
                long startTime = System.currentTimeMillis();
                int count = 0;
                int added = 0;
                int updated = 0;
            
                while (customersI24.next()){
                    isNew = false;      
                    CustomerI24 customerI24 = (CustomerI24)customersI24.get(0);
            
                    Customer customer = null;
                    Query<Customer> query = em.createQuery("from Customer c where c.idCustomer = :idCustomer",Customer.class)
                    try{ 
                        customer = query.setParameter("idCustomer", customerI24.getId()).getSingleResult();
                    }catch(NoResultException e){}
            
                    if (customer == null){
                        added++;
                        customer = new Customer();
                    } else {
                    	updated++;
                    }
            
                    copyProperties(customerI24, customer);
            
            		if (!em.contains(customer) ) {
            			em.merge(customer);
            		}
            
                    if ( ++count % 20 == 0 ) {
                        //flush a batch of updates and release memory:
                        em.flush();
                        em.clear();
                    }
                }
                em.flush();
                em.clear();
                LOG.info("Added: "+ added + " - updated: "+ updated);
                LOG.info("Elsapsed time:"+((System.currentTimeMillis() - startTime)/1000));
            }
            Something like the above might help improve it a little more. Also beware that logging adds additional overhead!

            Comment


            • #7
              thank you!

              Now the job is done in 90 seconds on a scrap computer (instead of 1500 seconds of the first attempt).

              The changes you suggested in previous post save about 20 seconds.

              Thank You!!

              Comment

              Working...
              X