Announcement Announcement Module
Collapse
No announcement yet.
Criteria Query returning Duplicate Records / Objects when using Eager Fetching Page Title Module
Move Remove Collapse
X
Conversation Detail Module
Collapse
  • Filter
  • Time
  • Show
Clear All
new posts

  • Criteria Query returning Duplicate Records / Objects when using Eager Fetching

    I have an application that generates and stores directed graph data, then needs to reconstruct graphs from persisted data and perform analysis on them.

    When I try to read in and graph data (nodes and links) using a Criteria Query containing nested joins I'm getting duplicate records/objects returned from the database.

    My code, and console output of results are included below.

    Why is GORM returning duplicate Node objects? I thought that including the outLinks collection in my criteria enabled eager fetching, not a SQL JOIN type of result set.

    Why are additional (N+1) queries being executed when I access a Link in the outLinks collection (and also if I access the link.targetNode.id that an outLinks Link refers to)?

    I had this working better using Node.findAllByDateCreatedBetween(startDate, endDate, [ Criteria Map ]) and using other LAZYish methods (e.g. requiring one additional query per Node to get its outLinks collection), but I couldn't get it to do eager fetching of the outLinks collection and their link.targetNode objects, which is what I'm trying to accomplish.

    I really want to optimize this process of graph reconstruction because I'm dealing with very large graphs, on the order of 10^8 nodes (and an order of magnitude or two more links).

    Code:
    ***** Grails Domain Classes ********************
    
    package graphanalysis
    
    class Node {
    
        NodeCreationTask creationTask
        String name
        Date dateCreated
        
        static belongsTo = NodeCreationTask
        
        static mapping = {
            table: 'nodes'
            id generator: 'sequence'
            version false
            creationTask index: 'CreationTask_Name_Idx,CreationTask_Idx'
            name index: 'creationTask_Name_Idx,Name_Idx'
            dateCreated index: 'DateCreated_Idx'
        }
    
        static hasMany = [outLinks: Link,
                          inLinks: Link,
                          comunities: NodeCommunityMap]
        
        static mappedBy = [outLinks: 'sourceNode',
                           inLinks: 'targetNode',
                           comunities: 'node']
    
        static constraints = {
            name maxSize: 1024, unique: ['creationTask']
            dateCreated maxSize: 13
        }
    
    }
    
    class Link {
    
        Node sourceNode
        Node targetNode
        
        static belongsTo = Node
    
        static mapping = {
            table: 'links'
            id generator: 'sequence'
            version false
            sourceNode index: 'Source_Idx'
            targetNode index: 'Target_Idx'
        }
        
        static constraints = {
            targetNode unique: ['sourceNode']
        }
    }
    
    class NodeCreationTask {
        ...
    }
    
    class NodeCommunityMap {
        ...
    }
    
    
    ***** Grails Service ********************
    
    class CreateGraphService {
        
        def sessionFactory
        def propertyInstanceMap = org.codehaus.groovy.grails.plugins.DomainClassGrailsPlugin.PROPERTY_INSTANCE_MAP
        
        def run(Date startDate, Date endDate) {
            
            def PAGE_SIZE = 10
            def curOffset = 0
            def List<Node> nodes
            def totNodes = 0
            
            // create criteria instance
            def criteria = Node.createCriteria()
    
            // get result set 
            nodes = criteria.list {
                between('dateCreated', startDate, endDate)
                maxResults PAGE_SIZE
                firstResult curOffset
                join outLinks {
                    join targetNode {}
                }
            }
            
            // -- do pagination here when ready --
            
            // count the results
            println( ">>>>> Criteria Query has " + webNodes.size() + " results <<<<<" )
    
            // iterate through results
            for (node in nodes) {
                def size = node.outLinks.size()
                println node
                println "  has " + size + " outlinks to:"
                if (size > 0) {
                    node.outLinks.each { link ->
                        println "    --> [ " + link.targetNode + " ]"
                    }
                }
                ++totNodes
            }
            
            // -- loop pagination here when ready --
    
            log.info("Loaded a total of " + totNodes + " nodes for the provided date range")
        }
    }
    
    
    ***** Console output ********************
    
    Hibernate: select this_.id as id11_2_, this_.date_created as date2_11_2_, this_.name as name11_2_, this_.node_creation_task_id as nodecreation4_11_2_, outlinks_a1_.id as id0_0_, outlinks_a1_.source_node_id as source2_0_0_, outlinks_a1_.target_node_id as target3_0_0_, targetnode2_.id as id11_1_, targetnode2_.date_created as date2_11_1_, targetnode2_.name as name11_1_, targetnode2_.node_creation_task_id as web4_11_1_ from node this_ inner join link outlinks_a1_ on this_.id=outlinks_a1_.source_node_id inner join node targetnode2_ on outlinks_a1_.target_node_id=targetnode2_.id where this_.date_created between ? and ? limit ?
    >>>>> Criteria Query has 10 results <<<<<
    Hibernate: select outlinks0_.source_node_id as source2_11_1_, outlinks0_.id as id1_, outlinks0_.id as id0_0_, outlinks0_.source_node_id as source2_0_0_, outlinks0_.target_node_id as target3_0_0_ from link outlinks0_ where outlinks0_.source_node_id=?
    graphanalysis.Node : 6495462
      has 5 outlinks to:
    Hibernate: select node0_.id as id11_0_, node0_.date_created as date2_11_0_, node0_.name as name11_0_, node0_.node_creation_task_id as web4_11_0_ from node node0_ where node0_.id=?
        --> [ graphanalysis.Node : 6495568 ]
    Hibernate: select node0_.id as id11_0_, node0_.date_created as date2_11_0_, node0_.name as name11_0_, node0_.node_creation_task_id as web4_11_0_ from node node0_ where node0_.id=?
        --> [ graphanalysis.Node : 6495527 ]
    Hibernate: select node0_.id as id11_0_, node0_.date_created as date2_11_0_, node0_.name as name11_0_, node0_.node_creation_task_id as web4_11_0_ from node node0_ where node0_.id=?
        --> [ graphanalysis.Node : 6495617 ]
        --> [ graphanalysis.Node : 6495502 ]
    Hibernate: select node0_.id as id11_0_, node0_.date_created as date2_11_0_, node0_.name as name11_0_, node0_.node_creation_task_id as web4_11_0_ from node node0_ where node0_.id=?
        --> [ graphanalysis.Node : 6495618 ]
    graphanalysis.Node : 6495462
      has 5 outlinks to:
        --> [ graphanalysis.Node : 6495568 ]
        --> [ graphanalysis.Node : 6495527 ]
        --> [ graphanalysis.Node : 6495617 ]
        --> [ graphanalysis.Node : 6495502 ]
        --> [ graphanalysis.Node : 6495618 ]
    graphanalysis.Node : 6495462
      has 5 outlinks to:
        --> [ graphanalysis.Node : 6495568 ]
        --> [ graphanalysis.Node : 6495527 ]
        --> [ graphanalysis.Node : 6495617 ]
        --> [ graphanalysis.Node : 6495502 ]
        --> [ graphanalysis.Node : 6495618 ]
    graphanalysis.Node : 6495462
      has 5 outlinks to:
        --> [ graphanalysis.Node : 6495568 ]
        --> [ graphanalysis.Node : 6495527 ]
        --> [ graphanalysis.Node : 6495617 ]
        --> [ graphanalysis.Node : 6495502 ]
        --> [ graphanalysis.Node : 6495618 ]
    
    ...
    
    graphanalysis.Node : 6495462
      has 5 outlinks to:
        --> [ graphanalysis.Node : 6495568 ]
        --> [ graphanalysis.Node : 6495527 ]
        --> [ graphanalysis.Node : 6495617 ]
        --> [ graphanalysis.Node : 6495502 ]
        --> [ graphanalysis.Node : 6495618 ]
    29 Jun 2012 13:37:20,419 [INFO ] grails.app.services.graphanalysis.CreateGraphService- Loaded a total of 50 nodes for the provided date range
Working...
X