Lazy/Eager related data and Spring Data Repos

  • Lazy/Eager related data and Spring Data Repos

    OK, so I am working on my repositories and I am getting a little confused.

    I want some queries that just go from a domain object to a list of related domain objects mapped with @RelatedTo.

    Say I load one object, such as a User, which has a list of Events they are hosting and another list of Events they are attending, and I want to get one of those lists of Events. I am not using @Eager, as I don't want them eagerly fetched. Do I have to create a Cypher query in my User repository, or, if I have the User and call getEventsHosting(), will it lazy-load the Events the way JPA does with proxies?

    And what if I want to go the other way, where I have the Event and want the list of Users attending? Would I put this query in the User repository or the Event repository? It returns a List of Users. Does it depend on the direction of the relationship?

    Thanks for the help.


  • #2
    You can use template.fetch(user.getEventsHosting()) which should load the events.

    Right now there is no lazy loading with proxies. I'm scared to go down the JPA route because of the ever-increasing complexity in this part of the mapping (cascading reads, updates and deletes).




    • #3
      Yeah, I agree: once you start trying to create your own ORM tool, you find yourself doing more and more work. I wrote one long, long ago in VB 6.0 and spent years maintaining it and adding new features.



      • #4
        Which would be faster: using template.fetch(), which would probably be a line in one of my Service classes, or creating a repository interface method with a Cypher query? I would assume fetch creates a Cypher query, probably the same one I would put on my repo interface method.

        OK, as I was posting, I went and did some very basic testing. It is small scale, so results at larger scale could be quite different, but this is what I found. I have a JUnit integration test that queries three users for their friends, one node away. The three users are friends amongst themselves: auser friends cuser and buser friends cuser, so cuser is friends with both auser and buser.

        Here is that test code:

            @Test
            public void testFindFriends() {
                User aUser = userRepository.findByLogin("auser");
                User bUser = userRepository.findByLogin("buser");
                User cUser = userRepository.findByLogin("cuser");

                // cuser is friends with both auser and buser
                List<User> friends = userRepository.findFriends(cUser, new PageRequest(0,10)).getContent();
                //Set<User> friends = template.fetch(cUser.getFriends());
                System.out.println("cusers friends: " + friends);
                assertEquals("cUser should have two friends", 2, friends.size());
                assertTrue("cuser should be friends with buser", friends.contains(bUser));
                assertTrue("cuser should be friends with auser", friends.contains(aUser));

                // auser is friends with cuser only
                friends = userRepository.findFriends(aUser, new PageRequest(0,10)).getContent();
                //friends = template.fetch(aUser.getFriends());
                System.out.println("ausers friends: " + friends);
                assertEquals("aUser should have one friend", 1, friends.size());
                assertTrue("auser one friend should be cuser", friends.contains(cUser));

                // buser is friends with cuser only
                friends = userRepository.findFriends(bUser, new PageRequest(0,10)).getContent();
                //friends = template.fetch(bUser.getFriends());
                System.out.println("busers friends: " + friends);
                assertEquals("bUser should have one friend", 1, friends.size());
                assertTrue("buser one friend should be cuser", friends.contains(cUser));
            }
        When running it via a repo method:

            @Query("start user=node({0}) " +
                   "match (user)-[:FRIEND]-(friends) " +
                   "return friends " +
                   "order by friends.lastName asc")
            public Page<User> findFriends(User user, Pageable page);

        I saw unit test run times between 4.2 ms and 4.5 ms.

        When I tried the same test using template.fetch(aUser.getFriends()) etc., some runs were faster, but the times were spread over a wider range, and some were slower than with the repo method.

        So with the template I had run times between 4.0 ms and 5.0 ms.

        Remember, metrics like this should always be taken with a grain of salt. I tried to run each variant at least 20 times, but the sample size and the data set are small.




        • #5
          Runtime differences can be caused by whether the data is already in the hot cached dataset, i.e. whether it is the first or a subsequent run that touches that data.

          It probably just falls within the error margin of micro-benchmarks: too many other, longer-running things are happening at the same time, and they affect your data retrieval. You also still have the Spring startup + Test runner overhead.

          If you want to see what the maximum performance is, write a micro-benchmark app that uses the native Neo4j API, let it run for a while to warm up, and then measure only the inner loop, executed a few thousand times, so that HotSpot has kicked in.
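
A minimal sketch of the warm-up-then-measure pattern described above, in plain Java. The class name `MicroBench`, the method `averageNanos`, the iteration counts, and the summing workload are all hypothetical placeholders; the workload stands in for whatever native Neo4j call you actually want to time.

```java
import java.util.function.Supplier;

/** Warm-up-then-measure harness. The workload is a stand-in, not a Neo4j call. */
final class MicroBench {

    /**
     * Runs the task warmupRuns times without timing it, then returns the
     * mean nanoseconds per call over measuredRuns timed calls.
     */
    static <T> double averageNanos(Supplier<T> task, int warmupRuns, int measuredRuns) {
        if (measuredRuns <= 0) {
            throw new IllegalArgumentException("measuredRuns must be positive");
        }
        long sink = 0; // consume every result so the JIT cannot discard the work

        // Warm-up phase: gives the JIT a chance to compile the hot path.
        for (int i = 0; i < warmupRuns; i++) {
            sink += System.identityHashCode(task.get());
        }

        // Measured phase: only this loop is timed.
        long start = System.nanoTime();
        for (int i = 0; i < measuredRuns; i++) {
            sink += System.identityHashCode(task.get());
        }
        long elapsed = System.nanoTime() - start;

        if (sink == 42) {
            System.out.print(""); // keeps 'sink' observable
        }
        return (double) elapsed / measuredRuns;
    }

    public static void main(String[] args) {
        // Hypothetical workload: replace with the traversal you want to measure.
        Supplier<Integer> workload = () -> {
            int sum = 0;
            for (int i = 0; i < 1_000; i++) {
                sum += i;
            }
            return sum;
        };
        double nanos = averageNanos(workload, 10_000, 10_000);
        System.out.printf("average: %.1f ns per call%n", nanos);
    }
}
```

Because only the inner loop is timed, Spring startup and test runner overhead stay out of the result. For serious measurements a dedicated benchmark harness is likely a better choice than a hand-rolled loop like this one.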



          • #6
            All the numbers include the "Spring startup + Test runner overhead" you mention, so that is consistent. I didn't mean to post those numbers as how long Neo4j takes to do a simple query. They all include other overheads, but those overheads are the same for every run. So the numbers were more interesting in terms of the variance and the difference between them, not the actual values themselves.

            All the tests have the same overhead; the only difference between them is the use of a Cypher query in the Repository versus the template.fetch method. With the template I had a wider range of times, whereas the Cypher query gave a tighter range.
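
To compare two approaches by spread rather than by a single number, the run times can be summarised with a mean, a range, and a standard deviation. The following is only a sketch: `RunSpread` and its methods are hypothetical names, and the timings are read from the command line so that no measurements are assumed here.

```java
import java.util.Arrays;

/** Summarises repeated run times so approaches can be compared by spread. */
final class RunSpread {

    /** Arithmetic mean of the samples. */
    static double mean(double[] samples) {
        return Arrays.stream(samples).average()
                .orElseThrow(() -> new IllegalArgumentException("no samples"));
    }

    /** Range: slowest run minus fastest run. */
    static double range(double[] samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        double min = Arrays.stream(samples).min().getAsDouble();
        double max = Arrays.stream(samples).max().getAsDouble();
        return max - min;
    }

    /** Sample standard deviation (n - 1 in the denominator). */
    static double stdDev(double[] samples) {
        if (samples.length < 2) {
            throw new IllegalArgumentException("need at least two samples");
        }
        double m = mean(samples);
        double sumSq = 0;
        for (double s : samples) {
            sumSq += (s - m) * (s - m);
        }
        return Math.sqrt(sumSq / (samples.length - 1));
    }

    public static void main(String[] args) {
        if (args.length < 2) {
            System.out.println("usage: java RunSpread <time> <time> [...]");
            return;
        }
        // Each argument is one measured run time, in whatever unit you recorded.
        double[] samples = Arrays.stream(args).mapToDouble(Double::parseDouble).toArray();
        System.out.printf("mean=%.3f range=%.3f stddev=%.3f%n",
                mean(samples), range(samples), stdDev(samples));
    }
}
```

Running it once with the repository timings and once with the template timings gives two sets of figures that can be compared side by side. With only around 20 runs each, small differences in these figures may well be noise.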



            • #7
              template.fetch just iterates over the collection and fetches each of the items. So the variance is perhaps related to GC or to other things happening in the meantime?
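
The "iterate and fetch each item" behaviour described above can be pictured with a small plain-Java model. This is an illustration only and not Spring Data Neo4j code: `PerItemFetchSketch`, `User`, `Store`, and `fetchEach` are hypothetical names, and an in-memory map stands in for the graph database.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Illustration only: per-item loading of a collection of unloaded stubs. */
final class PerItemFetchSketch {

    /** Hypothetical entity: 'name' stays null until the entity has been loaded. */
    static final class User {
        final long id;
        String name;

        User(long id) {
            this.id = id;
        }
    }

    /** Hypothetical backing store that counts the single-item lookups it serves. */
    static final class Store {
        private final Map<Long, String> namesById;
        int lookups = 0;

        Store(Map<Long, String> namesById) {
            this.namesById = namesById;
        }

        /** Fills in the fields of one stub; counts as one lookup. */
        void load(User user) {
            lookups++;
            user.name = namesById.get(user.id);
        }
    }

    /** Walks the collection and loads each element individually. */
    static Set<User> fetchEach(Store store, Set<User> stubs) {
        for (User stub : stubs) {
            store.load(stub);
        }
        return stubs;
    }

    public static void main(String[] args) {
        Map<Long, String> names = new HashMap<>();
        names.put(1L, "auser");
        names.put(2L, "buser");
        Store store = new Store(names);

        Set<User> friends = new LinkedHashSet<>();
        friends.add(new User(1L));
        friends.add(new User(2L));

        fetchEach(store, friends);
        System.out.println("lookups: " + store.lookups);
    }
}
```

In this model the number of lookups grows with the size of the collection, whereas a single query returns all the related items in one call. Whether that explains the wider timing range seen in the thread is not something this sketch can confirm.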