Spring data graph state history

  • Spring data graph state history

    I'm working on a project that requires node state histories to be saved. The use case is that a user may need to see what the system looked like at a given datetime, e.g. September 3rd at 11:42am.

    My design concept to achieve this in Neo4j is the following:
    • Each node that has a history has a relationship of type “was”. The “was” relationship points to a node that contains the difference between the current values and the values prior to the “was” relationship's modDate. The “was” node has no relationships other than the one to the node that it became.
    • Relationships will contain a createDate and a deleteDate.
    Fig 1. Small example graph

    To traverse the graph as it is currently, you use the top nodes and only follow relationships that don't have a deleteDate. If you want to see the graph at a given datetime, you follow relationships where the createDate is before the given date, and the deleteDate is after the given date, or does not yet exist. To get the valid properties of a node, you follow the was relationships down overwriting properties as you go until the modDate is before the given date.
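
A minimal in-memory sketch of the two rules above, in plain Java rather than Neo4j or Spring Data Graph code. The names (`TemporalGraphSketch`, `Rel`, `Was`, `existsAt`, `propertiesAt`) are made up for illustration. It assumes that a relationship created exactly at the requested instant counts as existing, and that the "was" chain is ordered newest first.

```java
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

/** In-memory stand-ins for the graph; none of these types are Neo4j or SDG classes. */
final class TemporalGraphSketch {

    /** A relationship carrying the createDate/deleteDate pair. */
    static final class Rel {
        final String type;
        final Instant createDate;
        final Instant deleteDate; // null means "not deleted yet"

        Rel(String type, Instant createDate, Instant deleteDate) {
            this.type = type;
            this.createDate = createDate;
            this.deleteDate = deleteDate;
        }

        /** True if this relationship existed at the given instant. */
        boolean existsAt(Instant when) {
            boolean created = !createDate.isAfter(when); // assumption: inclusive start
            boolean notYetDeleted = deleteDate == null || deleteDate.isAfter(when);
            return created && notYetDeleted;
        }
    }

    /** One link in the "was" chain: the values that were replaced at modDate. */
    static final class Was {
        final Instant modDate;
        final Map<String, Object> oldValues;
        final Was next; // older history, or null at the end of the chain

        Was(Instant modDate, Map<String, Object> oldValues, Was next) {
            this.modDate = modDate;
            this.oldValues = oldValues;
            this.next = next;
        }
    }

    /**
     * Rebuilds the properties as of 'when' by starting from the current values
     * and overwriting them with older values while the change happened after 'when'.
     */
    static Map<String, Object> propertiesAt(Map<String, Object> current, Was newest, Instant when) {
        Map<String, Object> result = new HashMap<>(current);
        for (Was w = newest; w != null && w.modDate.isAfter(when); w = w.next) {
            result.putAll(w.oldValues);
        }
        return result;
    }
}
```

This only restores values that were changed; how to represent a property that did not exist at the earlier time is left open here.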

    I'm looking at implementing this in Spring Data Graph. I was thinking about creating annotations along the lines of @NodeHistoryEntity, and giving it a property annotation like @IgnoreHistory for properties that shouldn't be historically stored.

    Looking for thoughts / suggestions before I spend a ton of time diving into the Spring Data Graph code.

    Thanks in advance, Tye.

  • #2

    That's a very interesting approach. I would love to support you with that.

    Could you outline the operations and lifecycle that you'd like to have covered in SDG?

    Parts of it could be handled by having a TransactionEventHandler that takes care of node deletion, update or creation.

    Right now I don't wrap that in SDG. I'm working in a separate branch on a session concept.
    Within the Event Handler you could access the EntitySession to get the entities for the nodes that have been updated or created.
    You can check your annotations @NodeHistoryEntity and @IgnoreHistory in that handler (and probably the traversals too).

    To create relationships on the fly you should use entity.relateTo().

    For your traversals you could use either the Cypher querying facilities or appropriate traversals that would, for instance, return the subgraph you're interested in.

    It would be best if you created a small proof-of-concept project and shared it on GitHub so that we could both look at the code.




    • #3
      Thank you for the quick reply,

      Any support would be great as I'm very new to Spring Data Graph.
      I was actually kind of hoping someone would reply with "You're an idiot, just use this simple thing and you're done". But life is never so simple I suppose.

      Here is a start to what I think the life cycle may look like, and some supporting operations.

      1. Node creation – Entity.persist()
        Create a node with the properties of the entity, also include a createDate.

        Note that best practice for creating entities would be something like the following, so that the initial first name and last name are stored when the node is first created rather than being recorded as history changes.
        User u = new User();  // set firstName, lastName, etc., then call u.persist()
        //as opposed to
        User u = new User().persist();  // every setter called after this would record a history change
      2. Update a property – Entity.setProperty(Property new)
        When a property is set, a new node should be created. This node will contain the old key-value of the property. If there already exists a “was” relationship on the current entity, that relationship will be transferred over to the new node. The current node will be given a new “was” relationship to the new node, with the modDate of the relationship being set to now().

        A new operation would be very useful here. Something to allow for batch property updating within a single transaction to prevent node histories from becoming unwieldy. I'm not sure of the best way to approach this, perhaps something like:

        User u = userRepository.findOne(23);
        u.autoCommit = false;  //this will temporarily break the synchronized behaviour.
        u.autoCommit = true;  //this will re-enable the auto synchronization and call persist() to re-attach
      3. Create a relationship
        Creating a relationship will create a relationship with a createDate property.

        Note: Relationships that have properties (other than create/delete dates) will have to be treated very similarly to how historical nodes are. That's to say, they may require a “was” relation to nodes that contain their previous properties.

      4. Removing an Entity
        Removing an entity adds a deleteDate to the properties of the entity, and adds a deleteDate to all historical relationships the entity has that don't already contain a deleteDate.

      5. Removing a relationship
        Add a deleteDate to the relationship
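
The property-update and removal steps above can be sketched in plain Java as follows. This is an in-memory model, not Spring Data Graph code: `HistoricalNodeSketch`, `HistoryNode`, and `Rel` are hypothetical names, and the modDate of each "was" relationship is stored on the history node it points to, purely to keep the example short.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** In-memory model of a node with a "was" chain; not an SDG entity. */
final class HistoricalNodeSketch {

    /** Holds the old key-value of one change. */
    static final class HistoryNode {
        final Map<String, Object> oldValues = new HashMap<>();
        Instant modDate;  // modDate of the "was" relationship pointing at this node
        HistoryNode was;  // older history, taken over from the current node
    }

    /** A relationship with create/delete dates only. */
    static final class Rel {
        final Instant createDate;
        Instant deleteDate; // null until the relationship is removed

        Rel(Instant createDate) {
            this.createDate = createDate;
        }
    }

    final Map<String, Object> properties = new HashMap<>();
    final List<Rel> relationships = new ArrayList<>();
    final Instant createDate;
    Instant deleteDate;
    HistoryNode was;

    /** Step 1: create the node with all initial properties and a createDate. */
    HistoricalNodeSketch(Map<String, Object> initial, Instant now) {
        properties.putAll(initial);
        createDate = now;
    }

    /** Step 2: store the old value in a new history node and splice it into the chain. */
    void setProperty(String key, Object newValue, Instant now) {
        HistoryNode h = new HistoryNode();
        h.oldValues.put(key, properties.get(key)); // null if the property did not exist
        h.modDate = now;
        h.was = this.was; // the existing "was" relationship moves to the new node
        this.was = h;     // the current node now points at the new node
        properties.put(key, newValue);
    }

    /** Step 4: mark the node and its still-live relationships as deleted. */
    void remove(Instant now) {
        deleteDate = now;
        for (Rel r : relationships) {
            if (r.deleteDate == null) {
                r.deleteDate = now;
            }
        }
    }
}
```

Each call to `setProperty` adds one history node here, which is exactly why some form of batching would help keep the chain short.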


      @Historical - //Add to a class with @NodeEntity or @RelationshipEntity to store a history along with the node.

      By default all properties will have a history saved. This can be disabled by adding
      @GraphProperty(history = false)

      Will have to add a property to @RelatedTo to allow for the automatic creation of historical nodes.
      This should default to true on NodeEntities that are annotated with @Historical and can be disabled via
      @RelatedTo(history = false)

      An interface called GraphHistoryRepository would need to be designed to implement a way to get historical nodes given dates.

      I'm sure there is a lot more to think out, this is just a start.
      Let me know what you think is a good idea, and what you think is a terrible idea.

      Thanks again, Tye.


      • #4
        I think the best way to do this is to manually play through some use cases and see if you've covered all the issues. Try to draw them and check that you are not missing corner cases.

        Normally in a transactional setting I would only update historic node information at the end of the transaction. (That should cover your autoCommit feature.)
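
A rough sketch of that idea in plain Java, without any Neo4j or SDG transaction API: changes made during a "transaction" only remember the value from before it started, and a single history entry is written at commit time. `PendingHistorySketch`, `set`, and `commit` are hypothetical names, and the history is kept as a simple newest-first list instead of "was" nodes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Collects property changes and writes one history entry per commit. */
final class PendingHistorySketch {

    final Map<String, Object> properties = new HashMap<>();
    final List<Map<String, Object>> history = new ArrayList<>(); // newest first
    private final Map<String, Object> pendingOldValues = new HashMap<>();

    PendingHistorySketch(Map<String, Object> initial) {
        properties.putAll(initial);
    }

    /** Updates a property; only the pre-transaction value is remembered. */
    void set(String key, Object value) {
        if (!pendingOldValues.containsKey(key)) {
            pendingOldValues.put(key, properties.get(key));
        }
        properties.put(key, value);
    }

    /** Called once at the end of the transaction; returns true if history was written. */
    boolean commit() {
        if (pendingOldValues.isEmpty()) {
            return false;
        }
        history.add(0, new HashMap<>(pendingOldValues));
        pendingOldValues.clear();
        return true;
    }
}
```

With this shape, several setter calls inside one transaction produce one history entry rather than one per call.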

        Don't add attributes to existing annotations, just create new ones for your purpose.
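
As an illustration of separate annotations, here is a small plain-Java sketch. `@Historical` and `@IgnoreHistory` are the names floated earlier in this thread, not existing SDG annotations, and the `User` class, its `sessionToken` field, and the `trackedFields` helper are invented for the example.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

final class HistoryAnnotationsSketch {

    /** Hypothetical marker: keep a history for entities of this type. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface Historical {}

    /** Hypothetical marker: do not record history for this field. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface IgnoreHistory {}

    /** Example entity; in SDG it would also carry @NodeEntity. */
    @Historical
    static class User {
        String firstName;
        String lastName;
        @IgnoreHistory
        String sessionToken;
    }

    /** Names of the fields whose changes should be recorded for the given type. */
    static List<String> trackedFields(Class<?> type) {
        List<String> names = new ArrayList<>();
        if (!type.isAnnotationPresent(Historical.class)) {
            return names; // not a historical entity, nothing to track
        }
        for (Field f : type.getDeclaredFields()) {
            if (f.isSynthetic() || Modifier.isStatic(f.getModifiers())) {
                continue;
            }
            if (!f.isAnnotationPresent(IgnoreHistory.class)) {
                names.add(f.getName());
            }
        }
        return names;
    }
}
```

A handler or aspect could call something like `trackedFields(entity.getClass())` to decide which property changes to write into the history node.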

        You said the "was" rel will be transferred to the new entity, but then the previous node loses its link in the chain.
        Or do you just have "one" historical node for each current node? Not sure if I understood you correctly.

        Please note, you can't create relationships to relationships. So either you model your historical relationships as nodes as well or you have to find some other way to store historical information in the relationship itself (e.g. in some separate properties).

        Please look into the TransactionEventHandler API.

        What is your use-case for the versioning of the graph?

        It is probably easiest to create a proof of concept using just the neo4j-core API (or neo4j-template) to see if it all works out. After that you can create the infrastructure classes that handle your lifecycle automatically in the SDG Entity universe.




        • #5
          Yeah, there's only one historical node per node.

          I looked at the TransactionEventHandler API, and it does look like the right place to start. I think you're right about doing up a proof of concept, so that's what I'll do next.

          Thanks for the heads up about relationships, I wasn't aware of that. I will have to think something up for those cases then.

          I wasn't planning on implementing any versioning in the graph. Was only thinking of using datetimes, do you see that being a possible issue?

          Thanks again,


          • #6
            Ah, ok, I misunderstood that.

            For relationships you could also create an appropriate history node and link to it from the start node's history node (and perhaps the end node's too).

            No, I think it should be fine. You'll see when you play through the use cases and do the simple PoC.


            • #7
              So I spent Sunday working on a proof of concept. Along the way I was messing with the TransactionEventHandler and realized it might not be the right path to take for a couple of reasons.

              The most notable is that the documentation notes that while it is "possible to perform mutating operations in this method", "Changes made in this method are not guaranteed to be visible by this or other TransactionEventHandlers." Since storing the histories in nodes and changing the "was" relationships around is fairly 'invasive', I thought that instead of TransactionEventHandlers perhaps aspects and inheritance might be the way to go.

              The other issue with using the TransactionEventHandler is that when deleting a node I don't see a way to prevent the transaction from actually deleting the node. This could be solved by creating a new 'copy' of the deleted node and creating new relationships that mirror all the deleted ones. However, this seems a little messy.

              The final issue is that given the circumstance where someone wants to exert Neo4j level control over their node, we don't want to introduce unexpected behavior.

              I think I'm going to try to implement much of the logic in a parent abstract class for the proof of concept. Do you see the above as being non-issues? Do you think inheriting the behavior is a bad direction to go?

              Thanks again for your advice, helps a lot to have someone to ask when hitting things like this.