Spring Data Neo4j Multi Value search indexes

  • Spring Data Neo4j Multi Value search indexes

    I took a stab at a solution to my problem, but it didn't work out, so I'm curious whether I'm bumping into a limitation in Spring Data Neo4j or going about this the wrong way. Suppose I want to model people: each Person object has a name and may also have one or more aliases. I want to configure my indexes so that I can search for any name or alias and find that node (preferably in one search). I was hoping that this would work.

    Code:
    class Person {
        @Indexed(indexName = "PERSON_NAME") private String name;
        @Indexed(indexName = "PERSON_NAME", fieldName = "name") private Set<String> aliases;
        ...
    }
    Then let's say I save an object:
    { name:"Sean Coombs", aliases:["P-diddy", "Puff Daddy"]}

    I was hoping that searching the PERSON_NAME index for "name:Puff Daddy", or for any other name or alias, would return my new node. This does not appear to be the case. Any ideas on how to solve this? Or is this a bug, and should this code do what I expect?

    Additionally, if you tell me I have to use search("name:P-diddy") ?: search("alias:P-diddy"), I could live with that even if it's not ideal. But at this point the alias index doesn't work at all, even if I remove the fieldName attribute, so I'm not sure how to properly configure indexes on multi-value attributes.
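
For reference, the two-lookup fallback described here could in principle be written as a single query string in Lucene's classic query syntax, which supports `field:"phrase"` clauses joined with `OR`. The sketch below is plain Java and only builds that string. `NameQueries`, `quote`, and `nameOrAlias` are hypothetical names, the field names `name` and `alias` are taken from the post, and whether Spring Data Neo4j would accept such a query against one index is an assumption that this thread does not confirm.

```java
// Hypothetical helper for illustration; not part of Spring Data Neo4j or Lucene.
final class NameQueries {
    private NameQueries() {}

    // Wraps a term in double quotes so it is treated as one phrase,
    // escaping backslashes and embedded quotes so the phrase stays intact.
    static String quote(String term) {
        return "\"" + term.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Builds a query of the form: name:"<term>" OR alias:"<term>"
    static String nameOrAlias(String term) {
        String quoted = quote(term);
        return "name:" + quoted + " OR alias:" + quoted;
    }
}
```

Quoting matters for values such as "Puff Daddy", where an unquoted space would otherwise split the value into two separate terms.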

    Any help appreciated,
    Ed

  • #2
    Just throwing out ideas, not that they are right, but I have two:

    1) Create a node type of Name. For Puff Daddy there would be three of them, three different nodes, all related to one node of type Person. Then it is just a traversal-type query. You could even have two different relationship types: NAME for the real name and ALIAS for the others. Then, in the MATCH portion of the Cypher query, connect via both relationships and compare against the search string in the WHERE clause using an OR.

    2) I believe each index has to have a different name. You can't use "PERSON_NAME" for two different indexes. But you can put both properties in the WHERE clause.
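
To make the shape of idea 1 concrete, here is a minimal in-memory sketch in plain Java. It is not Neo4j or Cypher code: a map stands in for the Name nodes, and `NameGraphSketch`, `PersonNode`, `RelType`, `link`, and `find` are all hypothetical names chosen for illustration. Passing both relationship types to `find` plays the role of the OR over NAME and ALIAS.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for the graph; every name here is hypothetical.
final class NameGraphSketch {
    enum RelType { NAME, ALIAS }

    // Stands in for a Person node, identified only by an id.
    static final class PersonNode {
        final long id;
        PersonNode(long id) { this.id = id; }
    }

    // Stands in for a relationship from a Name node to its Person node.
    private static final class Link {
        final RelType type;
        final PersonNode person;
        Link(RelType type, PersonNode person) {
            this.type = type;
            this.person = person;
        }
    }

    // One entry per Name node value, holding its relationships.
    private final Map<String, List<Link>> linksByName = new HashMap<String, List<Link>>();

    // Connects a Name node with the given value to a Person node.
    void link(String value, RelType type, PersonNode person) {
        List<Link> links = linksByName.get(value);
        if (links == null) {
            links = new ArrayList<Link>();
            linksByName.put(value, links);
        }
        links.add(new Link(type, person));
    }

    // Returns persons reachable from the Name node via any of the given types.
    List<PersonNode> find(String value, RelType... types) {
        List<RelType> allowed = Arrays.asList(types);
        List<PersonNode> result = new ArrayList<PersonNode>();
        List<Link> links = linksByName.get(value);
        if (links != null) {
            for (Link link : links) {
                if (allowed.contains(link.type)) {
                    result.add(link.person);
                }
            }
        }
        return result;
    }
}
```

With "Sean Coombs" linked as NAME and "Puff Daddy" linked as ALIAS to the same person, `find("Puff Daddy", RelType.NAME, RelType.ALIAS)` returns that person, while restricting the call to `RelType.NAME` returns nothing.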

    Mark

    • #3
      Thanks Mark,

      I'm hoping to avoid option 1. I think it would work, but I'm worried that if I have millions of nodes and need to create 5-10 nodes per entity, I'll bump into the scalability limits of Neo4j real fast. Well, at least much faster than I would have otherwise.

      Ed

      • #4
        Ed,

        Sorry for the late reply, I needed to get my head around this properly.

        So, there are actually a couple of issues here:

        1) You cannot shoehorn two fields into the same field name on the same index.
        2) Collections are indexed by their toString value.

        That said, here is some code that illustrates the problem and works around it. For your case, consider creating a special field with both name and aliases in it. On the one hand, that violates modelling purity. But on the other hand, these entities are really DTOs with the special purpose of facilitating database querying. So I think it is acceptable.

        Notice the aliasesAsCollection field: it has a Lucene full-text index on a space-delimited string representation of the set of aliases. If you need stricter matching, you would probably have to add quotes in places.

        Code:
        // Imports assume Spring Data Neo4j 2.x, Neo4j's test kernel, JUnit 4 and Hamcrest.
        import static java.util.Arrays.asList;
        import static org.hamcrest.CoreMatchers.equalTo;
        import static org.hamcrest.CoreMatchers.is;
        import static org.junit.Assert.assertThat;
        
        import java.util.HashSet;
        import java.util.Set;
        
        import org.junit.Before;
        import org.junit.Test;
        import org.junit.runner.RunWith;
        import org.neo4j.graphdb.GraphDatabaseService;
        import org.neo4j.test.ImpermanentGraphDatabase;
        import org.springframework.beans.factory.annotation.Autowired;
        import org.springframework.context.annotation.Bean;
        import org.springframework.context.annotation.Configuration;
        import org.springframework.data.neo4j.annotation.GraphId;
        import org.springframework.data.neo4j.annotation.Indexed;
        import org.springframework.data.neo4j.annotation.NodeEntity;
        import org.springframework.data.neo4j.config.EnableNeo4jRepositories;
        import org.springframework.data.neo4j.config.Neo4jConfiguration;
        import org.springframework.data.neo4j.repository.GraphRepository;
        import org.springframework.data.neo4j.repository.NamedIndexRepository;
        import org.springframework.data.neo4j.support.Neo4jTemplate;
        import org.springframework.data.neo4j.support.index.IndexType;
        import org.springframework.test.context.ContextConfiguration;
        import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;
        import org.springframework.transaction.annotation.Transactional;
        
        interface PersonRepository extends GraphRepository<Person>, NamedIndexRepository<Person> {
        
        }
        
        @NodeEntity
        class Person {
            @GraphId
            Long id;
        
            @Indexed
            String name;
        
            @Indexed
            String[] aliasesAsArray;
        
            @Indexed(indexName = "extra_index", indexType = IndexType.FULLTEXT)
            Set<String> aliasesAsCollection;
        
            Person() {
            }
        
            Person(String name, final Set<String> aliases) {
                this.name = name;
        
                aliasesAsArray = aliases.toArray(new String[aliases.size()]);
        
                aliasesAsCollection = new HashSet<String>(aliases) {
                    @Override
                    public String toString() {
                        StringBuilder stringBuilder = new StringBuilder();
        
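                        // Iterate over this set rather than the constructor argument,
                        // so elements added or removed later are reflected as well.
                        for (String alias : this) {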
                            stringBuilder.append(alias);
                            stringBuilder.append(" ");
                        }
        
                        return stringBuilder.toString();
                    }
                };
            }
        }
        
        @RunWith(SpringJUnit4ClassRunner.class)
        @ContextConfiguration
        @Transactional
        public class IndexTests {
            @Configuration
            @EnableNeo4jRepositories
            static class TestConfig extends Neo4jConfiguration {
                @Bean
                GraphDatabaseService graphDatabaseService() {
                    return new ImpermanentGraphDatabase();
                }
            }
        
            @Autowired
            Neo4jTemplate template;
        
            @Autowired
            GraphDatabaseService graphDatabaseService;
        
            @Autowired
            PersonRepository personRepository;
        
            @Before
            public void before() {
                personRepository.save(new Person("Shawn Corey Carter", new HashSet<String>(asList("Jay-Z"))));
                personRepository.save(new Person("Sean Coombs", new HashSet<String>(asList("P-diddy", "Puff Daddy"))));
                personRepository.save(new Person("Curtis James Jackson", new HashSet<String>(asList("50 Cent"))));
            }
        
            @Test
            public void shouldFindByMemberOfIndexedArray() throws Exception {
                assertThat(personRepository.findAllByQuery("aliasesAsArray", "P-diddy").single().name, is(equalTo("Sean Coombs")));
                assertThat(personRepository.findAllByQuery("aliasesAsArray", "\"Puff Daddy\"").single().name, is(equalTo("Sean Coombs")));
            }
        
            @Test
            public void shouldFindByMemberOfIndexedCollection() throws Exception {
                assertThat(personRepository.findAllByQuery("extra_index", "aliasesAsCollection", "P-diddy").single().name, is(equalTo("Sean Coombs")));
                assertThat(personRepository.findAllByQuery("extra_index", "aliasesAsCollection", "Puff Daddy").single().name, is(equalTo("Sean Coombs")));
                assertThat(personRepository.findAllByQuery("extra_index", "aliasesAsCollection", "Puff").single().name, is(equalTo("Sean Coombs")));
                assertThat(personRepository.findAllByQuery("extra_index", "aliasesAsCollection", "Daddy").single().name, is(equalTo("Sean Coombs")));
            }
        }
        Hope that helps. I'll raise a JIRA issue for the collection indexing problem.
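
To see the toString point in isolation, here is a small plain-Java sketch with no Spring Data Neo4j involved. It contrasts the default `toString()` of a `Set`, which yields one bracketed, comma-separated string, with a space-delimited string that also includes the name, along the lines of the combined field suggested above. `SearchableNames`, `defaultForm`, and `join` are hypothetical names used only for illustration.

```java
import java.util.Set;

// Hypothetical helper for illustration; not part of Spring Data Neo4j.
final class SearchableNames {
    private SearchableNames() {}

    // What a collection looks like when represented by its default toString().
    static String defaultForm(Set<String> aliases) {
        return aliases.toString();
    }

    // Joins the name and every alias into one space-delimited string,
    // suitable as the value of a single full-text searchable field.
    static String join(String name, Set<String> aliases) {
        StringBuilder builder = new StringBuilder(name);
        for (String alias : aliases) {
            builder.append(' ').append(alias);
        }
        return builder.toString();
    }
}
```

For a `LinkedHashSet` holding "P-diddy" and "Puff Daddy" in that order, `defaultForm` gives `[P-diddy, Puff Daddy]`, while `join("Sean Coombs", aliases)` gives `Sean Coombs P-diddy Puff Daddy`. A `HashSet` would produce the same elements but without a guaranteed order.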

        Regards,

        Lasse

        • #5
          FYI:

          https://jira.springsource.org/browse/DATAGRAPH-290

          Lasse
