Neo4j auto-index, legacy index and label schema: differences for a relative-to-a-node full-text search


Problem description


    This question is partially answered in neo4j-legacy-indexes-and-auto-index-vs-new-label-bases-schema-indexes and the-difference-between-legacy-indexing-auto-indexing-and-the-new-indexing-approach.

    I can't comment on them yet, so I am writing a new thread here. In my db, I have a legacy index 'topic' and a label 'Topic'.

    I know that:

    • a. pattern MATCH (n:Label) will scan the nodes;
    • b. pattern START (n:Index) will search on legacy index
    • c. auto-index is a sort of legacy index and should gimme same results as (b) but it does not in my case
    • d. START clause should be replaced by MATCH for "good practices".

    I get inconsistent results between a. and b. (see below), and I cannot figure out the proper MATCH syntax for searching on indexes instead of labels.

    Here are some examples:

    1#

    start n=node:topic('name:(keyword1 AND keyword2)') return n

    6 rows, 3ms

    start n=node:node_auto_index('name:(keyword1 AND keyword2)') return n;
    

    0 rows

    MATCH (n:Topic) where n.name =~ '(?i).*keyword1*.AND.*keyword2*.' return n;
    

    0 rows, 10K ms

    2#

    start n=node:topic('name:(keyword1)') return n
    

    212 rows, 122 ms [all coherent results containing substring keyword1]

    start n=node:node_auto_index('name:(keyword1)') return n
    

    0 rows

    MATCH (n:Topic) where n.name =~ '(?i).*keyword1*.'return n
    

    835 rows, 8K ms [also results not coherent, containing substring eyword]

    MATCH (n:Topic) where n.name =~ 'keyword1' return n;
    

    1 row, >6K ms [exact match]

    MATCH (n:topic) where n.name =~ 'keyword1' return n;
    

    no results (here I used an index 'topic' not a label 'Topic'!)

    MATCH (node:topic) where node.name =~ 'keyword1' return node;
    

    no results (attempt to use node "object" directly, as in auto-index syntax)

    Could you help shed some light:

    • What's the difference between a legacy index and an auto-index, and why are the results inconsistent between the two?

    • How do I use the MATCH clause with indexes rather than labels? I want to reproduce the results of the full-text search.

    • Which syntax should I use for a full-text search applied ONLY to the neighbors of a node, not the full db? MATCH? START clause? Legacy index? Label? I am confused.

    Solution

    The auto index (there is only one) is a manual (aka legacy) index having the name node_auto_index. This special index tracks changes to the graph by hooking into the transaction processing. So if you declared name as part of your auto index for nodes in the config, any change to a node having a name property is reflected to that index.

    Note that auto indexes do not automatically populate on an existing dataset when you add e.g. a new property for auto indexing.
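    One commonly used workaround for existing data is to rewrite each property with its current value, so that the write passes through the transaction hook and the node lands in node_auto_index. The following is a minimal sketch, not a confirmed recipe for every version: it assumes Neo4j 2.x, that node_auto_indexing=true and node_keys_indexable=name are already set in the server configuration, and that the nodes of interest carry the label Topic.

    ```cypher
    // Assumption: auto indexing for the 'name' property is already enabled in the config.
    // Re-setting the property to its own value counts as a change to the node,
    // which should let the auto index pick up nodes created before it was configured.
    MATCH (n:Topic)
    WHERE has(n.name)        // has() is the 2.x form; later versions use exists()
    SET n.name = n.name
    RETURN count(n);
    ```

    On a large graph this would probably need to be run in batches rather than in one transaction.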

    Note further that manual or auto indexes are totally independent of labels.

    The only way to query a manual or auto index is by using the START clause:

    START n=node:<indexName>(<lucene query expression>) // index query
    START n=node:<indexName>(key='<value>') // exact index lookup
    

    Schema indexes are completely different and are used in MATCH when appropriate.

    A blog post of mine covers all the index capabilities of neo4j.

    In general you use an index in graph databases to identify the start points for traversals. Once you've got a reference inside the graph, you just follow relationships and no longer do index lookups.
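    Applied to the "neighbors only" part of the question, that idea can be sketched by combining a START index lookup with a MATCH pattern. This is an illustration under assumptions: the anchor node (a Topic named 'DNA') is hypothetical, and the Lucene lookup still runs against the whole 'topic' index; the pattern only filters the hits afterwards.

    ```cypher
    // Step 1: full-text lookup in the legacy index 'topic' (Lucene query syntax).
    START hit = node:topic('name:(keyword1 AND keyword2)')
    // Step 2: keep only the hits that are directly connected to the anchor node.
    // The anchor (:Topic {name: 'DNA'}) is a hypothetical example node.
    MATCH (anchor:Topic {name: 'DNA'})--(hit)
    RETURN hit;
    ```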

    For full text indexing, see another blog post.

    Updates based on comments below

    In fact, MATCH (p:Topic {name: 'DNA'}) RETURN p and MATCH (n:Topic) where n.name = 'DNA' return n are equivalent. Both result in the same query plan. If there is a schema index on label Topic and property name (created by CREATE INDEX ON :Topic(name)), Cypher will implicitly use the schema index to find the specified node(s).
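    Laid out as separate statements, the schema-index case looks roughly like this. It is a sketch assuming Neo4j 2.x syntax; the closing comment about regular expressions is an inference from the behaviour described in the question, not a guarantee for every version.

    ```cypher
    // Create the schema index once.
    CREATE INDEX ON :Topic(name);

    // Both forms are exact-match lookups and can be served by the schema index.
    MATCH (p:Topic {name: 'DNA'}) RETURN p;
    MATCH (n:Topic) WHERE n.name = 'DNA' RETURN n;

    // A regular-expression predicate such as n.name =~ '(?i).*dna.*' is not an
    // exact lookup, so it is likely to fall back to scanning all :Topic nodes.
    ```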

    At the moment you cannot use full text searches based on schema indexes. Full text is only available in manual / auto indexing.

    All the examples you've provided with START n=node:topic(...) rely on a manual index. It's your responsibility to keep a manual index in sync with your graph contents, so I assume the differences come from modifications to the graph that were never reflected in the manual index.

    In any case, a query that uses START n=node:topic(....) will never use a schema index.
