Neo4j: indexing properties that are longer than 32k with node_auto_index / lucene index
Problem description
I am trying to build a little file and email search engine. I'd like also to use more advanced search queries for the full text search. Hence I am looking at lucene indexes. From what I have seen, there are two approaches - node_auto_index and apoc.index.addNode.
Setting the index up works fine, and indexing nodes with small properties works. When trying to index nodes with properties that are larger than 32k, neo4j fails (and gets into an unusable state).
The error message boils down to:
WARNING: Failed to invoke procedure apoc.index.addNode: Caused by: java.lang.IllegalArgumentException: Document contains at least one immense term in field="text_e" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped. Please correct the analyzer to not produce such terms. The prefix of the first immense term is: '[110, 101, 111, 32, 110, 101, 111, 32, 110, 101, 111, 32, 110, 101, 111, 32, 110, 101, 111, 32, 110, 101, 111, 32, 110, 101, 111, 32, 110, 101]...', original message: bytes can be at most 32766 in length; got 40000
I have checked this on neo4j 3.1.2 and on 3.1.0 with apoc 3.1.0.3.
A much longer description of the problem can be found at https://baach.de/Members/jhb/neo4j-full-text-indexing.
Is there any way to fix this? E.g. have I done anything wrong, or is there something to configure?
Thx a lot!
Neo4j does not support index values that are longer than ~32k because of an underlying Lucene limitation (the error message above gives the exact figure: a single term can be at most 32766 bytes in UTF-8). For some details around that area you can look at: https://github.com/neo4j/neo4j/pull/6213 and https://github.com/neo4j/neo4j/pull/8404. You need to split such longer values into multiple terms.
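As a rough illustration of splitting a long value before indexing, here is a minimal Python sketch. It is not part of Neo4j or APOC: the helper name `split_utf8` and the constant `MAX_TERM_BYTES` are made up for this example, and the only figure taken from the thread is the 32766-byte limit quoted in the error message. It breaks the text at whitespace so that every chunk stays within the limit when encoded as UTF-8.

```python
MAX_TERM_BYTES = 32766  # limit quoted in the Lucene error message above


def split_utf8(text, max_bytes=MAX_TERM_BYTES):
    """Split text into chunks whose UTF-8 encoding is at most max_bytes.

    Hypothetical helper for illustration. Chunks break at whitespace
    (runs of whitespace collapse to one space); a single word that is
    itself too long is cut at character boundaries.
    """
    chunks = []
    current, current_size = "", 0
    for word in text.split():
        word_size = len(word.encode("utf-8"))
        if word_size > max_bytes:
            # Flush what we have, then hard-split the oversized word.
            if current:
                chunks.append(current)
            piece, piece_size = "", 0
            for ch in word:
                ch_size = len(ch.encode("utf-8"))
                if piece and piece_size + ch_size > max_bytes:
                    chunks.append(piece)
                    piece, piece_size = "", 0
                piece += ch
                piece_size += ch_size
            # The last piece stays open so following words can join it.
            current, current_size = piece, piece_size
            continue
        extra = word_size + (1 if current else 0)  # +1 for the joining space
        if current_size + extra > max_bytes:
            chunks.append(current)
            current, current_size = word, word_size
        else:
            current = current + " " + word if current else word
            current_size += extra
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    # A 40000-byte value like the one in the error message.
    value = "neo " * 10000
    parts = split_utf8(value)
    print(len(parts), [len(p.encode("utf-8")) for p in parts])
```

Each chunk could then be stored and indexed as its own property or node. How that maps onto node_auto_index or apoc.index.addNode depends on the data model and is left out here.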