Titan Db ignoring index

Question

I have a graph with a couple of indices. They're two composite indices with label constraints (both are defined the same way, just on different properties/labels). One definitely seems to work but the other doesn't. I ran the following profile() to double-check:

One is called KeyOnNode, on property uid and label node:

gremlin> g.V().hasLabel("node").has("uid", "xxxxxxxx").profile().cap(...)
==>Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
TitanGraphStep([~label.eq(node), uid.eq(dammit_...                     1           1           2.565    96.84
  optimization                                                                                 1.383
  backend-query                                                        1                       0.231
SideEffectCapStep([~metrics])                                          1           1           0.083     3.16
                                            >TOTAL                     -           -           2.648        -

The above is perfectly acceptable and works well. I'm assuming the magic line is backend-query.

The other is called NameOnSuperNode, on property name and label supernode:

gremlin> g.V().hasLabel("supernode").has("name", "xxxxxxxx").profile().cap(...)
==>Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
TitanGraphStep([~label.eq(supernode), name.eq(n...                     1           1        5763.163   100.00
  optimization                                                                                 2.261
  scan                                                                                         0.000
SideEffectCapStep([~metrics])                                          1           1           0.073     0.00
                                            >TOTAL                     -           -        5763.236        -

Here the query takes an outrageous amount of time and the profile shows a scan line instead of backend-query. I originally wondered whether the index hadn't been committed through the management system, but the following seems to work just fine:

gremlin> m = graphT.openManagement(); 
==>com.thinkaurelius.titan.graphdb.database.management.ManagementSystem@73c1c105
gremlin> index = m.getGraphIndex("NameOnSuperNode")
==>NameOnSuperNode
gremlin> index.getFieldKeys()
==>name
gremlin> import static com.thinkaurelius.titan.graphdb.types.TypeDefinitionCategory.*
==>null
gremlin> sv = m.getSchemaVertex(index)
==>NameOnSuperNode
gremlin> rel = sv.getRelated(INDEX_SCHEMA_CONSTRAINT, Direction.OUT)
==>com.thinkaurelius.titan.graphdb.types.SchemaSource$Entry@26b2b8e2
gremlin> sse = rel.iterator().next()
==>com.thinkaurelius.titan.graphdb.types.SchemaSource$Entry@2d39a135
gremlin> sse.getSchemaType()
==>supernode

I can't just reset the db at this point. Any help pinpointing what the issue could be would be amazing; I'm hitting a wall here. Is this a sign that I need to reindex?

INFO: Titan DB 1.1 (TP 3.1.1)

Cheers

UPDATE: I've found that the index in question is not in a REGISTERED state:

gremlin> :> m = graphT.openManagement(); index = m.getGraphIndex("NameOnSuperNode"); pkey = index.getFieldKeys()[0]; index.getIndexStatus(pkey)
==>INSTALLED

How do I get it to register? I've tried m.updateIndex(index, SchemaAction.REGISTER_INDEX).get(); m.commit(); graphT.tx().commit(); but it doesn't seem to do anything.

UPDATE 2: I've tried registering the index, in order to then reindex, with the following:

gremlin> m = graphT.openManagement(); 
index = m.getGraphIndex("NameOnSuperNode") ; 
import static com.thinkaurelius.titan.graphdb.types.TypeDefinitionCategory.*; 
import com.thinkaurelius.titan.graphdb.database.management.ManagementSystem; 
m.updateIndex(index, SchemaAction.REGISTER_INDEX).get();
ManagementSystem.awaitGraphIndexStatus(graphT, "NameOnSuperNode").status(SchemaStatus.REGISTERED).timeout(20, java.time.temporal.ChronoUnit.MINUTES).call();
m.commit();
graphT.tx().commit()

But this isn't working. My index is still in the INSTALLED status and the wait still times out. I've checked that there are no open transactions. Does anyone have an idea? FYI, the graph is running on a single server and has ~100k vertices and ~130k edges.

Recommended answer

So there are a few things that can be happening here:

  1. If the two indices you describe were not created in the same transaction (and the problem index was created after the name propertyKey was already defined), then you should issue a reindex, as per the Titan docs:

The name of a graph index must be unique. Graph indexes built against newly defined property keys, i.e. property keys that are defined in the same management transaction as the index, are immediately available. Graph indexes built against property keys that are already in use require the execution of a reindex procedure to ensure that the index contains all previously added elements. Until the reindex procedure has completed, the index will not be available. It is encouraged to define graph indexes in the same transaction as the initial schema.

  2. The index may be timing out while it moves from INSTALLED to REGISTERED, in which case you want to use mgmt.awaitGraphIndexStatus(). You can even specify how long you are willing to wait.

  3. Make sure there are no open transactions on your graph, or the index status will indeed not change, as described here.

  4. This is clearly not the case for you, but there is a bug in Titan (fixed in JanusGraph via this PR): if you create an index against both a newly created propertyKey and a previously used propertyKey, the index gets stuck in the REGISTERED state.

  5. Indexes will not move to REGISTERED unless every Titan/JanusGraph node in the cluster acknowledges the index creation. If your indexes are getting stuck in the INSTALLED state, the other nodes in the system may not be acknowledging the index's existence. This can be due to issues with another server in the cluster, backfill in the messaging queue that Titan/JanusGraph instances use to talk to each other, or, most unexpectedly, the existence of phantom instances. These can occur whenever your server is killed through an abnormal JVM shutdown, e.g. kill -9 on a server that is stuck in stop-the-world garbage collection. If you suspect backfill is the problem, the comments in this class offer good insight into the configuration options that may help. To check for phantom instances, use this function, and then this function to kill them.
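
Pulling the points above together, here is a rough Gremlin console sketch of the order I would try things in. It assumes graphT is the open TitanGraph from the question and reuses the index name NameOnSuperNode. The instance id passed to forceCloseInstance is a placeholder, not a real value, and the 5 minute timeout is arbitrary. The calls are the Titan 1.x management API as I understand it (getOpenInstances, forceCloseInstance, updateIndex, awaitGraphIndexStatus), so check them against your version before relying on this. Note that, unlike the snippet in UPDATE 2, it commits the management transaction before waiting. My understanding is that the status change is only triggered on commit, which may be why the wait there times out.

```groovy
// Gremlin console sketch for Titan 1.x. It needs a live graph, so treat it as a
// template rather than something to paste blindly.
// Assumption: graphT is the open TitanGraph from the question.
import com.thinkaurelius.titan.core.schema.SchemaAction
import com.thinkaurelius.titan.core.schema.SchemaStatus
import com.thinkaurelius.titan.graphdb.database.management.ManagementSystem
import java.time.temporal.ChronoUnit

indexName = "NameOnSuperNode"

// 1. Close any transaction this console thread still holds open.
graphT.tx().rollback()

// 2. List the instances Titan believes are open. The local one is suffixed with
//    "(current)"; an entry for a server that is no longer running is a phantom.
m = graphT.openManagement()
m.getOpenInstances().each { println it }
// Placeholder id: uncomment only for an instance you know is dead.
// m.forceCloseInstance("<phantom-instance-id>")
m.commit()

// 3. Ask for registration, commit, and only then wait for the status change.
m = graphT.openManagement()
m.updateIndex(m.getGraphIndex(indexName), SchemaAction.REGISTER_INDEX).get()
m.commit()
report = ManagementSystem.awaitGraphIndexStatus(graphT, indexName).status(SchemaStatus.REGISTERED).timeout(5, ChronoUnit.MINUTES).call()
println report

// 4. Once the index is REGISTERED, reindex so it covers the existing vertices.
m = graphT.openManagement()
m.updateIndex(m.getGraphIndex(indexName), SchemaAction.REINDEX).get()
m.commit()

// 5. Check the resulting status the same way the question does.
m = graphT.openManagement()
index = m.getGraphIndex(indexName)
println index.getIndexStatus(index.getFieldKeys()[0])
m.rollback()
```

If step 5 still prints REGISTERED rather than ENABLED, an explicit m.updateIndex(index, SchemaAction.ENABLE_INDEX) followed by a commit may be needed. Once the index is ENABLED, the profile() output for the supernode query should show backend-query instead of scan.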
