Failed WriteBatch Operation with py2neo
Problem description
I am trying to find a workaround to the following problem. I have seen it quasi-described in this SO question, yet not really answered.
The following code fails, starting with a fresh graph:
from py2neo import neo4j

def add_test_nodes():
    # Add a test node manually
    alice = g.get_or_create_indexed_node("Users", "user_id", 12345, {"user_id":12345})

def do_batch(graph):
    # Begin batch write transaction
    batch = neo4j.WriteBatch(graph)
    # get some updated node properties to add
    new_node_data = {"user_id":12345, "name": "Alice"}
    # batch requests
    a = batch.get_or_create_in_index(neo4j.Node, "Users", "user_id", 12345, {})
    batch.set_properties(a, new_node_data) #<-- I'm the problem
    # execute batch requests and clear
    batch.run()
    batch.clear()

if __name__ == '__main__':
    # Initialize Graph DB service and create a Users node index
    g = neo4j.GraphDatabaseService()
    users_idx = g.get_or_create_index(neo4j.Node, "Users")
    # run the test functions
    add_test_nodes()
    alice = g.get_or_create_indexed_node("Users", "user_id", 12345)
    print alice
    do_batch(g)
    # get alice back and assert additional properties were added
    alice = g.get_or_create_indexed_node("Users", "user_id", 12345)
    assert "name" in alice
In short, I wish, in one batch transaction, to update existing indexed node properties. The failure occurs at the batch.set_properties line, because the BatchRequest object returned by the previous line is not being interpreted as a valid node. Though not entirely identical, it feels like I am attempting something like the answer posted here
Some details
>>> import py2neo
>>> py2neo.__version__
'1.6.0'
>>> g = py2neo.neo4j.GraphDatabaseService()
>>> g.neo4j_version
(2, 0, 0, u'M06')
Update
If I split the problem into separate batches, then it can run without error:
def do_batch(graph):
    # Begin batch write transaction
    batch = neo4j.WriteBatch(graph)
    # get some updated node properties to add
    new_node_data = {"user_id":12345, "name": "Alice"}
    # batch request 1
    batch.get_or_create_in_index(neo4j.Node, "Users", "user_id", 12345, {})
    # execute batch request and clear; submit() returns a list of results
    alice = batch.submit()[0]
    batch.clear()
    # batch request 2, now using the real node returned by the first batch
    batch.set_properties(alice, new_node_data)
    # execute batch request and clear
    batch.run()
    batch.clear()
This works for many nodes as well. Though I do not love the idea of splitting the batch up, this might be the only way at the moment. Anyone have some comments on this?
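For several nodes at once, the two-pass approach might be sketched as below. The helper name update_indexed_nodes and its parameters are hypothetical and not part of py2neo. It only assumes that the batch object exposes the WriteBatch methods already used above, and that submit() returns results in the same order as the requests were added. The index key is written back alongside the new properties, on the assumption that set_properties replaces the whole property set.

```python
def update_indexed_nodes(batch, node_cls, index_name, key, updates):
    """Hypothetical helper: update many indexed nodes in two batches.

    `batch` is assumed to behave like a py2neo 1.6 WriteBatch and
    `node_cls` to be neo4j.Node. `updates` maps each index value to
    the properties that should be written to the matching node.
    """
    values = list(updates)

    # Pass 1: get or create every node, then submit to obtain real nodes.
    for value in values:
        batch.get_or_create_in_index(node_cls, index_name, key, value, {key: value})
    nodes = batch.submit()  # assumed to preserve request order
    batch.clear()

    # Pass 2: set properties on the nodes returned by pass 1.
    for node, value in zip(nodes, values):
        properties = dict(updates[value])
        properties[key] = value  # keep the indexed key on the node
        batch.set_properties(node, properties)
    batch.run()
    batch.clear()
    return nodes
```

With the names from the question, a call would look something like update_indexed_nodes(neo4j.WriteBatch(g), neo4j.Node, "Users", "user_id", {12345: {"name": "Alice"}}). This still costs two round trips, but only two regardless of how many nodes are updated.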
Recommended answer
After reading up on the new features of Neo4j 2.0.0-M06, it seems that the older workflow of node and relationship indexes is being superseded. Neo4j's approach to indexing presently diverges from the legacy one in two respects: labels and schema indexes.
Labels can be arbitrarily attached to nodes and can serve as a reference for an index.
Indexes can be created in Cypher by referencing a label (here, User) and a node property key (screen_name):
CREATE INDEX ON :User(screen_name)
Cypher MERGE
Furthermore, the indexed get_or_create methods are now possible via the new Cypher MERGE clause, which incorporates labels and their indexes quite succinctly:
MERGE (me:User{screen_name:"SunPowered"}) RETURN me
Batch
Queries of this sort can be batched in py2neo by appending a CypherQuery instance to the batch object:
from py2neo import neo4j

graph_db = neo4j.GraphDatabaseService()
cypher_merge_user = neo4j.CypherQuery(graph_db,
    "MERGE (user:User {screen_name:{name}}) RETURN user")

def get_or_create_user(screen_name):
    """Return the user if exists, create one if not"""
    return cypher_merge_user.execute_one(name=screen_name)

def get_or_create_users(screen_names):
    """Apply the get or create user cypher query to many usernames in a
    batch transaction"""
    batch = neo4j.WriteBatch(graph_db)
    for screen_name in screen_names:
        batch.append_cypher(cypher_merge_user, params=dict(name=screen_name))
    return batch.submit()

root = get_or_create_user("Root")
users = get_or_create_users(["alice", "bob", "charlie"])
Limitation
There is a limitation, however: the results of a Cypher query in a batch transaction cannot be referenced later in the same transaction. The original question was about updating a collection of indexed user properties in one batch transaction. This is still not possible, as far as I can tell. For example, the following snippet throws an error:
batch = neo4j.WriteBatch(graph_db)
b1 = batch.append_cypher(cypher_merge_user, params=dict(name="Alice"))
batch.set_properties(b1, dict(last_name="Smith"))
resp = batch.submit()
So, although implementing get_or_create over a labelled node using py2neo carries a bit less overhead now that the legacy indexes are no longer necessary, the original question still needs 2 separate batch transactions to complete.