Creating a metabolic pathway in Neo4j
Question
I am attempting to create the glycolytic pathway shown in the image at the bottom of this question, in Neo4j, using these data:
glycolysis_bioentities.csv
name
α-D-glucose
glucose 6-phosphate
fructose 6-phosphate
"fructose 1,6-bisphosphate"
dihydroxyacetone phosphate
D-glyceraldehyde 3-phosphate
"1,3-bisphosphoglycerate"
3-phosphoglycerate
2-phosphoglycerate
phosphoenolpyruvate
pyruvate
hexokinase
glucose-6-phosphatase
phosphoglucose isomerase
phosphofructokinase
"fructose-bisphosphate aldolase, class I"
triosephosphate isomerase (TIM)
glyceraldehyde-3-phosphate dehydrogenase
phosphoglycerate kinase
phosphoglycerate mutase
enolase
pyruvate kinase
glycolysis_relations.csv
source,relation,target
α-D-glucose,substrate_of,hexokinase
hexokinase,yields,glucose 6-phosphate
glucose 6-phosphate,substrate_of,glucose-6-phosphatase
glucose-6-phosphatase,yields,α-D-glucose
glucose 6-phosphate,substrate_of,phosphoglucose isomerase
phosphoglucose isomerase,yields,fructose 6-phosphate
fructose 6-phosphate,substrate_of,phosphofructokinase
phosphofructokinase,yields,"fructose 1,6-bisphosphate"
"fructose 1,6-bisphosphate",substrate_of,"fructose-bisphosphate aldolase, class I"
"fructose-bisphosphate aldolase, class I",yields,D-glyceraldehyde 3-phosphate
D-glyceraldehyde 3-phosphate,substrate_of,glyceraldehyde-3-phosphate dehydrogenase
D-glyceraldehyde 3-phosphate,substrate_of,triosephosphate isomerase (TIM)
triosephosphate isomerase (TIM),yields,dihydroxyacetone phosphate
glyceraldehyde-3-phosphate dehydrogenase,yields,"1,3-bisphosphoglycerate"
"1,3-bisphosphoglycerate",substrate_of,phosphoglycerate kinase
phosphoglycerate kinase,yields,3-phosphoglycerate
3-phosphoglycerate,substrate_of,phosphoglycerate mutase
phosphoglycerate mutase,yields,2-phosphoglycerate
2-phosphoglycerate,substrate_of,enolase
enolase,yields,phosphoenolpyruvate
phosphoenolpyruvate,substrate_of,pyruvate kinase
pyruvate kinase,yields,pyruvate
This is what I have, thus far,
... using this Cypher code (passed to Cycli or cypher-shell):
LOAD CSV WITH HEADERS FROM "file:/glycolysis_relations.csv" AS row
MERGE (s:Glycolysis {source: row.source})
MERGE (r:Glycolysis {relation: row.relation})
MERGE (t:Glycolysis {target: row.target})
FOREACH (x in case row.relation when "substrate_of" then [1] else [] end |
MERGE (s)-[r:substrate_of]->(t)
)
FOREACH (x in case row.relation when "yields" then [1] else [] end |
MERGE (s)-[r:yields]->(t)
);
I'd like to create the fully-connected pathway, with captions on all the nodes. Suggestions?
Recommended answer
[UPDATED]
There are multiple issues and possible improvements:
- The second MERGE should be deleted, since it creates orphaned nodes. A relationship type should not be turned into a Glycolysis node, and such nodes would never be connected to any other nodes.
- The 1st and 3rd MERGE clauses must use the same property name (say, name) for source and target nodes, or else the same chemical can end up with 2 nodes (with different property keys). This is why you ended up with nodes that did not have all the expected connections.
- The APOC procedure apoc.cypher.doIt can be used to simplify somewhat the MERGE of relationships with dynamic names.
- The glycolysis_bioentities.csv file is not needed for this use case.
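To make the duplicate-node problem concrete, a query along the following lines could list the entities that were split in two. This is only a sketch, and it assumes the graph was loaded with the question's original property keys (source and target) on Glycolysis nodes:

```cypher
// Assumption: nodes were created by the question's original load, so an
// entity may exist once with a `source` property and again with a `target`
// property. List every name that appears on both kinds of node.
MATCH (a:Glycolysis)
WHERE a.source IS NOT NULL
MATCH (b:Glycolysis {target: a.source})
RETURN a.source AS duplicated_name
ORDER BY duplicated_name;
```

Any name returned here is stored as two separate nodes, so a relationship attached to one copy never reaches the other.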
With the above changes, you end up with something like this, which will generate a connected graph that matches your input data:
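// Source and target share the same `name` key, so each entity
// maps to exactly one node.
LOAD CSV WITH HEADERS FROM "file:/glycolysis_relations.csv" AS row
MERGE (s:Glycolysis {name: row.source})
MERGE (t:Glycolysis {name: row.target})
WITH s, t, row
// Build the relationship type dynamically from the `relation` column.
CALL apoc.cypher.doIt(
  'MERGE (s)-[r:' + row.relation + ']->(t)',
  {s:s, t:t}) YIELD value
RETURN 1;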
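One way to check that the result is connected is to trace the route from glucose to pyruvate. The query below is a sketch that assumes the load above completed and that node names match the CSV exactly; it follows any mix of substrate_of and yields relationships:

```cypher
// Walk from α-D-glucose to pyruvate over substrate_of / yields
// relationships and list the node names along each path found.
MATCH p = (:Glycolysis {name: 'α-D-glucose'})
          -[:substrate_of|yields*]->
          (:Glycolysis {name: 'pyruvate'})
RETURN [n IN nodes(p) | n.name] AS steps;
```

If no rows come back, the chain is broken somewhere, most likely because a name in the source column does not exactly match its counterpart in the target column.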