How to Create Nodes in RNeo4j using Vectors or Dataframes

Question

The popular graph database Neo4j can be used within R thanks to the package/driver RNeo4j (https://github.com/nicolewhite/Rneo4j).

The package author, @NicoleWhite, provides several great examples of its usage on GitHub.

Unfortunately for me, the examples given by @NicoleWhite and the documentation are a bit oversimplistic, in that they manually create each graph node and its associated labels and properties, such as:

mugshots = createNode(graph, "Bar", name = "Mugshots", location = "Downtown")
parlor = createNode(graph, "Bar", name = "The Parlor", location = "Hyde Park")
nicole = createNode(graph, name = "Nicole", status = "Student")
addLabel(nicole, "Person")

That's all good and fine when you're dealing with a tiny example dataset, but this approach isn't feasible for something like a large social graph with thousands of users, where each user is a node (such graphs might not utilize every node in every query, but they still need to be input to Neo4j).

I'm trying to figure out how to do this using vectors or dataframes. Is there a solution, perhaps involving an apply statement or a for loop?

This basic attempt:

for (i in 1:length(df$user_id)){
paste(df$user_id[i]) = createNode(graph, "user", name = df$name[i], email = df$email[i])
}

Fails with: Error: 400 Bad Request

Recommended answer

As a first attempt, you should look at the functionality I just added for the transactional endpoint:

http://nicolewhite.github.io/RNeo4j/docs/transactions.html

library(RNeo4j)

graph = startGraph("http://localhost:7474/db/data/")
clear(graph)

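# Keep the string columns as character vectors (factors were the default before R 4.0)
data = data.frame(Origin = c("SFO", "AUS", "MCI"),
                  FlightNum = c(1, 2, 3),
                  Destination = c("PDX", "MCI", "LGA"),
                  stringsAsFactors = FALSE)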


query = "
MERGE (origin:Airport {name:{origin_name}})
MERGE (destination:Airport {name:{dest_name}})
CREATE (origin)<-[:ORIGIN]-(:Flight {number:{flight_num}})-[:DESTINATION]->(destination)
"

t = newTransaction(graph)

for (i in seq_len(nrow(data))) {
  # Use the full column name instead of relying on partial matching of $Dest
  origin_name = data[i, ]$Origin
  dest_name = data[i, ]$Destination
  flight_num = data[i, ]$FlightNum

  appendCypher(t, 
               query, 
               origin_name = origin_name, 
               dest_name = dest_name, 
               flight_num = flight_num)
}

commit(t)

cypher(graph, "MATCH (o:Airport)<-[:ORIGIN]-(f:Flight)-[:DESTINATION]->(d:Airport)
               RETURN o.name, f.number, d.name")

Here, I form a Cypher query, then loop through the data frame and pass each row's values as parameters to that query. Your current attempt will be slow, because you're sending a separate HTTP request for each node created. By using the transactional endpoint, you can create many nodes and relationships under a single transaction. If your data frame is very large, I would split it up into roughly 1000 rows per transaction.
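The batching suggested above could look roughly like the sketch below. The helper names chunk_rows and load_in_batches are hypothetical, not part of RNeo4j; the sketch assumes the RNeo4j package is installed, that graph comes from startGraph(), and that the data frame has the Origin, FlightNum and Destination columns from the example.

```r
# Split row indices 1..n into consecutive batches of at most `size` rows.
chunk_rows <- function(n, size = 1000) {
  if (n == 0) return(list())
  unname(split(seq_len(n), ceiling(seq_len(n) / size)))
}

# Hypothetical helper: append `query` once per row and commit one
# transaction per batch, instead of one transaction for the whole frame.
load_in_batches <- function(graph, data, query, size = 1000) {
  for (rows in chunk_rows(nrow(data), size)) {
    t <- RNeo4j::newTransaction(graph)
    for (i in rows) {
      RNeo4j::appendCypher(t, query,
                           origin_name = as.character(data$Origin[i]),
                           dest_name   = as.character(data$Destination[i]),
                           flight_num  = data$FlightNum[i])
    }
    RNeo4j::commit(t)
  }
  invisible(NULL)
}
```

With the objects defined earlier, a call such as load_in_batches(graph, data, query) would replace the single newTransaction / commit pair. The batch size of 1000 is only the rough figure mentioned above and may need tuning.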

As a second attempt, you should consider using LOAD CSV in the neo4j-shell.
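As a rough sketch of that route, the data frame can be written to a CSV file from R and then loaded with a Cypher statement pasted into neo4j-shell. The file name flights.csv, the temporary directory, and the file:/// URL are assumptions; where the server looks for the file depends on its configuration. toInteger() is the name in recent Neo4j versions (older ones use toInt()).

```r
# Same example data as above, kept as character columns.
flights <- data.frame(Origin = c("SFO", "AUS", "MCI"),
                      FlightNum = c(1, 2, 3),
                      Destination = c("PDX", "MCI", "LGA"),
                      stringsAsFactors = FALSE)

# Write the data frame to a CSV file with a header row.
# The path is an assumption; move the file to wherever Neo4j can read it.
csv_path <- file.path(tempdir(), "flights.csv")
write.csv(flights, csv_path, row.names = FALSE)

# Cypher statement to paste into neo4j-shell. It mirrors the parameterised
# query above, but reads the values from each CSV row instead.
load_csv_query <- "
LOAD CSV WITH HEADERS FROM 'file:///flights.csv' AS row
MERGE (origin:Airport {name: row.Origin})
MERGE (destination:Airport {name: row.Destination})
CREATE (origin)<-[:ORIGIN]-(:Flight {number: toInteger(row.FlightNum)})-[:DESTINATION]->(destination);
"
cat(load_csv_query)
```

LOAD CSV reads every field as a string, which is why the flight number is converted explicitly.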
