SET label based on data within LOAD CSV

Question

I'm using Neo4j 2.2.0 and importing data (a nodes file and a relationships file) via LOAD CSV.

The nodes will all be imported under the "Person" label; however, I want to add the "Geotag" label to those whose latitude and longitude fields in the nodes file are not empty.

For example, given the nodes file below:

"username","id","latitude","longitude"
"abc123","111111111","33.223","33.223"
"abc456","222222222","",""

I would like to create node "abc123" with the Person and Geotag labels, and node "abc456" with just the Person label because it has no latitude and longitude.

I thought this would be something along the lines of:

LOAD CSV WITH HEADERS FROM "file:/users.csv" AS line 
CREATE (p:Person { username: line.username, id: line.id, latitude: line.latitude, longitude: line.longitude }) 
SET p: (CASE WHEN line.latitude IS NOT NULL THEN GEOTAGGED);

I know I am using both the CASE expression and the SET clause incorrectly, but is it possible to do this while importing the nodes? The file has over 3 million nodes, so labeling on insertion would help: when new nodes are added (usually in batches), we would not have to scan all nodes just to find the new ones.

I've explored other SO questions (How to set relationship type and label in LOAD CSV?, Loading relationships from CSV data into neo4j db, Neo4j Cypher - creating nodes and setting labels with LOAD CSV). They differ from my question in that those OPs are trying to use a field in the file as the label, whereas I am simply trying to decide conditionally which labels to apply based on data in the file.

Thanks!

In response to an answer, I am trying the following:

LOAD CSV WITH HEADERS FROM "file:/users.csv" AS line
CREATE (p:Person { username: line.username, id: line.id, latitude: line.latitude, longitude: line.longitude }) 
CASE WHEN line.latitude IS NOT NULL THEN [1] ELSE [] END AS geotagged 
FOREACH (x IN geotagged | SET p:Geotag); 

I get the following error:

QueryExecutionKernelException: Invalid input 'A': expected 'r/R' (line 3, column 2 (offset: 454)) "CASE WHEN line.latitude IS NOT NULL THEN [1] ELSE [] END AS geotagged"

The caret is under the 'A' in "CASE".

Below is the complete solution, inspired by and only slightly different from David's solution.

LOAD CSV WITH HEADERS FROM "file:/users.csv" AS line
CREATE (p:Person { username: line.username, id: line.id, latitude: line.latitude, longitude: line.longitude }) 
WITH p, CASE WHEN line.latitude <> "" THEN [1] ELSE [] END AS geotagged 
FOREACH (x IN geotagged | SET p:Geotag); 

Recommended answer

You are close. You cannot put conditional logic in the SET label statement. Instead, build a collection that holds a single element when the lon/lat value is not null and is empty otherwise. Then iterate through that collection with FOREACH and set the label inside the loop.

...
case when line.latitude IS NOT NULL then [1] else [] end as geotagged
foreach(x in geotagged | set p:Geotag)
...
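
For reference, the fragment above can be placed in a full statement roughly as follows. This is a minimal sketch, not a tested import: it reuses the file and property names from the question, and it assumes a missing coordinate arrives as either null or an empty string. Checking both latitude and longitude is an addition here, not something the answer requires.

```cypher
// Sketch only: same file and property names as in the question.
LOAD CSV WITH HEADERS FROM "file:/users.csv" AS line
CREATE (p:Person { username: line.username, id: line.id, latitude: line.latitude, longitude: line.longitude })
// WITH carries p forward and binds the CASE result to a name;
// a bare CASE expression cannot stand as a clause on its own.
WITH p,
     CASE
       WHEN line.latitude <> "" AND line.longitude <> "" THEN [1]
       ELSE []
     END AS geotagged
// FOREACH runs its body once per element: once for [1], never for [].
FOREACH (x IN geotagged | SET p:Geotag);
```

Comparing against "" also covers the null case, because a comparison with null evaluates to null and CASE then falls through to ELSE. The label is set in the same pass that creates the node, so later batches do not require a scan over existing Person nodes.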
