SET label based on data within LOAD CSV

Question

I'm using Neo4j 2.2.0 and importing data (in the form of a nodes file and relationships file) via LOAD CSV.

The nodes will all be imported under the "Person" label; however, I want to add the "Geotag" label to some of them if their latitude and longitude fields in the nodes file are not empty.

So, for example, the below nodes file (ignore the extra line in between rows)

"username","id","latitude","longitude"

"abc123","111111111","33.223","33.223"

"abc456","222222222","",""

I would like to create node "abc123" with the Person and Geotag labels and node abc456 with just the Person label because it doesn't have a latitude and longitude.

I assume it would be something like this:

LOAD CSV WITH HEADERS FROM "file:/users.csv" AS line 
CREATE (p:Person { username: line.username, id: line.id, latitude: line.latitude, longitude: line.longitude }) 
SET p: (CASE WHEN line.latitude IS NOT NULL THEN GEOTAGGED);

I know I am using the CASE statement incorrectly as well as the SET statement, but is this possible to do while importing the nodes? This file has over 3 million nodes in it and it would be helpful to do it upon insertion so that when new nodes get added (usually in batches), we're not exploring all nodes just to get to the new ones.

I've explored other SO questions (How to set relationship type and label in LOAD CSV?, Loading relationships from CSV data into neo4j db, Neo4j Cypher - creating nodes and setting labels with LOAD CSV); however, they differ from my question in that those OPs are trying to use a field in the file as the label, whereas I am simply trying to make a conditional decision on which labels to use based on data in the file.

Thanks!

In response to an answer, I am trying the following:

LOAD CSV WITH HEADERS FROM "file:/users.csv" AS line
CREATE (p:Person { username: line.username, id: line.id, latitude: line.latitude, longitude: line.longitude }) 
CASE WHEN line.latitude IS NOT NULL THEN [1] ELSE [] END AS geotagged 
FOREACH (x IN geotagged | SET p:Geotag); 

I get the following error:

QueryExecutionKernelException: Invalid input 'A': expected 'r/R' (line 3, column 2 (offset: 454)) "CASE WHEN line.latitude IS NOT NULL THEN [1] ELSE [] END AS geotagged"

With the caret under the 'A' in "CASE"

Below is the complete solution, inspired by and only slightly different from David's solution.

LOAD CSV WITH HEADERS FROM "file:/users.csv" AS line
CREATE (p:Person { username: line.username, id: line.id, latitude: line.latitude, longitude: line.longitude }) 
WITH p, CASE WHEN line.latitude <> "" THEN [1] ELSE [] END AS geotagged 
FOREACH (x IN geotagged | SET p:Geotag); 

Recommended answer

You are close. You cannot put conditional logic in the SET label clause. Instead, build a collection that contains a single element when the lon/lat value is not null and is empty otherwise. Then iterate over that collection with FOREACH and set the label inside the loop, so the SET runs once or not at all.

...
case when line.latitude IS NOT NULL then [1] else [] end as geotagged
foreach(x in geotagged | set p:Geotag)
...
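
For reference, here is how the fragment above might look as a complete statement. This is only a sketch: the file path, property names and the Geotag label are taken from the question, and checking both latitude and longitude against null and the empty string is my assumption about what "empty" means in this file (quoted empty fields such as "" typically arrive as empty strings rather than nulls). The WITH clause is what carries p forward and lets the CASE expression be given an alias. A bare CASE directly after CREATE is what triggers the "Invalid input 'A'" error.

```cypher
// Sketch only: path, properties and label follow the question's example.
LOAD CSV WITH HEADERS FROM "file:/users.csv" AS line
CREATE (p:Person { username: line.username, id: line.id, latitude: line.latitude, longitude: line.longitude })
// WITH passes p on and binds the CASE result to an alias.
// Assumption: a row counts as geotagged only if both fields are present and non-empty.
WITH p, CASE
  WHEN line.latitude IS NOT NULL AND line.latitude <> ""
   AND line.longitude IS NOT NULL AND line.longitude <> ""
  THEN [1] ELSE [] END AS geotagged
// FOREACH runs its body once per element: once for [1], never for [].
FOREACH (x IN geotagged | SET p:Geotag);
```

With the sample file from the question, abc123 should end up with both Person and Geotag, and abc456 with Person only.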
