Neo4j how to model a time-versioned graph


Problem description

Part of my graph has the following schema:

The main part of the graph is the domain, which has some persons linked to it. Person has a unique constraint on the email property; this fits nicely because I also have data from other sources.

In my case a person can be an admin, with some devices/calendars linked to him. I get this data from an SQL database, importing a few tables to build the whole picture. I start with a table that has two columns: the admin's email and his user id. This user id is specific to the production database and is not used globally by the other sources, which is why I use email as the global ID for persons. I currently use the following query to import the user id that all the production tables are linked to. I always get the current snapshot of the user settings and info. This query runs 4 times a day:

CALL apoc.load.jdbc(url, import_query) YIELD row
MERGE (p:Person {email: row.email})
SET p.user_id = row.id

Then I import all the data linked to this user id from the other tables.

The problem is that a user in the production database can change his email. With the way I am importing right now, I end up with two persons having the same user_id, and all the devices/calendars are subsequently linked to both persons because they share that user_id. This is not an accurate representation of reality. We also need to capture the connecting/disconnecting of devices to a particular user_id over time, since someone can disconnect a device and lend it to a friend who has a different admin (user_id).

How can I change my graph model (and import query) so that:

  1. Querying who is currently the admin will not require complex queries
  2. Querying who currently has the device connected will not require complex queries
  3. Querying history can be a bit more complex.

Solution

This answer is based on Ian Robinson's post about time-based versioned graphs.

I don't know if this answer covers ALL the requirements of the question, but I believe it can provide some insights.

Also, I'm assuming you are only interested in structural versioning (that is, you are not interested in queries about changes to the domain user's name over time). Finally, I'm using a partial representation of your graph model, but I believe the concepts shown here can be applied to the whole graph.

The initial graph state:

Considering this Cypher to create an initial graph state:

CREATE (admin:Admin)

CREATE (person1:Person {person_id : 1})
CREATE (person2:Person {person_id : 2})
CREATE (person3:Person {person_id : 3})

CREATE (domain1:Domain {domain_id : 1})

CREATE (device1:Device {device_id : 1})

CREATE (person1)-[:ADMIN {from : 0, to : 1000}]->(admin)

CREATE (person1)-[:CONNECTED_DEVICE {from : 0, to : 1000}]->(device1)

CREATE (domain1)-[:MEMBER]->(person1)
CREATE (domain1)-[:MEMBER]->(person2)
CREATE (domain1)-[:MEMBER]->(person3)

Result:

The graph above has 3 person nodes, all members of one domain node. The person node with person_id = 1 is connected to the device with device_id = 1 and is also the current administrator. The from and to properties on the :ADMIN and :CONNECTED_DEVICE relationships manage the history of the graph structure: from is the start point in time and to is the end point.

For simplicity I'm using 0 as the initial time of the graph and 1000 as the end-of-time (EOT) constant. In a real-world graph, the current time in milliseconds can be used for time points, and Long.MAX_VALUE can serve as the EOT constant. A relationship with to = 1000 has no upper bound yet on its period, so it is still current.
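As a rough sketch of the real-world variant described above, a connection could be opened with the built-in timestamp() function (milliseconds since the epoch) as the start point and 9223372036854775807 (the value of Long.MAX_VALUE) as the end-of-time marker. The $person_id and $device_id parameters are hypothetical names used only for illustration.

```cypher
// Assumption: 9223372036854775807 (Long.MAX_VALUE) plays the role of the
// end-of-time constant instead of the simplified value 1000.
// $person_id and $device_id are hypothetical query parameters.
MATCH (p:Person {person_id: $person_id})
MATCH (d:Device {device_id: $device_id})
// timestamp() returns the current time in milliseconds since the epoch
CREATE (p)-[:CONNECTED_DEVICE {from: timestamp(), to: 9223372036854775807}]->(d)
```

With this choice, every query that filters on to : 1000 in the rest of this answer would filter on the larger constant instead.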

Queries:

With this graph, to get the current administrator I can do:

MATCH (person:Person)-[:ADMIN {to:1000}]->(:Admin)
RETURN person

The result will be:

╒═══════════════╕
│"person"       │
╞═══════════════╡
│{"person_id":1}│
└───────────────┘

Given a device, to get the current connected user:

MATCH (:Device {device_id : 1})<-[:CONNECTED_DEVICE {to : 1000}]-(person:Person)
RETURN person

Result:

╒═══════════════╕
│"person"       │
╞═══════════════╡
│{"person_id":1}│
└───────────────┘

Both queries use the end-of-time constant to select the current administrator and the person currently connected to a device.
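The two lookups can also be combined. The following is a minimal sketch, assuming the sample graph and the simplified end-of-time value 1000 used in this answer, that returns the current administrator together with the devices currently connected to that person:

```cypher
// Current administrator: the :ADMIN relationship that is still open (to = 1000)
MATCH (person:Person)-[:ADMIN {to: 1000}]->(:Admin)
// OPTIONAL MATCH keeps the admin in the result even with no connected device
OPTIONAL MATCH (person)-[:CONNECTED_DEVICE {to: 1000}]->(device:Device)
RETURN person, collect(device) AS devices
```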

Query the device connect / disconnect events:

MATCH (device:Device {device_id : 1})<-[r:CONNECTED_DEVICE]-(person:Person)
RETURN person AS person, device AS device, r.from AS from, r.to AS to
ORDER BY r.from

Result:

╒═══════════════╤═══════════════╤══════╤════╕
│"person"       │"device"       │"from"│"to"│
╞═══════════════╪═══════════════╪══════╪════╡
│{"person_id":1}│{"device_id":1}│0     │1000│
└───────────────┴───────────────┴──────┴────┘

The above result shows that person_id = 1 has been connected to device_id = 1 from the beginning until now.

Changing the graph structure

Consider that the current time point is 30. Now person_id = 1 disconnects from device_id = 1 and person_id = 2 connects to it. To represent this structural change, I will run the query below:

// get the currently connected person
MATCH (person1:Person)-[old:CONNECTED_DEVICE {to : 1000}]->(device:Device {device_id : 1})
// get person_id = 2
MATCH (person2:Person {person_id : 2})
// set 30 as the end time of the connection between person_id = 1 and device_id = 1
SET old.to = 30
// set person_id = 2 as the current connected user to device_id = 1
// (from time point 31 to now)
CREATE (person2)-[:CONNECTED_DEVICE {from : 31, to : 1000}]->(device)

The resultant graph will be:

After this structural change, the connection history of device_id = 1 will be:

MATCH (device:Device {device_id : 1})<-[r:CONNECTED_DEVICE]-(person:Person)
RETURN person AS person, device AS device, r.from AS from, r.to AS to
ORDER BY r.from

╒═══════════════╤═══════════════╤══════╤════╕
│"person"       │"device"       │"from"│"to"│
╞═══════════════╪═══════════════╪══════╪════╡
│{"person_id":1}│{"device_id":1}│0     │30  │
├───────────────┼───────────────┼──────┼────┤
│{"person_id":2}│{"device_id":1}│31    │1000│
└───────────────┴───────────────┴──────┴────┘

The above result shows that person_id = 1 was connected to device_id = 1 from time 0 to time 30, and that person_id = 2 is currently connected to device_id = 1.
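The same from and to properties also allow a point-in-time lookup. This is a hedged sketch, not part of the original answer: it assumes both interval bounds are inclusive (as in the 0 to 30 and 31 to 1000 rows above), and $t is a hypothetical query parameter holding the time point of interest.

```cypher
// $t is a hypothetical parameter: the time point to inspect.
// Assumption: from and to are both inclusive bounds of the period.
MATCH (:Device {device_id: 1})<-[r:CONNECTED_DEVICE]-(person:Person)
WHERE r.from <= $t AND r.to >= $t
RETURN person
```

With the history shown above, $t = 20 falls inside the 0 to 30 period, so the query would match person_id = 1.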

Now the current person connected to device_id = 1 is person_id = 2:

MATCH (:Device {device_id : 1})<-[:CONNECTED_DEVICE {to : 1000}]-(person:Person)
RETURN person

╒═══════════════╕
│"person"       │
╞═══════════════╡
│{"person_id":2}│
└───────────────┘

The same approach can be applied to manage the admin history.
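For example, a hand-over of the administrator role from the current admin to person_id = 2 at time point 30 might look like the sketch below. It mirrors the device query above and assumes the same sample graph and the simplified end-of-time value 1000.

```cypher
// find the currently open :ADMIN relationship
MATCH (:Person)-[old:ADMIN {to: 1000}]->(admin:Admin)
// get the person taking over the role
MATCH (next:Person {person_id: 2})
// close the old period at time point 30
SET old.to = 30
// open a new period for the new administrator (from time point 31 to now)
CREATE (next)-[:ADMIN {from: 31, to: 1000}]->(admin)
```

If no :ADMIN relationship is currently open, the first MATCH returns no rows and nothing is created, so the very first administrator would still need a plain CREATE as in the initial graph state.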

Obviously this approach has some downsides:

  • Need to manage a set of extra relationships
  • More expensive queries
  • More complex queries

But if you really need a versioning schema, I believe this approach is a good option or (at least) a good starting point.
