Find all events between 2 dates


Question

Trying to figure out the following scenario.

Setup:

Goals

  • Find all events occurring between two dates
  • Find conflicting events (events that overlap)

Attempted the following query:

// Get all events between 2 times
MATCH (m1:Minute { minute: 15 })<--(h:Hour { hour: 8 })<--(d:Day { day: 24 })<--(month:Month { month: 10 })
MATCH (m2:Minute { minute: 45 })<--(h2:Hour { hour: 10 })<--(d2:Day { day: 24 })<--(month2:Month { month: 10 })
MATCH p=((m1)-[:NEXT*]->(m2))
WITH NODES(p) as pnodes UNWIND pnodes AS pwind
MATCH (e:Event)-[:STARTS_AT]->(pwind)
RETURN pwind,e

Results are indeed being retrieved, but I noticed that:

  1. If I use "shortestPath" instead of a regular path, the query only works up to a certain threshold (bizarre, I can't explain this). For example, it doesn't work if the timespan is more than 2-3 hours; the same applies to multiple days, etc.
  2. Without "shortestPath", the performance is TERRIBLE: 25 seconds to return only 2-3 items.

Another variation using WHERE (tried only for future dates):

// Get all events between 2 times
MATCH (e:Event)
WHERE (:Month { month: 10 })-->(:Day { day: 24 })-->(:Hour { hour: 9 })-->(:Minute { minute: 00})-[:NEXT*]->(:Minute)<--(e)
RETURN e

Results: the performance is even WORSE. 100 seconds to retrieve 1 item.

The way I understand it, and what I would like to do, is use some sort of function that lets the path return related nodes. That is, the path function returns only the specific nodes being queried (in this case Minutes), but I would also like to bring back, for ALL THOSE MINUTES, the Events associated by ":STARTS_AT".

Finally, the questions:

  • What's the recommended way to perform this query?
  • Is this a scenario that is supported by Cypher and neo4j?
  • Would it be preferable to "fall back" to property-based time querying instead of attempting graph-based time queries?

Thanks in advance.

Solution

So there's this weird thing with shortestPath where if you don't specify a maximum length, it arbitrarily sets the max to 15. See here:

ShortestPath doesn't find any path without max hops limit

I would actually call this a bug, because it's not in the documentation and it leads to unexpected behavior, as you've found.
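
As a rough check on that threshold, here is a small Python sketch of the hop arithmetic. It assumes one Minute node per 15-minute block, as in the appendix data further down (the asker's actual granularity is not stated), and it assumes the year 2015 for the question's dates. `hops_between` is a hypothetical helper, not part of Neo4j.

```python
from datetime import datetime

# Assumption: one Minute node per 15-minute block, as in the appendix data.
BLOCK_MINUTES = 15
# The default hop cap described above for shortestPath without a maximum length.
DEFAULT_MAX_HOPS = 15


def hops_between(start: datetime, end: datetime, block_minutes: int = BLOCK_MINUTES) -> int:
    """Return how many NEXT relationships separate two block-aligned times."""
    total_minutes = int((end - start).total_seconds()) // 60
    if total_minutes < 0 or total_minutes % block_minutes != 0:
        raise ValueError("times must be ordered and aligned to the block size")
    return total_minutes // block_minutes


# Span from the question: Oct 24 08:15 to Oct 24 10:45 (year assumed).
question_hops = hops_between(datetime(2015, 10, 24, 8, 15), datetime(2015, 10, 24, 10, 45))
# Span from the answer's query: Oct 23 08:15 to Oct 24 10:45.
answer_hops = hops_between(datetime(2015, 10, 23, 8, 15), datetime(2015, 10, 24, 10, 45))
# Longest span a 15-hop cap can cover, in minutes.
max_covered_minutes = DEFAULT_MAX_HOPS * BLOCK_MINUTES

print(question_hops, answer_hops, max_covered_minutes)
```

Under that assumption a 15-hop cap covers at most 225 minutes (3 hours 45 minutes), which is roughly in line with the threshold described in the question, and a span of a day or more is far beyond it.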

So the solution to your problem is to use shortestPath but pick a maximum length. I'd choose something really high; let's do a billion and call it a day:

MATCH (:Year {year:2015})-[:HAS_MONTH]->(:Month {month:10})-[:HAS_DAY]->(:Day {day:23})-[:HAS_HOUR]->(:Hour {hour:8})-[:HAS_MINUTE]->(startMinute:Minute {minute:15})
MATCH (:Year {year:2015})-[:HAS_MONTH]->(:Month {month:10})-[:HAS_DAY]->(:Day {day:24})-[:HAS_HOUR]->(:Hour {hour:10})-[:HAS_MINUTE]->(endMinute:Minute {minute:45})
MATCH p = shortestPath((startMinute)-[:NEXT*..1000000000]->(endMinute))
UNWIND NODES(p) AS minute
MATCH (event:Event)-[:STARTS_AT]->(minute)
RETURN event, minute;

You should always use shortestPath for finding the span of minute nodes. Matching on (startMinute)-[:NEXT*]->(endMinute) without wrapping it in shortestPath tries to find all paths of any length between the two nodes, so it's exhaustive and takes much longer, whereas shortestPath can stop as soon as it has found the path.
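
The effect of the hop cap can be illustrated outside Neo4j with a toy breadth-first search in Python. This is only a sketch of a hop-capped shortest-path search on a linear chain, not Neo4j's actual algorithm; `bounded_shortest_path` and the integer node ids are made up for the example.

```python
from collections import deque


def bounded_shortest_path(next_of, start, end, max_hops):
    """Breadth-first search that stops at `end` and never exceeds `max_hops`."""
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == end:
            return path  # stop as soon as the end node is reached
        if len(path) - 1 >= max_hops:
            continue  # hop budget used up on this branch
        for nxt in next_of.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [nxt]))
    return None  # no path within the hop budget


# A linear chain standing in for the NEXT-linked Minute nodes: 0 -> 1 -> ... -> 199.
chain = {i: [i + 1] for i in range(199)}

short = bounded_shortest_path(chain, 0, 10, max_hops=15)
capped = bounded_shortest_path(chain, 0, 106, max_hops=15)
wide = bounded_shortest_path(chain, 0, 106, max_hops=1_000_000_000)

print(len(short), capped, len(wide))
```

With a cap of 15 the 106-hop span comes back empty, which mirrors the symptom in the question. In this sketch a very large cap adds no work, because the search ends once it reaches the end node.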

To find whether any other events overlap with a given event:

MATCH (startMinute:Minute)<-[:STARTS_AT]-(event:Event)-[:ENDS_AT]->(endMinute:Minute)
WHERE event.id = {event_id}
MATCH p = shortestPath((startMinute)-[:NEXT*..1000000000]->(endMinute))
UNWIND NODES(p) AS span
MATCH (overlap:Event)-[:STARTS_AT|ENDS_AT]->(span)
WHERE overlap <> event
RETURN overlap;
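
One case to watch for: the pattern above only matches events that start or end on a minute inside the span, so an event that begins before the span and ends after it would not be returned. With the appendix data this cannot happen because every event has the same length, but with variable-length events it can. The Python sketch below contrasts the two checks on plain integer block indexes; both function names are hypothetical.

```python
from typing import Tuple

Interval = Tuple[int, int]  # (start_block, end_block), inclusive, start <= end


def endpoint_in_span(event: Interval, other: Interval) -> bool:
    """Mirror of the Cypher pattern: `other` starts or ends inside `event`'s span."""
    start, end = event
    other_start, other_end = other
    return start <= other_start <= end or start <= other_end <= end


def intervals_overlap(event: Interval, other: Interval) -> bool:
    """General closed-interval test: also true when `other` fully contains `event`."""
    start, end = event
    other_start, other_end = other
    return other_start <= end and other_end >= start


event = (10, 15)
candidates = {
    "starts inside": (12, 17),
    "ends inside": (7, 11),
    "contains event": (5, 20),
    "disjoint": (20, 25),
}
for name, other in candidates.items():
    print(name, endpoint_in_span(event, other), intervals_overlap(event, other))
```

The two checks agree on every case except "contains event", where only the general interval test reports an overlap.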

Below is an appendix of how the data was created for proof-of-concept purposes. Assume all months have 31 days.

Constraints and indexes.

CREATE CONSTRAINT ON (year:Year) ASSERT year.year IS UNIQUE;
CREATE INDEX ON :Month(month);
CREATE INDEX ON :Day(day);
CREATE INDEX ON :Hour(hour);
CREATE INDEX ON :Minute(minute);

Create the time tree.

WITH RANGE(2014, 2015) AS years, RANGE(1, 12) AS months, RANGE(1, 31) AS days, RANGE(0,23) AS hours, RANGE(0, 45, 15) AS minutes
FOREACH(year IN years | 
  MERGE (y:Year {year: year})
  FOREACH(month IN months | 
    CREATE (m:Month {month: month})
    MERGE (y)-[:HAS_MONTH]->(m)
    FOREACH(day IN days |      
      CREATE (d:Day {day: day})
      MERGE (m)-[:HAS_DAY]->(d)
      FOREACH(hour IN hours |
        CREATE (h:Hour {hour: hour})
        MERGE (d)-[:HAS_HOUR]->(h)
        FOREACH(minute IN minutes |
          CREATE (min:Minute {minute: minute})
          MERGE (h)-[:HAS_MINUTE]->(min)
        )
      )
    )
  )
);

Create [:NEXT] relationships between all the Minute nodes.

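// Sort every Minute node chronologically, then link each one to its successor.
MATCH (year:Year)-[:HAS_MONTH]->(month:Month)-[:HAS_DAY]->(day:Day)-[:HAS_HOUR]->(hour:Hour)-[:HAS_MINUTE]->(minute:Minute)
WITH year, month, day, hour, minute
ORDER BY year.year, month.month, day.day, hour.hour, minute.minute
WITH COLLECT(minute) AS minutes
// A list element cannot be used directly as a node in a pattern inside FOREACH,
// so each node is wrapped in a one-element list and iterated to bind min1 and min2.
FOREACH(i IN RANGE(0, LENGTH(minutes) - 2) |
    FOREACH(min1 IN [minutes[i]] |
        FOREACH(min2 IN [minutes[i + 1]] |
            CREATE UNIQUE (min1)-[:NEXT]->(min2)
        )
    )
);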
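
For a sense of scale, the sizes this setup produces can be worked out with a few lines of Python. The figures follow directly from the RANGE values above and the 31-days-per-month simplification; the variable names are only for illustration.

```python
# Cypher's RANGE is inclusive at both ends, so the Python ranges add one to the upper bound.
years = len(range(2014, 2016))   # 2014, 2015
months = len(range(1, 13))       # 1..12
days = len(range(1, 32))         # 1..31 (every month treated as 31 days)
hours = len(range(0, 24))        # 0..23
blocks = len(range(0, 46, 15))   # 0, 15, 30, 45

minute_nodes = years * months * days * hours * blocks
# Linking each node to its successor in one chain needs one fewer relationship.
next_relationships = minute_nodes - 1

print(minute_nodes, next_relationships)
```

So the chain holds 71,424 Minute nodes joined by 71,423 NEXT relationships, which is far longer than a 15-hop cap can cross.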

Randomly simulate events and their start times.

MATCH (minute:Minute)
WHERE RAND() < 0.3
CREATE (event:Event)-[:STARTS_AT]->(minute);

Make every event five 15-minute blocks long (five NEXT hops from its start minute).

MATCH (event:Event)-[:STARTS_AT]->(startMinute:Minute)-[:NEXT*5]->(endMinute:Minute)
CREATE (event)-[:ENDS_AT]->(endMinute);
