寻找橄榄球联赛中最连续的胜利 [英] Finding the most consecutive wins in a rugby league
问题描述
假设橄榄球联赛的结果是通过这种数据得出的:
Date, Round, Home Team, Visitor Team, Result
18/10/2018, 1, ABC, XYZ, 30-20
18/10/2018, 1, PQR, ABC, 13-12
有人可以阐明如何对以上数据进行建模,以找出连续获胜次数最多的球队吗?
[更新]
如果我们更改CSV数据的格式以使其更易于导入(通过修剪多余的空白并更改为更标准的日期格式),则如下所示(进行了一些其他更改以使示例更有趣):
Date,Round,Home Team,Visitor Team,Result
2018-10-18,1,ABC,XYZ,30-20
2018-10-18,2,ABC,PQR,28-12
2018-10-19,1,PQR,ABC,13-12
然后我们可以像这样导入数据(使用MERGE
代替CREATE
,以避免创建重复的Team
节点):
LOAD CSV WITH HEADERS FROM 'file:///data.csv' AS row
WITH row, SPLIT(row.Result, '-') AS scores
MERGE (h:Team {name: row.`Home Team`})
MERGE (v:Team {name: row.`Visitor Team`})
CREATE (h)<-[:HOME_TEAM {score: scores[0]}]-(g:Game {date: DATE(row.Date), round: row.Round})-[:AWAY_TEAM {score: scores[1]}]->(v)
我们可以使用此查询返回连续获胜次数最多的球队:
MATCH (t:Team)<-[r]-(g:Game)-[r2]->(t2)
WITH t, r.score > r2.score AS isWin ORDER BY g.date, g.round
RETURN t, REDUCE(s = {max: 0, c: 0, prev:false}, w IN COLLECT(isWin) |
CASE WHEN w
THEN {
c: CASE WHEN s.prev THEN s.c+1 ELSE 1 END,
max: CASE WHEN s.max <= s.c
THEN CASE WHEN s.prev
THEN s.c+1
ELSE CASE WHEN s.max = 0 THEN 1 ELSE s.max END END
ELSE s.max END,
prev: w}
ELSE {c: 0, max: s.max, prev: w} END).max AS maxConsecutiveWins
ORDER BY maxConsecutiveWins DESC
LIMIT 1;
MATCH
模式未指定任何关系类型,因此r
将同时匹配HOME_TEAM
和AWAY_TEAM
关系.
WITH
子句计算每个团队参加的每场比赛的获胜统计数据,并按date
和round
对其进行排序.
由于RETURN
子句使用聚合函数 COLLECT
(在 REDUCE 中函数),将t
变量用作聚合grouping key
.这导致REDUCE
函数一次处理一个团队的所有有序获胜统计数据.
REDUCE
函数的accumulator
,s
是具有3个属性的地图:
-
max
是迄今发现的最大连续获胜次数. -
c
是当前获胜顺序的连续获胜次数. -
prev
是上一个游戏的获胜统计.
由于Cypher没有IF
子句,因此CASE
子句用于逻辑分支. (注意:CASE
子句只能返回一个值-它不能在DB上执行任何读/写操作).此查询中的各个CASE
子句用于在依次迭代每个游戏的同时适当地更新accumulator
的目的.
当对一个团队的所有游戏完成REDUCE
功能时,RETURN
子句将获取返回地图的max
值并将其分配给maxConsecutiveWins
变量.
最后一个ORDER BY
子句以降序对所有maxConsecutiveWins
值进行排序,而LIMIT
子句只允许返回第一个(值最高的那个)及其对应的团队. /p>
使用上述数据的结果将是:
╒══════════════╤════════════════════╕
│"t" │"maxConsecutiveWins"│
╞══════════════╪════════════════════╡
│{"name":"ABC"}│2 │
└──────────────┴────────────────────┘
注意:考虑到CASE
子句的复杂性,如果您实际打算使用上述查询,则应在所有可能的边缘情况下对上述查询进行测试.
Suppose Rugby league results are given by this kind of data:
Date, Round, Home Team, Visitor Team, Result
18/10/2018, 1, ABC, XYZ, 30-20
18/10/2018, 1, PQR, ABC, 13-12
Can someone shed a light on how to model the above data to find out the team with the most consecutive wins?
[UPDATED]
If we alter the format of your CSV data to make it easier to import (by trimming excess whitespace, and changing to a more standard date format), like this (with a few extra changes to make the example more interesting):
Date,Round,Home Team,Visitor Team,Result
2018-10-18,1,ABC,XYZ,30-20
2018-10-18,2,ABC,PQR,28-12
2018-10-19,1,PQR,ABC,13-12
Then we can import the data like this (MERGE
is used instead of CREATE
, to avoid creating duplicate Team
nodes):
LOAD CSV WITH HEADERS FROM 'file:///data.csv' AS row
WITH row, SPLIT(row.Result, '-') AS scores
MERGE (h:Team {name: row.`Home Team`})
MERGE (v:Team {name: row.`Visitor Team`})
CREATE (h)<-[:HOME_TEAM {score: scores[0]}]-(g:Game {date: DATE(row.Date), round: row.Round})-[:AWAY_TEAM {score: scores[1]}]->(v)
And we can use this query to return the team with the highest number of consecutive wins:
MATCH (t:Team)<-[r]-(g:Game)-[r2]->(t2)
WITH t, r.score > r2.score AS isWin ORDER BY g.date, g.round
RETURN t, REDUCE(s = {max: 0, c: 0, prev:false}, w IN COLLECT(isWin) |
CASE WHEN w
THEN {
c: CASE WHEN s.prev THEN s.c+1 ELSE 1 END,
max: CASE WHEN s.max <= s.c
THEN CASE WHEN s.prev
THEN s.c+1
ELSE CASE WHEN s.max = 0 THEN 1 ELSE s.max END END
ELSE s.max END,
prev: w}
ELSE {c: 0, max: s.max, prev: w} END).max AS maxConsecutiveWins
ORDER BY maxConsecutiveWins DESC
LIMIT 1;
The MATCH
pattern does not specify any relationship types, so r
will match both HOME_TEAM
and AWAY_TEAM
relationships.
The WITH
clause calculates the win statistics for every game played by every team, and orders them by date
and round
.
Since the RETURN
clause uses the aggregating function COLLECT
(within the REDUCE function), the t
variable is used as the aggregation grouping key
. This causes the REDUCE
function to process all the ordered win statistics for one team at a time.
The REDUCE
function's accumulator
, s
, is a map with 3 properties :
max
is a count of the maximum consecutive wins discovered thus far.c
is the number of consecutive wins for the current sequence of wins.prev
is the win statistic for the previous game.
Since Cypher has no IF
clause, the CASE
clause is used for logical branching. (Note: the CASE
clause can only return a value -- it cannot execute any read/write operations on the DB). The various CASE
clauses in this query serve the purpose of updating the accumulator
appropriately while iterating through each game, in order.
When the REDUCE
function is done with all the games for a team, the RETURN
clause takes the max
value of the returned map and assigns it to the maxConsecutiveWins
variable.
The last ORDER BY
clause sorts all the maxConsecutiveWins
values in descending order, and the LIMIT
clause just allows the first one to be returned (the one with the highest value), along with its corresponding team.
The result, with the above data, would be:
╒══════════════╤════════════════════╕
│"t" │"maxConsecutiveWins"│
╞══════════════╪════════════════════╡
│{"name":"ABC"}│2 │
└──────────────┴────────────────────┘
NOTE: Given the complexity of the CASE
clauses, you should test the above query with all possible edge cases if you actually plan to use it.
这篇关于寻找橄榄球联赛中最连续的胜利的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!