在橄榄球联赛中找到最多的连续胜利 [英] Finding the most consecutive wins in a rugby league
问题描述
假设橄榄球联赛的结果是由这种数据给出的:
Suppose Rugby league results are given by this kind of data:
Date, Round, Home Team, Visitor Team, Result
18/10/2018, 1, ABC, XYZ, 30-20
18/10/2018, 1, PQR, ABC, 13-12
有人能解释一下如何对上述数据进行建模以找出连续获胜次数最多的球队吗?
Can someone shed a light on how to model the above data to find out the team with the most consecutive wins?
推荐答案
[UPDATED]
如果我们更改 CSV 数据的格式以使其更易于导入(通过修剪多余的空格,并更改为更标准的日期格式),如下所示(进行一些额外更改以使示例更有趣):
If we alter the format of your CSV data to make it easier to import (by trimming excess whitespace, and changing to a more standard date format), like this (with a few extra changes to make the example more interesting):
Date,Round,Home Team,Visitor Team,Result
2018-10-18,1,ABC,XYZ,30-20
2018-10-18,2,ABC,PQR,28-12
2018-10-19,1,PQR,ABC,13-12
然后我们可以像这样导入数据(使用MERGE
代替CREATE
,以避免创建重复的Team
节点):>
Then we can import the data like this (MERGE
is used instead of CREATE
, to avoid creating duplicate Team
nodes):
LOAD CSV WITH HEADERS FROM 'file:///data.csv' AS row
WITH row, SPLIT(row.Result, '-') AS scores
MERGE (h:Team {name: row.`Home Team`})
MERGE (v:Team {name: row.`Visitor Team`})
CREATE (h)<-[:HOME_TEAM {score: scores[0]}]-(g:Game {date: DATE(row.Date), round: row.Round})-[:AWAY_TEAM {score: scores[1]}]->(v)
我们可以使用这个查询返回连续获胜次数最多的球队:
And we can use this query to return the team with the highest number of consecutive wins:
MATCH (t:Team)<-[r]-(g:Game)-[r2]->(t2)
WITH t, r.score > r2.score AS isWin ORDER BY g.date, g.round
RETURN t, REDUCE(s = {max: 0, c: 0, prev:false}, w IN COLLECT(isWin) |
CASE WHEN w
THEN {
c: CASE WHEN s.prev THEN s.c+1 ELSE 1 END,
max: CASE WHEN s.max <= s.c
THEN CASE WHEN s.prev
THEN s.c+1
ELSE CASE WHEN s.max = 0 THEN 1 ELSE s.max END END
ELSE s.max END,
prev: w}
ELSE {c: 0, max: s.max, prev: w} END).max AS maxConsecutiveWins
ORDER BY maxConsecutiveWins DESC
LIMIT 1;
MATCH
模式没有指定任何关系类型,所以 r
将匹配 HOME_TEAM
和 AWAY_TEAM
关系.
The MATCH
pattern does not specify any relationship types, so r
will match both HOME_TEAM
and AWAY_TEAM
relationships.
WITH
子句计算每支球队每场比赛的获胜统计数据,并按 date
和 round
对它们进行排序.
The WITH
clause calculates the win statistics for every game played by every team, and orders them by date
and round
.
由于 RETURN
子句使用了 聚合函数 COLLECT
(在REDUCE 函数),t
变量用作聚合分组键
.这会导致 REDUCE
函数处理一个团队的所有有序获胜统计数据.
Since the RETURN
clause uses the aggregating function COLLECT
(within the REDUCE function), the t
variable is used as the aggregation grouping key
. This causes the REDUCE
function to process all the ordered win statistics for one team at a time.
REDUCE
函数的 accumulator
,s
,是一个具有 3 个属性的映射:
The REDUCE
function's accumulator
, s
, is a map with 3 properties :
max
是迄今为止发现的 最大 连续获胜次数.c
是 当前 获胜序列的连续获胜次数.prev
是上一场比赛的获胜统计数据.
max
is a count of the maximum consecutive wins discovered thus far.c
is the number of consecutive wins for the current sequence of wins.prev
is the win statistic for the previous game.
由于 Cypher 没有 IF
子句,CASE
子句用于逻辑分支.(注意:CASE
子句只能返回一个值——它不能对 DB 执行任何读/写操作).此查询中的各种 CASE
子句用于在按顺序迭代每个游戏时适当地更新 accumulator
.
Since Cypher has no IF
clause, the CASE
clause is used for logical branching. (Note: the CASE
clause can only return a value -- it cannot execute any read/write operations on the DB). The various CASE
clauses in this query serve the purpose of updating the accumulator
appropriately while iterating through each game, in order.
当REDUCE
函数在一个团队的所有游戏中完成时,RETURN
子句获取返回地图的max
值并将其分配给 maxConsecutiveWins
变量.
When the REDUCE
function is done with all the games for a team, the RETURN
clause takes the max
value of the returned map and assigns it to the maxConsecutiveWins
variable.
最后一个 ORDER BY
子句按降序对所有 maxConsecutiveWins
值进行排序,LIMIT
子句只允许返回第一个(值最高的那个),以及对应的团队.
The last ORDER BY
clause sorts all the maxConsecutiveWins
values in descending order, and the LIMIT
clause just allows the first one to be returned (the one with the highest value), along with its corresponding team.
使用上述数据的结果将是:
The result, with the above data, would be:
╒══════════════╤════════════════════╕
│"t" │"maxConsecutiveWins"│
╞══════════════╪════════════════════╡
│{"name":"ABC"}│2 │
└──────────────┴────────────────────┘
注意:鉴于 CASE
子句的复杂性,如果您真的打算使用它,您应该使用所有可能的边缘情况测试上述查询.
NOTE: Given the complexity of the CASE
clauses, you should test the above query with all possible edge cases if you actually plan to use it.
这篇关于在橄榄球联赛中找到最多的连续胜利的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!