寻找橄榄球联赛中最连续的胜利 [英] Finding the most consecutive wins in a rugby league

查看:47
本文介绍了寻找橄榄球联赛中最连续的胜利的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

假设橄榄球联赛的结果是通过这种数据得出的:

Date, Round, Home Team, Visitor Team, Result
18/10/2018, 1, ABC, XYZ, 30-20
18/10/2018, 1, PQR, ABC, 13-12

有人可以阐明如何对以上数据进行建模,以找出连续获胜次数最多的球队吗?

解决方案

[更新]

如果我们更改CSV数据的格式以使其更易于导入(通过修剪多余的空白并更改为更标准的日期格式),则如下所示(进行了一些其他更改以使示例更有趣):

Date,Round,Home Team,Visitor Team,Result
2018-10-18,1,ABC,XYZ,30-20
2018-10-18,2,ABC,PQR,28-12
2018-10-19,1,PQR,ABC,13-12

然后我们可以像这样导入数据(使用MERGE代替CREATE,以避免创建重复的Team节点):

LOAD CSV WITH HEADERS FROM 'file:///data.csv' AS row
WITH row, SPLIT(row.Result, '-') AS scores
MERGE (h:Team {name: row.`Home Team`})
MERGE (v:Team {name: row.`Visitor Team`})
CREATE (h)<-[:HOME_TEAM {score: scores[0]}]-(g:Game {date: DATE(row.Date), round: row.Round})-[:AWAY_TEAM {score: scores[1]}]->(v)

我们可以使用此查询返回连续获胜次数最多的球队:

MATCH (t:Team)<-[r]-(g:Game)-[r2]->(t2)
WITH t, r.score > r2.score AS isWin ORDER BY g.date, g.round
RETURN t, REDUCE(s = {max: 0, c: 0, prev:false}, w IN COLLECT(isWin) |
  CASE WHEN w
    THEN {
      c: CASE WHEN s.prev THEN s.c+1 ELSE 1 END,
      max: CASE WHEN s.max <= s.c
             THEN CASE WHEN s.prev
               THEN s.c+1
               ELSE CASE WHEN s.max = 0 THEN 1 ELSE s.max END END
             ELSE s.max END,
      prev: w}
    ELSE {c: 0, max: s.max, prev: w} END).max AS maxConsecutiveWins
ORDER BY maxConsecutiveWins DESC
LIMIT 1;

MATCH模式未指定任何关系类型,因此r将同时匹配HOME_TEAMAWAY_TEAM关系.

WITH子句计算每个团队参加的每场比赛的获胜统计数据,并按dateround对其进行排序.

由于RETURN子句使用聚合函数 COLLECT(在 REDUCE 中函数),将t变量用作聚合grouping key.这导致REDUCE函数一次处理一个团队的所有有序获胜统计数据.

REDUCE函数的accumulators是具有3个属性的地图:

  • max是迄今发现的最大连续获胜次数.
  • c当前获胜顺序的连续获胜次数.
  • prev上一个游戏的获胜统计.

由于Cypher没有IF子句,因此CASE子句用于逻辑分支. (注意:CASE子句只能返回一个值-它不能在DB上执行任何读/写操作).此查询中的各个CASE子句用于在依次迭代每个游戏的同时适当地更新accumulator的目的.

当对一个团队的所有游戏完成REDUCE功能时,RETURN子句将获取返回地图的max值并将其分配给maxConsecutiveWins变量.

最后一个ORDER BY子句以降序对所有maxConsecutiveWins值进行排序,而LIMIT子句只允许返回第一个(值最高的那个)及其对应的团队. /p>

使用上述数据的结果将是:

╒══════════════╤════════════════════╕
│"t"           │"maxConsecutiveWins"│
╞══════════════╪════════════════════╡
│{"name":"ABC"}│2                   │
└──────────────┴────────────────────┘

注意:考虑到CASE子句的复杂性,如果您实际打算使用上述查询,则应在所有可能的边缘情况下对上述查询进行测试.

Suppose Rugby league results are given by this kind of data:

Date, Round, Home Team, Visitor Team, Result
18/10/2018, 1, ABC, XYZ, 30-20
18/10/2018, 1, PQR, ABC, 13-12

Can someone shed a light on how to model the above data to find out the team with the most consecutive wins?

解决方案

[UPDATED]

If we alter the format of your CSV data to make it easier to import (by trimming excess whitespace, and changing to a more standard date format), like this (with a few extra changes to make the example more interesting):

Date,Round,Home Team,Visitor Team,Result
2018-10-18,1,ABC,XYZ,30-20
2018-10-18,2,ABC,PQR,28-12
2018-10-19,1,PQR,ABC,13-12

Then we can import the data like this (MERGE is used instead of CREATE, to avoid creating duplicate Team nodes):

LOAD CSV WITH HEADERS FROM 'file:///data.csv' AS row
WITH row, SPLIT(row.Result, '-') AS scores
MERGE (h:Team {name: row.`Home Team`})
MERGE (v:Team {name: row.`Visitor Team`})
CREATE (h)<-[:HOME_TEAM {score: scores[0]}]-(g:Game {date: DATE(row.Date), round: row.Round})-[:AWAY_TEAM {score: scores[1]}]->(v)

And we can use this query to return the team with the highest number of consecutive wins:

MATCH (t:Team)<-[r]-(g:Game)-[r2]->(t2)
WITH t, r.score > r2.score AS isWin ORDER BY g.date, g.round
RETURN t, REDUCE(s = {max: 0, c: 0, prev:false}, w IN COLLECT(isWin) |
  CASE WHEN w
    THEN {
      c: CASE WHEN s.prev THEN s.c+1 ELSE 1 END,
      max: CASE WHEN s.max <= s.c
             THEN CASE WHEN s.prev
               THEN s.c+1
               ELSE CASE WHEN s.max = 0 THEN 1 ELSE s.max END END
             ELSE s.max END,
      prev: w}
    ELSE {c: 0, max: s.max, prev: w} END).max AS maxConsecutiveWins
ORDER BY maxConsecutiveWins DESC
LIMIT 1;

The MATCH pattern does not specify any relationship types, so r will match both HOME_TEAM and AWAY_TEAM relationships.

The WITH clause calculates the win statistics for every game played by every team, and orders them by date and round.

Since the RETURN clause uses the aggregating function COLLECT (within the REDUCE function), the t variable is used as the aggregation grouping key. This causes the REDUCE function to process all the ordered win statistics for one team at a time.

The REDUCE function's accumulator, s, is a map with 3 properties :

  • max is a count of the maximum consecutive wins discovered thus far.
  • c is the number of consecutive wins for the current sequence of wins.
  • prev is the win statistic for the previous game.

Since Cypher has no IF clause, the CASE clause is used for logical branching. (Note: the CASE clause can only return a value -- it cannot execute any read/write operations on the DB). The various CASE clauses in this query serve the purpose of updating the accumulator appropriately while iterating through each game, in order.

When the REDUCE function is done with all the games for a team, the RETURN clause takes the max value of the returned map and assigns it to the maxConsecutiveWins variable.

The last ORDER BY clause sorts all the maxConsecutiveWins values in descending order, and the LIMIT clause just allows the first one to be returned (the one with the highest value), along with its corresponding team.

The result, with the above data, would be:

╒══════════════╤════════════════════╕
│"t"           │"maxConsecutiveWins"│
╞══════════════╪════════════════════╡
│{"name":"ABC"}│2                   │
└──────────────┴────────────────────┘

NOTE: Given the complexity of the CASE clauses, you should test the above query with all possible edge cases if you actually plan to use it.

这篇关于寻找橄榄球联赛中最连续的胜利的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆