How to find the longest duplicate sequence in a tibble column?


Question

I updated my question because I need one more column in my output tibble.

I have the following tibble:

library(tibble)

my_tbl <- tribble(
  ~year, ~event_id, ~winner_id, 
  2011,      "A",     4322,
  2012,      "A",     4322,
  2013,      "A",     4322,
  2014,      "A",     5478,
  2015,      "A",     4322,
  2011,      "B",     4322,
  2012,      "B",     7893,
  2013,      "B",     7893,
  2014,      "B",     2365,
  2015,      "B",     3407,
  2011,      "C",     5556,
  2012,      "C",     5556,
  2013,      "C",     1238,
  2014,      "C",     2391,
  2015,      "C",     2391,
  2011,      "D",     4219,
  2012,      "D",     7623,
  2013,      "D",     8003,
  2014,      "D",     2851,
  2015,      "D",     0418
)

I would like to find the longest winning streak (most wins in a row) for each event id. The result I'm looking for would look like this:

results_summary_tbl <- tribble(
  ~event_id, ~most_wins_in_a_row, ~number_of_winners, ~winners, ~years,
  "A", 3, 1, "4322", "4322 = (2011, 2012, 2013)",
  "C", 2, 2, "5556 , 2391", "5556 = (2011, 2012), 2391 = (2014, 2015)",
  "B", 2, 1, "7893", "7893 = (2012, 2013)",
  "D", 1, 5, "4219 , 7623 , 8003 , 2851 , 0418", "4219 = (2011), 7623 = (2012), 8003 = (2013), 2851 = (2014), 0418 = (2015)"
)

Thanks.

Recommended answer

One dplyr option could be:

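library(dplyr)

my_tbl %>%
  # rleid is a run id: it increases by one every time winner_id changes;
  # add_count() then stores the length of each (event_id, rleid) run in n
  add_count(event_id, rleid = cumsum(winner_id != lag(winner_id, default = first(winner_id)))) %>%
  group_by(event_id) %>%
  summarise(most_wins_in_a_row = max(n),
            number_of_winners = n_distinct(winner_id[n == max(n)]),
            winners = paste0(unique(winner_id[n == max(n)]), collapse = ","))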

  event_id most_wins_in_a_row number_of_winners winners                
  <chr>                 <int>             <int> <chr>                  
1 A                         3                 1 4322                   
2 B                         2                 1 7893                   
3 C                         2                 2 5556,2391              
4 D                         1                 5 4219,7623,8003,2851,418
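
The output above does not include the `years` column that the question asks for. One possible way to add it, sketched here as an extension and not as part of the original answer, is to collapse each streak into a single row first and only then keep the longest streaks per event. The names `demo_tbl`, `run_id`, `wins`, `run_years` and `with_years` are made up for this illustration, and the data is a subset of the question's tibble (events A and C only).

```r
library(dplyr)
library(tibble)

# Assumption: a subset of the question's data (events A and C), to keep the sketch short.
demo_tbl <- tribble(
  ~year, ~event_id, ~winner_id,
  2011, "A", 4322,
  2012, "A", 4322,
  2013, "A", 4322,
  2014, "A", 5478,
  2015, "A", 4322,
  2011, "C", 5556,
  2012, "C", 5556,
  2013, "C", 1238,
  2014, "C", 2391,
  2015, "C", 2391
)

with_years <- demo_tbl %>%
  group_by(event_id) %>%
  # run_id increases by one each time the winner changes within an event
  mutate(run_id = cumsum(winner_id != lag(winner_id, default = first(winner_id)))) %>%
  group_by(event_id, run_id, winner_id) %>%
  # one row per streak: its length and the years it covers
  summarise(wins = n(),
            run_years = paste(year, collapse = ", "),
            .groups = "drop") %>%
  group_by(event_id) %>%
  # keep only the longest streak(s) of each event
  filter(wins == max(wins)) %>%
  summarise(most_wins_in_a_row = max(wins),
            number_of_winners = n_distinct(winner_id),
            winners = paste(unique(winner_id), collapse = ", "),
            years = paste0(winner_id, " = (", run_years, ")", collapse = ", "))

with_years
```

On this subset, event A should come out with `years` equal to `4322 = (2011, 2012, 2013)` and event C with `5556 = (2011, 2012), 2391 = (2014, 2015)`. Note that `winner_id` is numeric in the question's data, so an id written as `0418` is stored and printed as `418`; if leading zeros matter, the ids would need to be stored as character strings.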
