How can I count the number of grouped pairs in which one row's column value is greater than another?

Problem description

I have a dataset (df1) that contains a number of paired rows. One row of each pair is for one year (e.g., 2014), the other for a different year (e.g., 2013). Each row has a value in the column G. I need a count of the number of pairs in which the G value for the higher year is less than the G value for the lesser year.

Here is the dput output for the dataset df1:

structure(list(Name = c("A.J. Ellis", "A.J. Ellis", "A.J. Pierzynski", 
"A.J. Pierzynski", "Aaron Boone", "Adam Kennedy", "Adam Melhuse", 
"Adrian Beltre", "Adrian Beltre", "Adrian Gonzalez", "Alan Zinter", 
"Albert Pujols", "Albert Pujols"), Age = c(37, 36, 37, 36, 36, 
36, 36, 37, 36, 36, 36, 37, 36), Year = c(2018, 2017, 2014, 2013, 
2009, 2012, 2008, 2016, 2015, 2018, 2004, 2017, 2016), Tm = c("SDP", 
"MIA", "TOT", "TEX", "HOU", "LAD", "TOT", "TEX", "TEX", "NYM", 
"ARI", "LAA", "LAA"), Lg = c("NL", "NL", "ML", "AL", "NL", "NL", 
"ML", "AL", "AL", "NL", "NL", "AL", "AL"), G = c(66, 51, 102, 
134, 10, 86, 15, 153, 143, 54, 28, 149, 152), PA = c(183, 163, 
362, 529, 14, 201, 32, 640, 619, 187, 40, 636, 650)), row.names = c(NA, 
13L), class = "data.frame")

Here is a tibble that shows what the rows to be checked look like: https://www.dropbox.com/s/3nbfi9le568qb3s/grouped-pairs.png?dl=0

Here is the code I used to create the tibble:

library(dplyr)

df1 %>%
  group_by(Name) %>%
  filter(n() > 1)

Recommended answer

We could arrange the data by Name and Age, check for each Name whether the last value of G is less than the first value, and count those occurrences with sum. A Name with only one row compares a value with itself, so it yields FALSE and does not affect the count.

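library(dplyr)

df1 %>%
  arrange(Name, Age) %>%                      # within each Name, earlier season first
  group_by(Name) %>%
  summarise(check = last(G) < first(G)) %>%   # TRUE if G is lower in the later season
  pull(check) %>%
  sum(na.rm = TRUE)                           # count the TRUE values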

#[1] 2
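
The answer above orders by Age, which happens to match Year order in this data. As a rough base R sketch of the same idea, the comparison can be made on Year directly and limited to names that occur exactly twice. The reduced copy of df1 (only Name, Year and G) and the helper names pair_rows and dropped are assumptions made for illustration, not part of the original post.

```r
# Reduced copy of df1 from the question: only the columns the comparison needs.
df1 <- data.frame(
  Name = c("A.J. Ellis", "A.J. Ellis", "A.J. Pierzynski", "A.J. Pierzynski",
           "Aaron Boone", "Adam Kennedy", "Adam Melhuse", "Adrian Beltre",
           "Adrian Beltre", "Adrian Gonzalez", "Alan Zinter", "Albert Pujols",
           "Albert Pujols"),
  Year = c(2018, 2017, 2014, 2013, 2009, 2012, 2008, 2016, 2015, 2018, 2004,
           2017, 2016),
  G = c(66, 51, 102, 134, 10, 86, 15, 153, 143, 54, 28, 149, 152)
)

# Keep only names that occur exactly twice, i.e. genuine pairs.
group_size <- ave(seq_along(df1$Name), df1$Name, FUN = length)
pair_rows <- df1[group_size == 2, ]

# For each pair: TRUE when G in the later year is below G in the earlier year.
dropped <- sapply(split(pair_rows, pair_rows$Name), function(p) {
  p$G[which.max(p$Year)] < p$G[which.min(p$Year)]
})

sum(dropped)             # number of pairs where G went down
names(dropped)[dropped]  # which names those are
```

Comparing on Year rather than Age matches the wording of the question more literally, and the explicit size check makes the "pair" assumption visible instead of relying on single-row groups evaluating to FALSE.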

If you want the rows of the pairs in which the G value for the higher year is less than the G value for the lesser year, rather than just their count, we could use filter.

df1 %>%
  arrange(Name, Age) %>%
  group_by(Name) %>%
  filter(last(G) < first(G))

#  Name              Age  Year Tm    Lg        G    PA
#  <chr>           <dbl> <dbl> <chr> <chr> <dbl> <dbl>
#1 A.J. Pierzynski    36  2013 TEX   AL      134   529
#2 A.J. Pierzynski    37  2014 TOT   ML      102   362
#3 Albert Pujols      36  2016 LAA   AL      152   650
#4 Albert Pujols      37  2017 LAA   AL      149   636
