How can I count the number of grouped pairs in which one row's column value is greater than another?
Problem description
I have a dataset (df1) with a number of paired rows. One row of each pair is for one year (e.g., 2014), the other for a different year (e.g., 2013). Each row has a value in column G. I need a count of the pairs in which the G value for the later year is less than the G value for the earlier year.
Here is the dput output for the dataset df1:
structure(list(Name = c("A.J. Ellis", "A.J. Ellis", "A.J. Pierzynski",
"A.J. Pierzynski", "Aaron Boone", "Adam Kennedy", "Adam Melhuse",
"Adrian Beltre", "Adrian Beltre", "Adrian Gonzalez", "Alan Zinter",
"Albert Pujols", "Albert Pujols"), Age = c(37, 36, 37, 36, 36,
36, 36, 37, 36, 36, 36, 37, 36), Year = c(2018, 2017, 2014, 2013,
2009, 2012, 2008, 2016, 2015, 2018, 2004, 2017, 2016), Tm = c("SDP",
"MIA", "TOT", "TEX", "HOU", "LAD", "TOT", "TEX", "TEX", "NYM",
"ARI", "LAA", "LAA"), Lg = c("NL", "NL", "ML", "AL", "NL", "NL",
"ML", "AL", "AL", "NL", "NL", "AL", "AL"), G = c(66, 51, 102,
134, 10, 86, 15, 153, 143, 54, 28, 149, 152), PA = c(183, 163,
362, 529, 14, 201, 32, 640, 619, 187, 40, 636, 650)), row.names = c(NA,
13L), class = "data.frame")
Here is a tibble that shows what the rows to be checked look like: https://www.dropbox.com/s/3nbfi9le568qb3s/grouped-pairs.png?dl=0
Here is the code I used to create the tibble:
df1 %>%
  group_by(Name) %>%
  filter(n() > 1)
Recommended answer
We could arrange the data by Name and Age, check whether the last value in G is less than the first value in G for each Name, and count those occurrences with sum. In this data Age rises with Year, so after sorting, the last row of each group is the later year.
library(dplyr)
df1 %>%
  arrange(Name, Age) %>%                        # earlier season first within each Name
  group_by(Name) %>%
  summarise(check = last(G) < first(G)) %>%     # TRUE when G dropped in the later season
  pull(check) %>%
  sum(., na.rm = TRUE)
#[1] 2
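The same count can also be keyed directly on Year instead of relying on the sort order by Age. The following is a minimal sketch, not the answer's original method: pairs_df is a hypothetical stand-in for df1 that keeps only the columns the comparison needs, and the filter on n() == 2 is an added assumption that only complete pairs should be counted.

```r
library(dplyr)

# Hypothetical stand-in for df1 (subset of its rows, only Name, Year and G)
pairs_df <- data.frame(
  Name = c("A.J. Ellis", "A.J. Ellis", "A.J. Pierzynski", "A.J. Pierzynski",
           "Aaron Boone", "Albert Pujols", "Albert Pujols"),
  Year = c(2018, 2017, 2014, 2013, 2009, 2017, 2016),
  G    = c(66, 51, 102, 134, 10, 149, 152)
)

n_declined <- pairs_df %>%
  group_by(Name) %>%
  filter(n() == 2) %>%                                   # drop players with a single row
  summarise(declined = G[which.max(Year)] < G[which.min(Year)]) %>%
  pull(declined) %>%
  sum()

n_declined  # 2 for this sample (A.J. Pierzynski and Albert Pujols)
```

Picking the rows with which.max(Year) and which.min(Year) makes the comparison independent of row order, so no arrange step is needed.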
If you want the rows of the pairs in which the G value for the later year is less than the G value for the earlier year, we could use filter.
df1 %>%
  arrange(Name, Age) %>%
  group_by(Name) %>%
  filter(last(G) < first(G))    # keep every row of the pairs where G dropped
# Name Age Year Tm Lg G PA
# <chr> <dbl> <dbl> <chr> <chr> <dbl> <dbl>
#1 A.J. Pierzynski 36 2013 TEX AL 134 529
#2 A.J. Pierzynski 37 2014 TOT ML 102 362
#3 Albert Pujols 36 2016 LAA AL 152 650
#4 Albert Pujols 37 2017 LAA AL 149 636
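To see each qualifying pair on a single row, the two G values can be placed side by side. This is a rough sketch under the same assumptions as before: pairs_df is a hypothetical stand-in for df1, and the column names G_earlier, G_later and change are illustrative, not part of the original answer.

```r
library(dplyr)

# Hypothetical stand-in for df1 (subset of its rows, only Name, Year and G)
pairs_df <- data.frame(
  Name = c("A.J. Ellis", "A.J. Ellis", "A.J. Pierzynski", "A.J. Pierzynski",
           "Aaron Boone", "Albert Pujols", "Albert Pujols"),
  Year = c(2018, 2017, 2014, 2013, 2009, 2017, 2016),
  G    = c(66, 51, 102, 134, 10, 149, 152)
)

declined_pairs <- pairs_df %>%
  group_by(Name) %>%
  filter(n() == 2) %>%                      # complete pairs only
  arrange(Year, .by_group = TRUE) %>%       # earlier year first within each Name
  summarise(
    G_earlier = first(G),
    G_later   = last(G),
    change    = G_later - G_earlier
  ) %>%
  filter(change < 0)                        # keep pairs where G fell

declined_pairs
```

The number of rows in declined_pairs is the same count the summarise/sum version returns, and the change column shows how large each drop was.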