Conditionally selecting last N values within a group by another column using R
Problem description
I want to select the last N values by group, with N depending on the value of a corresponding count column. The count represents the number of occurrences of a specific name. If the count is 3 or more, I only want the last three entries, but if it is less than 3, I only want the last entry.
library(dplyr)
# Sample data
df <- data.frame(Name = c("x","x","x","x","y","y","y","z","z"), Value = c(1,2,3,4,5,6,7,8,9))
# Obtain count for each name (n() counts rows per group; n_distinct(Value) would count unique values only)
count <- df %>%
  group_by(Name) %>%
  summarise(Count = n())
# Merge dataframe with count
merge(df, count, by = "Name")
# Delete the first entry for x and the first entry for z
# Desired output
data.frame(Name = c("x","x","x","y","y","y","z"), Value = c(2,3,4,5,6,7,9))
Recommended answer
In base R, first split df by df$Name. Then, for each subgroup, check the number of rows and conditionally extract either the last 3 rows or the last row.
do.call(rbind, lapply(split(df, df$Name), function(a)
a[tail(sequence(NROW(a)), c(3,1)[(NROW(a) < 3) + 1]),]))
Or
do.call(rbind, lapply(split(df, df$Name), function(a)
a[tail(sequence(NROW(a)), ifelse(NROW(a) < 3, 1, 3)),]))
#     Name Value
# x.2    x     2
# x.3    x     3
# x.4    x     4
# y.5    y     5
# y.6    y     6
# y.7    y     7
# z      z     9
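The same selection can also be done without splitting and re-binding the data frame. The following is only a sketch of one possible alternative, not part of the original answer: it uses ave() to compute each row's group size and its position within the group, then keeps rows whose position falls in the last N. The variable names size, pos, keep_n and result are illustrative, and the approach assumes rows are already in the desired order within each group.

```r
df <- data.frame(Name = c("x","x","x","x","y","y","y","z","z"),
                 Value = c(1,2,3,4,5,6,7,8,9))

idx <- seq_along(df$Name)

# Number of rows in the group each row belongs to
size <- ave(idx, df$Name, FUN = length)

# Position of each row within its own group (1, 2, 3, ...)
pos <- ave(idx, df$Name, FUN = seq_along)

# N per row: last 3 rows for groups with at least 3 rows, otherwise last 1
keep_n <- ifelse(size >= 3, 3, 1)

# Keep a row when it lies within the last keep_n rows of its group
result <- df[pos > size - keep_n, ]
result
```

Unlike the do.call(rbind, ...) version, this keeps the original row names and row order of df, which may or may not matter for a given use.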
For three conditional values:
do.call(rbind, lapply(split(df, df$Name), function(a)
a[tail(sequence(NROW(a)), ifelse(NROW(a) >= 6, 6, ifelse(NROW(a) >= 3, 3, 1))),]))
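Nested ifelse() calls become harder to read as more thresholds are added. As a hedged sketch, assuming the same thresholds (3 and 6) and the same sample df, N can instead be looked up with findInterval(). The helper name last_n_by_group and its arguments breaks and n_keep are hypothetical, introduced here for illustration only.

```r
df <- data.frame(Name = c("x","x","x","x","y","y","y","z","z"),
                 Value = c(1,2,3,4,5,6,7,8,9))

# Hypothetical helper: keep the last N rows of each group, where N is chosen
# by comparing the group size against ascending thresholds in `breaks`.
# n_keep must have one more element than breaks.
last_n_by_group <- function(data, group, breaks = c(3, 6), n_keep = c(1, 3, 6)) {
  pieces <- lapply(split(data, group), function(a) {
    # findInterval() returns 0 below the first break, 1 between the first
    # and second break, and so on; adding 1 turns it into an index into n_keep
    n <- n_keep[findInterval(NROW(a), breaks) + 1]
    tail(a, n)
  })
  do.call(rbind, pieces)
}

res <- last_n_by_group(df, df$Name)
res
```

With the defaults, groups of fewer than 3 rows keep 1 row, groups of 3 to 5 rows keep 3, and groups of 6 or more keep 6. Adding another tier only requires extending breaks and n_keep.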