Selecting top N values within a group in a column using R
Question
I need to select the top two values of count within each yearmonth group from the following data frame in R. I have already sorted the data by count and yearmonth. How can I do this with the data below?
   yearmonth            name count
1     201310           Dovas     5
2     201310         Indulgd     2
3     201310         Justina     1
4     201310          Jolita     1
5     201311 Shahrukh Sheikh     1
6     201311           Dovas    29
7     201311         Justina    13
8     201311            Lina     8
9     201312         sUPERED     7
10    201312     John Hansen     7
11    201312         Lina D.     6
12    201312       joanna1st     5
Recommended answer
Alternatively, with data.table (using mydf from @jazzurro's post), some options are:
library(data.table)
setDT(mydf)[order(yearmonth,-count), .SD[1:2], by=yearmonth]
or
setDT(mydf)[mydf[order(yearmonth, -count), .I[1:2], by=yearmonth]$V1,]
or
setorder(setkey(setDT(mydf), yearmonth), yearmonth, -count)[
,.SD[1:2], by=yearmonth]
#   yearmonth        name count
#1:    201310       Dovas     5
#2:    201310     Indulgd     2
#3:    201311       Dovas    29
#4:    201311     Justina    13
#5:    201312     sUPERED     7
#6:    201312 John Hansen     7
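The options above index with 1:2, which returns a row of NA values for any group that has fewer than two rows. A minimal sketch of a variant that avoids this is given below. It assumes mydf holds the data from the question; the original mydf comes from @jazzurro's post and is not reproduced here, so the data.frame() call is a reconstruction from the table in the question, and top2 is just an illustrative name.

```r
library(data.table)

# Reconstruction of the question's data (assumption: the original `mydf`
# from @jazzurro's post has the same columns and values).
mydf <- data.frame(
  yearmonth = rep(c(201310, 201311, 201312), each = 4),
  name = c("Dovas", "Indulgd", "Justina", "Jolita",
           "Shahrukh Sheikh", "Dovas", "Justina", "Lina",
           "sUPERED", "John Hansen", "Lina D.", "joanna1st"),
  count = c(5, 2, 1, 1, 1, 29, 13, 8, 7, 7, 6, 5),
  stringsAsFactors = FALSE
)

# Sort by group and descending count, then keep at most two rows per group.
# head() returns fewer rows for short groups instead of padding with NA.
top2 <- setDT(mydf)[order(yearmonth, -count), head(.SD, 2), by = yearmonth]
print(top2)
```

With the question's data every group has four rows, so this should give the same six rows as the options above. The difference only shows up when some yearmonth has a single row.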