Creating an "other" field

Problem description

Right now, I have the following data.frame which was created by original.df %.% group_by(Category) %.% tally() %.% arrange(desc(n)).

DF <- structure(list(Category = c("E", "K", "M", "L", "I", "A", 
"S", "G", "N", "Q"), n = c(163051, 127133, 106680, 64868, 49701, 
47387, 47096, 45601, 40056, 36882)), .Names = c("Category", 
"n"), row.names = c(NA, 10L), class = c("tbl_df", "tbl", "data.frame"
))

         Category      n
1               E 163051
2               K 127133
3               M 106680
4               L  64868
5               I  49701
6               A  47387
7               S  47096
8               G  45601
9               N  40056
10              Q  36882

I want to create an "Other" field from the bottom ranked Categories by n. i.e.

        Category      n
1              E 163051
2              K 127133
3              M 106680
4              L  64868
5              I  49701
6          Other 217022

Right now, I'm doing

rbind(filter(DF, rank(rev(n)) <= 5), 
  summarise(filter(DF, rank(rev(n)) > 5), Category = "Other", n = sum(n)))

which collapses all categories not in the top 5 into the Other category.

But I'm curious whether there's a better way in dplyr or some other existing package. By "better" I mean more succinct/readable. I'm also interested in methods with cleverer or more flexible ways to choose Other.

Recommended answer

Different package/different syntax version:

library(data.table)

dt = as.data.table(DF)

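dt[order(-n), # your data is already sorted, so this does nothing for it
   # rows in the TRUE group are kept as-is; the FALSE group collapses to one "Other" row
   if (.BY[[1]]) .SD else list("Other", sum(n)),
   # group by "is this row among the first 5?", then drop that helper column
   by = 1:nrow(dt) <= 5][, !"nrow", with = FALSE]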
#   Category      n
#1:        E 163051
#2:        K 127133
#3:        M 106680
#4:        L  64868
#5:        I  49701
#6:    Other 217022
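
Since the question asked about dplyr specifically, here is a minimal sketch of one way it might be done there. It is not part of the answer above. The name top_k is a hypothetical parameter introduced for illustration, and DF is rebuilt as a plain data.frame so the snippet runs on its own. It ranks with row_number(desc(n)), so unlike rank(rev(n)) it should not depend on the rows already being sorted.

```r
library(dplyr)

# Same data as in the question, rebuilt as a plain data.frame
DF <- data.frame(
  Category = c("E", "K", "M", "L", "I", "A", "S", "G", "N", "Q"),
  n = c(163051, 127133, 106680, 64868, 49701,
        47387, 47096, 45601, 40056, 36882),
  stringsAsFactors = FALSE
)

top_k <- 5  # hypothetical parameter: how many categories to keep by name

result <- DF %>%
  # relabel every category outside the top_k largest counts as "Other"
  mutate(Category = ifelse(row_number(desc(n)) <= top_k, Category, "Other")) %>%
  # add up the counts per label; all "Other" rows merge into one
  group_by(Category) %>%
  summarise(n = sum(n)) %>%
  # named categories first (largest count first), "Other" last
  arrange(Category == "Other", desc(n))

print(result)
```

Changing top_k changes how many categories are kept by name. Replacing the ifelse condition with a different test, for example a minimum count, would be one way to make the choice of "Other" more flexible.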
