Rename factor levels based on a condition in R
Question
I want to combine all factor levels with a count less than n into one level named "Else".
For example, if n = 3, then in the following df I want to combine "c", "d" and "e" into "Else":
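# stringsAsFactors = TRUE makes y a factor; since R 4.0 the default is FALSE
df = data.frame(x=c(1:10), y=c("a","a","a","b","b","b","c","d","d","e"), stringsAsFactors=TRUE)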
I started out by getting a df with all the low-count values:
library(plyr)
lowcounts = ddply(df, "y", function(z){if(nrow(z)<3) nrow(z) else NULL})
I know I could change these manually, but in practice I have dozens of levels, so I need to automate this.
I want to select and rename only the levels of df$y that are %in% lowcounts and leave the rest unchanged, but I am not sure how to proceed.
Recommended answer
Another option:
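#your dataframe
#(y must be a factor for levels<- to work; since R 4.0 data.frame() no longer
# converts strings to factors by default, so request it explicitly)
df = data.frame(x=c(1:10), y=c("a","a","a","b","b","b","c","d","d","e"), stringsAsFactors=TRUE)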
#which levels to keep and which to change
res <- table(df$y)
notkeep <- names(res[res < 3])
keep <- names(res)[!names(res) %in% notkeep]
names(keep) <- keep
#set new levels
levels(df$y) <- c(keep, list("else" = notkeep))
df
#     x    y
# 1   1    a
# 2   2    a
# 3   3    a
# 4   4    b
# 5   5    b
# 6   6    b
# 7   7 else
# 8   8 else
# 9   9 else
# 10 10 else
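If this needs to be applied to several columns or with different thresholds, the steps above could be wrapped in a small helper. The following is only a sketch in base R: the function name lump_rare_levels and its arguments n and other are made up for illustration and are not part of the original answer. It relies on the same list form of levels<- used above, where each list name is a new level and each element holds the old levels to merge into it.

```r
# Hypothetical helper (not from the original answer): merge every level of a
# factor that occurs fewer than `n` times into one level named `other`.
lump_rare_levels <- function(f, n, other = "Else") {
  f <- as.factor(f)                          # also accepts character vectors
  counts <- table(f)                         # occurrences per level
  rare <- names(counts)[as.vector(counts) < n]
  if (length(rare) == 0) return(f)           # nothing to merge
  keep <- setdiff(levels(f), rare)           # levels that stay as they are
  # Named list: new level name -> old level(s) mapped onto it
  levels(f) <- c(setNames(as.list(keep), keep), setNames(list(rare), other))
  f
}

# Usage with the data from the question
df <- data.frame(x = 1:10,
                 y = c("a", "a", "a", "b", "b", "b", "c", "d", "d", "e"))
df$y <- lump_rare_levels(df$y, n = 3)
levels(df$y)
# [1] "a"    "b"    "Else"
```

Because the helper calls as.factor() first, it does not depend on how the data frame was created. It assumes that other is not already the name of a level you want to keep separate.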