R ifelse changed factor value into index


Question

I ran into a strange problem in R while using data.table.

When I try to recode every Province with a count under 500 as "Other", the provinces with the highest counts are replaced by index numbers instead of keeping their names:

df <- fact_data[,.N,Province][N >= 500]$Province
df
fact_data[,Province := ifelse(Province %in% df, fact_data$Province, "Other")]
fact_data[,.N,Province][order(-N)]

Output:

However, the same approach worked fine on factor variables whose values look numeric. For example, if I use BranchNumber instead of Province, where the values look like "1" and "3", the output looks correct.

Do you know why this happens and how to resolve it?

Recommended answer

This is probably a side effect of ifelse, which has a bad habit of changing the class of its return value. It does not carry over the attributes of its yes and no arguments, so a factor passed as yes comes back as its underlying integer codes rather than its labels. Try this instead:

fact_data[ !( Province %in% df ), Province := "Other" ] 
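To see the likely mechanism in isolation, here is a minimal base R sketch. The vector `province` and the set `keep` are made-up stand-ins for the Province column and the list of frequent provinces; they are not taken from the original data.

```r
# Hypothetical toy data; levels sort alphabetically: Ontario=1, Quebec=2, Yukon=3
province <- factor(c("Ontario", "Quebec", "Yukon", "Ontario"))
keep <- c("Ontario", "Quebec")

# ifelse() inserts the values of `yes` without its factor attributes,
# so the integer codes are used and then coerced to character by "Other"
bad <- ifelse(province %in% keep, province, "Other")
print(bad)   # "1" "2" "Other" "1"

# Converting the factor to character first keeps the labels
good <- ifelse(province %in% keep, as.character(province), "Other")
print(good)  # "Ontario" "Quebec" "Other" "Ontario"
```

This may also be why a column such as BranchNumber appeared to work: codes like "1" and "3" are hard to tell apart from labels that are themselves digits, so the result can look right even when the codes and labels do not match.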

Generally, I would recommend working with character vectors as data.table columns instead of factors whenever possible.
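As a rough sketch of that recommendation, the column can be converted to character once and then updated by reference. The table contents and the threshold of 2 below are illustrative assumptions standing in for the real fact_data and the 500 cutoff from the question.

```r
library(data.table)

# Hypothetical toy table; the real fact_data is not shown in the question
fact_data <- data.table(
  Province = factor(c("Ontario", "Ontario", "Quebec", "Quebec", "Yukon"))
)

# Convert the factor column to character once, by reference
fact_data[, Province := as.character(Province)]

# Provinces that meet the count threshold (2 here, 500 in the question)
keep <- fact_data[, .N, by = Province][N >= 2]$Province

# Relabel every other province in place; no ifelse() involved
fact_data[!(Province %in% keep), Province := "Other"]

print(fact_data[, .N, by = Province][order(-N)])
```

With a character column there are no factor levels to keep in sync, so assigning a new label such as "Other" is a plain string assignment.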
