Replace <NA> in a factor column


Question

I want to replace the <NA> values in a factor column with a valid value, but I cannot find a way to do it. The example below is only for demonstration; the original data comes from a foreign csv file that I have to deal with.

df <- data.frame(a=sample(0:10, size=10, replace=TRUE),
                 b=sample(20:30, size=10, replace=TRUE))
df[df$a==0,'a'] <- NA
df$a <- as.factor(df$a)

It looks like this:

      a  b
1     1 29
2     2 23
3     3 23
4     3 22
5     4 28
6  <NA> 24
7     2 21
8     4 25
9  <NA> 29
10    3 24

Now I want to replace the <NA> values with a number.

df[is.na(df$a), 'a'] <- 88
In `[<-.factor`(`*tmp*`, iseq, value = c(88, 88)) :
  invalid factor level, NA generated

I think I have missed a fundamental R concept about factors. Have I? I cannot understand why this doesn't work. I think "invalid factor level" means that 88 is not a valid level of that factor, right? So do I have to tell the factor column that there is another level?

Recommended answer

1) addNA If fac is a factor, addNA(fac) is the same factor but with NA added as a level. See ?addNA.

To force the NA level to be 88:

facna <- addNA(fac)
levels(facna) <- c(levels(fac), 88)

giving:

> facna
 [1] 1  2  3  3  4  88 2  4  88 3 
Levels: 1 2 3 4 88

1a) This can be written in a single line as follows:

`levels<-`(addNA(fac), c(levels(fac), 88))
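The same idea can be applied directly to the data frame column from the question. The following is a minimal sketch, assuming the column holds the values shown in the question's printed output (they are hard-coded here instead of sampled, so the result is reproducible). It registers "88" as a level first and only then assigns it, which is the step the failing assignment was missing.

```r
# Rebuild the question's data with fixed values (assumption: taken from the
# printed output in the question rather than from sample())
df <- data.frame(a = c(1, 2, 3, 3, 4, NA, 2, 4, NA, 3),
                 b = c(29, 23, 23, 22, 28, 24, 21, 25, 29, 24))
df$a <- as.factor(df$a)

# Register "88" as an additional level; a factor only accepts values
# that are already among its levels
levels(df$a) <- c(levels(df$a), "88")

# The assignment now succeeds without the "invalid factor level" warning
df$a[is.na(df$a)] <- "88"

levels(df$a)
table(df$a, useNA = "ifany")
```

Extending the levels before assigning keeps the existing codes untouched, so the other rows of the column are not affected.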

2) factor It can also be done in one line using the various arguments of factor like this:

factor(fac, levels = levels(addNA(fac)), labels = c(levels(fac), 88), exclude = NULL)

2a) or equivalently:

factor(fac, levels = c(levels(fac), NA), labels = c(levels(fac), 88), exclude = NULL)

3) ifelse Another approach is:

factor(ifelse(is.na(fac), 88, paste(fac)), levels = c(levels(fac), 88))
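As a quick sanity check, the base R approaches (1a), (2) and (3) can be run side by side on the same input. This is only a sketch: the names res1, res2 and res3 are introduced here for illustration, and fac is rebuilt with factor() so the snippet is self-contained.

```r
# Same values as the fac input used in this answer
fac <- factor(c(1, 2, 3, 3, 4, NA, 2, 4, NA, 3))

res1 <- `levels<-`(addNA(fac), c(levels(fac), 88))          # approach (1a)
res2 <- factor(fac, levels = levels(addNA(fac)),
               labels = c(levels(fac), 88), exclude = NULL)  # approach (2)
res3 <- factor(ifelse(is.na(fac), 88, paste(fac)),
               levels = c(levels(fac), 88))                  # approach (3)

# Compare the values and the levels of the three results;
# stopifnot() raises an error if any of them disagree
stopifnot(
  as.character(res1) == as.character(res2),
  as.character(res2) == as.character(res3),
  identical(levels(res1), levels(res2)),
  identical(levels(res2), levels(res3)),
  !anyNA(res1)
)
```

If the script finishes silently, the three approaches agree on this input.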

4) forcats The forcats package has a function for this:

library(forcats)

fct_explicit_na(fac, "88")
## [1] 1  2  3  3  4  88 2  4  88 3 
## Levels: 1 2 3 4 88
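In more recent forcats releases, fct_explicit_na() is deprecated in favour of fct_na_value_to_level(). A minimal sketch, assuming forcats 1.0.0 or later is installed (with an older version, the call above is the one to use):

```r
library(forcats)

# Same values as the fac input used in this answer
fac <- factor(c(1, 2, 3, 3, 4, NA, 2, 4, NA, 3))

# Assumes forcats >= 1.0.0; turns NA values into the explicit level "88"
fac88 <- fct_na_value_to_level(fac, level = "88")
fac88
```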

Note: We used the following input fac:

fac <- structure(c(1L, 2L, 3L, 3L, 4L, NA, 2L, 4L, NA, 3L), .Label = c("1", 
"2", "3", "4"), class = "factor")

Update: Improved (1) and added (1a). Later added (4).
