R - Replace specific value contents with NA


Question

I have a fairly large data frame that has multiple "-" values which represent missing data. The data frame was assembled from multiple Excel files, for which I could not use "na.strings =" or an alternative function, so I had to import them with the "-" representation.

How can I replace all "-" in the data frame with NA / missing values? The data frame consists of 200 columns of characters, factors, and integers.

So far I have tried:

sum(df %in% c("-"))
returns: [1] 0

df[df=="-"] <-NA #does not do anything

library(plyr)
df <- revalue(df, c("-",NA))
returns: Error in revalue(tmp, c("-", NA)) : 
  x is not a factor or a character vector.

library(anchors)
df <- replace.value(df,colnames(df),"-",as.character(NA))
Error in charToDate(x) : 
  character string is not in a standard unambiguous format

The data frame consists of 200 columns of characters, factors, and integers, so I can see why the last two do not work correctly. Any help would be appreciated.
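
For reference, here is a minimal base R sketch of why the first attempt reports zero. The data frame `raw` and its columns are hypothetical stand-ins for the real data. `%in%` treats a data frame as a list of columns and compares each whole column with "-", whereas `==` compares cell by cell.

```r
# Hypothetical data: "-" marks a missing value in character and factor columns.
raw <- data.frame(
  id    = 1:4,
  name  = c("a", "-", "c", "-"),
  group = factor(c("x", "-", "-", "y")),
  stringsAsFactors = FALSE
)

# %in% matches each whole column (converted to one string) against "-",
# so no column counts as a match.
n_in <- sum(raw %in% c("-"))

# == compares every cell and returns a logical matrix, so this counts cells.
n_cells <- sum(raw == "-", na.rm = TRUE)

print(c(n_in = n_in, n_cells = n_cells))
```

If the cell-wise count is also zero on the real data, the placeholder is probably not a bare "-" (for example, it may carry surrounding whitespace), which is worth checking before replacing anything.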

Recommended answer

Since you're already using tidyverse functions, you can easily use na_if from dplyr within your pipes.

For example, I have a dataset where 999 is used to fill in a non-answer:

library(dplyr)  # provides tibble(), %>%, mutate() and na_if()

df <- tibble(
    alpha = c("a", "b", "c", "d", "e"), 
    val1 = c(1, 999, 3, 8, 999), 
    val2 = c(2, 8, 999, 1, 2))

If I wanted to change val1 so 999 is NA, I could do:

df %>% 
    mutate(val1 = na_if(val1, 999))

In your case, it sounds like you want to replace a value across multiple variables, so using mutate_at or mutate_if would be more appropriate:

df %>%
    mutate_at(vars(val1, val2), na_if, 999)

This replaces all instances of 999 in both val1 and val2 with NA, and the result looks like this:

# A tibble: 5 x 3
  alpha  val1  val2
  <chr> <dbl> <dbl>
1 a        1.    2.
2 b       NA     8.
3 c        3.   NA 
4 d        8.    1.
5 e       NA     2.
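
Applied to the "-" placeholder from the question, the same idea might look like the sketch below. It assumes dplyr 1.0.0 or later, where `across()` supersedes `mutate_at()` and `mutate_if()`. The tibble `df_dash` and its columns are hypothetical, and converting factors to character first is a simplifying assumption, not something the question requires.

```r
library(dplyr)

# Hypothetical data: "-" marks a missing value in character and factor columns.
df_dash <- tibble(
  id    = 1:4,
  name  = c("a", "-", "c", "-"),
  group = factor(c("x", "-", "-", "y"))
)

cleaned <- df_dash %>%
  # Convert factors to character so every text column is handled the same way.
  mutate(across(where(is.factor), as.character)) %>%
  # Replace "-" with NA in every character column; integer columns are untouched.
  mutate(across(where(is.character), ~ na_if(.x, "-")))

print(cleaned)
```

Selecting columns by type with `where()` avoids listing 200 column names by hand. If the factor columns must stay factors, they can be converted back afterwards.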
