How to replace NA values in a table *for selected columns*? data.frame, data.table


Problem description


There are a lot of posts about replacing NA values. I am aware that one could replace NAs in the following table/frame with the following:

x[is.na(x)]<-0

But what if I want to restrict it to only certain columns? Let me show you an example.

First, let's start with a dataset.

set.seed(1234)
x <- data.frame(a=sample(c(1,2,NA), 10, replace=TRUE),
                b=sample(c(1,2,NA), 10, replace=TRUE),
                c=sample(c(1:5,NA), 10, replace=TRUE))

Which gives:

    a  b  c
1   1 NA  2
2   2  2  2
3   2  1  1
4   2 NA  1
5  NA  1  2
6   2 NA  5
7   1  1  4
8   1  1 NA
9   2  1  5
10  2  1  1

Ok, so I only want to restrict the replacement to columns 'a' and 'b'. My attempt was:

x[is.na(x), 1:2]<-0

and:

x[is.na(x[1:2])]<-0

Neither attempt works.

My data.table attempt, where y<-data.table(x), was obviously never going to work:

y[is.na(y[,list(a,b)]), ]

I want to pass the columns inside the is.na() call, but that obviously doesn't work.

I would like to do this in a data.frame and a data.table. My end goal is to recode the 1:2 to 0:1 in 'a' and 'b' while keeping 'c' the way it is, since it is not a logical variable. I have a bunch of columns so I don't want to do it one by one. And, I'd just like to know how to do this.

Do you have any suggestions?

Solution

You can do:

x[, 1:2][is.na(x[, 1:2])] <- 0

or better (IMHO), use the variable names:

x[c("a", "b")][is.na(x[c("a", "b")])] <- 0

In both cases, 1:2 or c("a", "b") can be replaced by a pre-defined vector.
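
To make the "pre-defined vector" idea concrete, here is a minimal sketch. The data frame is typed in by hand from the table in the question, and the names `cols`, `x_df` and `y` are only illustrative. The data.table part is not from the answer above: it is one common idiom (a loop over `data.table::set()`), shown as an assumption about what would fit the question, and it only runs if the data.table package is installed.

```r
# Rebuild the example data by hand (values copied from the table in the question)
x <- data.frame(
  a = c(1, 2, 2, 2, NA, 2, 1, 1, 2, 2),
  b = c(NA, 2, 1, NA, 1, NA, 1, 1, 1, 1),
  c = c(2, 2, 1, 1, 2, 5, 4, NA, 5, 1)
)

# Illustrative name: the columns whose NAs should be replaced with 0
cols <- c("a", "b")

# data.frame: same idiom as the answer, with the column vector factored out
x_df <- x
x_df[cols][is.na(x_df[cols])] <- 0

# data.table (assumed approach, not from the answer): update by reference,
# one column at a time, touching only the rows where that column is NA
if (requireNamespace("data.table", quietly = TRUE)) {
  y <- data.table::as.data.table(x)
  for (col in cols) {
    data.table::set(y, i = which(is.na(y[[col]])), j = col, value = 0)
  }
}
```

After this, 'a' and 'b' contain no NAs in either `x_df` or `y`, while the NA in 'c' is left alone. Recent versions of data.table also provide `setnafill()`, which may be worth checking in its documentation if you only need to fill numeric columns.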
