R: create a new data frame variable from subsets of two variables with missing data (NA)


Question

I have a simple example data frame with two data columns (data1 and data2) and two grouping variables (Measure1 and Measure2). Both grouping variables contain missing values (NA).

d <- data.frame(Measure1 = 1:2, Measure2 = 3:4, data1 = 1:10, data2 = 11:20) 
d$Measure1[4]=NA 
d$Measure2[8]=NA 
d

   Measure1 Measure2 data1 data2
1         1        3     1    11
2         2        4     2    12
3         1        3     3    13
4        NA        4     4    14
5         1        3     5    15
6         2        4     6    16
7         1        3     7    17
8         2       NA     8    18
9         1        3     9    19
10        2        4    10    20

I want to create a new variable (d$new) that contains data1, but only for rows where Measure1 equals 1. I tried the following and got an error:

d$new[d$Measure1 == 1] = d$data1[d$Measure1 == 1] 

Error in d$new[d$Measure1 == 1] = d$data1[d$Measure1 == 1] : NAs are not allowed in subscripted assignments

Next, I would like to add to d$new the values from data2, but only for rows where Measure2 equals 4. However, the missing values in Measure1 and Measure2 cause problems when subsetting the data and assigning it to the new variable. I can think of some overly complicated solutions, but I'm sure there's an easy way I'm not thinking of. Thanks for the help!

Recommended answer

Find the rows where Measure1 is not NA and equals the value you want.

measure1_notNA = which(!is.na(d$Measure1) & d$Measure1 == 1)
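The reason for the extra !is.na() check can be seen by inspecting the condition itself. The following is a minimal sketch on the question's data; the names raw_mask and safe_mask are illustrative and not part of the original answer.

```r
# Rebuild the example data from the question.
d <- data.frame(Measure1 = 1:2, Measure2 = 3:4, data1 = 1:10, data2 = 11:20)
d$Measure1[4] <- NA

# Comparing NA with == yields NA, so the raw condition is not a clean
# TRUE/FALSE mask. That NA is what the subscripted assignment rejects.
raw_mask <- d$Measure1 == 1
anyNA(raw_mask)    # TRUE, because row 4 is NA

# Guarding with !is.na() turns the NA into FALSE (FALSE & NA is FALSE).
safe_mask <- !is.na(d$Measure1) & d$Measure1 == 1
anyNA(safe_mask)   # FALSE

# which() keeps only the positions that are TRUE, so it also drops the NA.
which(raw_mask)    # 1 3 5 7 9
```

Because which() already discards NA positions, which(d$Measure1 == 1) selects the same rows as the guarded version; the explicit !is.na() mainly makes the intent visible.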

Initialize your new column with some default value.

d$new = NA

Replace only those rows with the corresponding values from the data1 column.

d$new[measure1_notNA] = d$data1[measure1_notNA]


Or, in one line:

d$new[d$Measure1 == 1 & !is.na(d$Measure1)] = d$data1[d$Measure1 == 1 & !is.na(d$Measure1)] 
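The answer above covers only the Measure1 part of the question. As a hedged sketch, the same pattern can be repeated for Measure2 and data2; the names rows1 and rows2 are illustrative, and the use of NA_integer_ as the default is an assumption made here to keep the column integer-typed.

```r
# Rebuild the example data from the question.
d <- data.frame(Measure1 = 1:2, Measure2 = 3:4, data1 = 1:10, data2 = 11:20)
d$Measure1[4] <- NA
d$Measure2[8] <- NA

# Start with an all-NA column so unmatched rows stay missing.
d$new <- NA_integer_

# which() returns only positions where the condition is TRUE,
# so rows with NA in the grouping variable are skipped.
rows1 <- which(d$Measure1 == 1)
rows2 <- which(d$Measure2 == 4)

# Fill from data1 where Measure1 is 1, then from data2 where Measure2 is 4.
d$new[rows1] <- d$data1[rows1]
d$new[rows2] <- d$data2[rows2]

d$new   # 1 12 3 14 5 16 7 NA 9 20
```

In this data no row satisfies both conditions. If one did, the second assignment would overwrite the first, so the order of the two assignments decides which source column wins.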
