Conditional Replacing with NA in R (two dataframes)

Question

I have

    idx <- c(1397, 2000, 3409, 3415, 4077, 4445, 5021, 5155)

    idy <- c(1397, 2000, 2860, 3029, 3415, 3707, 4077, 4445, 5021, 5155,
             5251, 5560)

    agex <- c(NA, NA, NA, 35, NA, 62, 35, 46)

    agey <- c(3, 45, 0, 89, 7, 2, 13, 24, 58, 8, 3, 45)

    dat1 <- as.data.frame(cbind(idx, agex))
    dat2 <- as.data.frame(cbind(idy, agey))

Now I want agey to be set to NA wherever agex is NA and idx equals idy, so that dat2 becomes

       idy agey
  1    1397   NA
  2    2000   NA
  3    2860    0
  4    3029   89
  5    3415    7
  6    3707    2
  7    4077   NA
  8    4445   24
  9    5021   58
  10   5155    8
  11   5251    3
  12   5560   45

I have tried

ifelse(is.na(dat1$agex) | dat1$idx %in% dat2$idy, NA, dat2$agey)

It returns NAs at the correct indices, but the result has the length of idx (8) rather than the length of idy (12).

Recommended answer

I want whenever agex = NA, and idx = idy, that agey = NA

With a data.table update join...

library(data.table)
setDT(dat1); setDT(dat2)

dat2[dat1[is.na(agex)], on=.(idy = idx), agey := NA]

dat2

     idy agey
 1: 1397   NA
 2: 2000   NA
 3: 2860    0
 4: 3029   89
 5: 3415    7
 6: 3707    2
 7: 4077   NA
 8: 4445   24
 9: 5021   58
10: 5155    8
11: 5251    3
12: 5560   45

How it works

  • dat1[is.na(agex)] is the subset where agex is NA
  • DT[mDT, on=, j] is a join where rows of mDT are looked up in DT using on=
  • j is done in the joined subset of DT
  • when j is k := expr, column k of DT is updated
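
For comparison, the same replacement can be sketched in base R without data.table. This is not part of the original answer; it assumes ids are matched by exact value with %in%, and na_ids is a helper name introduced here for illustration. Because it assigns into dat2 directly, the result keeps all 12 rows, whereas ifelse returns a vector as long as its test (here the 8 rows of dat1).

```r
# Rebuild the example data from the question
idx  <- c(1397, 2000, 3409, 3415, 4077, 4445, 5021, 5155)
idy  <- c(1397, 2000, 2860, 3029, 3415, 3707, 4077, 4445, 5021, 5155,
          5251, 5560)
agex <- c(NA, NA, NA, 35, NA, 62, 35, 46)
agey <- c(3, 45, 0, 89, 7, 2, 13, 24, 58, 8, 3, 45)

dat1 <- data.frame(idx, agex)
dat2 <- data.frame(idy, agey)

# ids in dat1 whose age is missing (na_ids is a hypothetical helper name)
na_ids <- dat1$idx[is.na(dat1$agex)]

# Set agey to NA only in the rows of dat2 whose id appears in that set
dat2$agey[dat2$idy %in% na_ids] <- NA

dat2
```

The id 3409 has a missing agex but does not occur in idy, so it simply matches nothing in dat2.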
