Replacing certain values in data.frame in R

Problem description

I am trying to replace the NAs in "test" with the forecast values in "forecast". I am trying to use match, but I can't figure it out. Keep in mind that id and time together form a two-part unique ID. Any suggestions? (Also keep in mind that my data set is much larger than this example: rows = 32000.)

test = data.frame(id =c(1,1,1,2,2,2), time=c(89,99,109,89,99,109), data=c(3,4,NA,5,2,NA))
forecast = data.frame(id =c(1,2), time=c(109,109), data=c(5,1))

Desired output

out = data.frame(id =c(1,1,1,2,2,2), time=c(89,99,109,89,99,109), data=c(3,4,5,5,2,1))

Recommended answer

Here is a data.table solution

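library(data.table)
test_dt <- data.table(test, key = c('id', 'time'))
forecast_dt <- data.table(forecast, key = c('id', 'time'))
# forecast_dt[test_dt] keeps every row of test; test's own values arrive as i.data
# (data.1 in older data.table releases)
forecast_dt[test_dt][, data := ifelse(is.na(data), i.data, data)]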
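
As an alternative sketch, not part of the original answer: assuming a data.table release recent enough to provide the `on=` argument and `fifelse()`, the NAs can also be filled by reference with an update join. This avoids building a joined copy and leaves no extra column to drop. The sample data is repeated so the snippet runs on its own.

```r
library(data.table)

test <- data.frame(id = c(1,1,1,2,2,2), time = c(89,99,109,89,99,109), data = c(3,4,NA,5,2,NA))
forecast <- data.frame(id = c(1,2), time = c(109,109), data = c(5,1))

test_dt <- as.data.table(test)
forecast_dt <- as.data.table(forecast)

# Update join: for rows of test_dt that match forecast_dt on (id, time),
# overwrite data only where it is NA; i.data is the forecast value.
test_dt[forecast_dt, on = .(id, time), data := fifelse(is.na(data), i.data, data)]
```

Rows of test without a matching forecast are left untouched, and the original row order is preserved because no key is set.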

EDIT. Benchmarking tests: data.table is roughly 2.6x faster than merge, even on this small dataset.

library(rbenchmark)
library(data.table)

f_merge <- function(){
  out2 <- merge(test, forecast, by = c("id", "time"), all.x = TRUE)
  out2 <- transform(out2, 
   newdata = ifelse(is.na(data.x), data.y, data.x), data.x = NULL, data.y = NULL)
  return(out2)
}

f_dtable <- function(){
  test <- data.table(test, key = c('id', 'time'))
  forecast <- data.table(forecast, key = c('id', 'time'))
  # test's own column arrives as i.data (data.1 in older data.table releases)
  test <- forecast[test][, data := ifelse(is.na(data), i.data, data)]
  test[, i.data := NULL]
  return(test)
}

benchmark(f_merge(), f_dtable(), order = 'relative', 
  columns = c('test', 'elapsed', 'relative'))

        test elapsed relative
2 f_dtable()    0.86     1.00
1  f_merge()    2.26     2.63
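
Since the question asks about match specifically, here is a minimal base R sketch of that route, not part of the original answer and not included in the benchmark above. It assumes the two-part ID can be collapsed into a single string key with paste; the names key_test, key_fc, idx, fill and out are illustrative only.

```r
test <- data.frame(id = c(1,1,1,2,2,2), time = c(89,99,109,89,99,109), data = c(3,4,NA,5,2,NA))
forecast <- data.frame(id = c(1,2), time = c(109,109), data = c(5,1))

# Build one key per row from the two-part ID (id, time).
key_test <- paste(test$id, test$time, sep = "_")
key_fc <- paste(forecast$id, forecast$time, sep = "_")

# Position of each test row in forecast (NA where no forecast exists).
idx <- match(key_test, key_fc)

# Fill only rows that are NA in test and have a matching forecast.
fill <- is.na(test$data) & !is.na(idx)
out <- test
out$data[fill] <- forecast$data[idx[fill]]
```

This needs no extra packages; whether it is competitive at 32000 rows would have to be measured.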
