Merge data frames to eliminate missing observations

Problem description

I have two data frames. One (df1) contains all columns and rows of interest, but includes missing observations. The other (df2) includes values to be used in place of missing observations, and only includes columns and rows for which at least one NA was present in df1. I would like to merge the two data sets somehow to obtain the desired.result.

This seems like a very simple problem to solve, but I am drawing a blank. I cannot get merge to work. Maybe I could write nested for-loops, but have not done so yet. I also tried aggregate a few times. I am a little afraid to post this question, fearing my R card might be revoked. Sorry if this is a duplicate. I did search here and with Google fairly intensively. Thank you for any advice. A solution in base R is preferable.

df1 = read.table(text = "
  county year1 year2 year3
    aa     10    20   30
    bb      1    NA    3
    cc      5    10   NA
    dd    100    NA  200
", sep = "", header = TRUE)

df2 = read.table(text = "
  county year2 year3
    bb      2   NA
    cc     NA   15
    dd    150   NA
", sep = "", header = TRUE)

desired.result = read.table(text = "
  county year1 year2 year3
    aa     10    20   30
    bb      1     2    3
    cc      5    10   15
    dd    100   150  200
", sep = "", header = TRUE)

Recommended answer

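aggregate can do this. Merge the two data frames with all=TRUE so that every row from both is kept, then sum within each county while ignoring the NAs: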

aggregate(. ~ county,
          data=merge(df1, df2, all=TRUE), # Merged data, including NAs
          na.action=na.pass,              # Aggregate rows with missing values...
          FUN=sum, na.rm=TRUE)            # ...but instruct "sum" to ignore them.
##   county year2 year3 year1
## 1     aa    20    30    10
## 2     bb     2     3     1
## 3     cc    10    15     5
## 4     dd   150   200   100
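
One thing to keep in mind with the sum-based approach: sum(..., na.rm=TRUE) returns 0 when every value is NA, so a cell missing from both data frames would come back as 0 rather than NA, and a cell present in both with different values would be added together. If that matters, a possible alternative is to fill the gaps directly by matching on county. The following is only a sketch in base R, assuming county is unique in both data frames; the names filled, rows and gap are illustrative and not part of the original answer.

```r
df1 <- read.table(text = "
  county year1 year2 year3
    aa     10    20   30
    bb      1    NA    3
    cc      5    10   NA
    dd    100    NA  200
", header = TRUE)

df2 <- read.table(text = "
  county year2 year3
    bb      2   NA
    cc     NA   15
    dd    150   NA
", header = TRUE)

filled <- df1

# For each county in df1, the row of df2 holding its replacements (NA if absent).
rows <- match(df1$county, df2$county)

# Only the value columns that df2 actually provides.
for (col in setdiff(names(df2), "county")) {
  # Cells that are missing in df1 and whose county appears in df2.
  gap <- is.na(filled[[col]]) & !is.na(rows)
  filled[[col]][gap] <- df2[[col]][rows[gap]]
}

filled
```

This keeps the row and column order of df1 and leaves a cell as NA when df2 has no value for it either.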
