Merge data frames to eliminate missing observations
Question
I have two data frames. One (df1) contains all columns and rows of interest, but includes missing observations. The other (df2) includes values to be used in place of missing observations, and only includes columns and rows for which at least one NA was present in df1. I would like to merge the two data sets somehow to obtain the desired.result.
This seems like a very simple problem to solve, but I am drawing a blank. I cannot get merge to work. Maybe I could write nested for-loops, but I have not done so yet. I also tried aggregate a few times. I am a little afraid to post this question, fearing my R card might be revoked. Sorry if this is a duplicate. I did search here and with Google fairly intensively. Thank you for any advice. A solution in base R is preferable.
df1 = read.table(text = "
county year1 year2 year3
aa 10 20 30
bb 1 NA 3
cc 5 10 NA
dd 100 NA 200
", sep = "", header = TRUE)
df2 = read.table(text = "
county year2 year3
bb 2 NA
cc NA 15
dd 150 NA
", sep = "", header = TRUE)
desired.result = read.table(text = "
county year1 year2 year3
aa 10 20 30
bb 1 2 3
cc 5 10 15
dd 100 150 200
", sep = "", header = TRUE)
Recommended answer
aggregate can do this:
aggregate(. ~ county,
data=merge(df1, df2, all=TRUE), # Merged data, including NAs
na.action=na.pass, # Aggregate rows with missing values...
FUN=sum, na.rm=TRUE) # ...but instruct "sum" to ignore them.
## county year2 year3 year1
## 1 aa 20 30 10
## 2 bb 2 3 1
## 3 cc 10 15 5
## 4 dd 150 200 100
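The sum-based approach above works here because each cell has at most one non-missing value across the two data frames. A cell that is NA in both would come out as 0 rather than NA, since sum(..., na.rm=TRUE) of nothing is 0. As an alternative, here is a minimal base R sketch that fills the NA cells of df1 directly from df2. It assumes that county is unique in df1 and that every county in df2 also appears in df1. The names filled, rows, col and current are illustrative and not part of the original answer.

```r
df1 <- read.table(text = "
county year1 year2 year3
aa 10 20 30
bb 1 NA 3
cc 5 10 NA
dd 100 NA 200
", header = TRUE)

df2 <- read.table(text = "
county year2 year3
bb 2 NA
cc NA 15
dd 150 NA
", header = TRUE)

filled <- df1

# Row positions in df1 that correspond to each row of df2
# (assumes every df2 county is present exactly once in df1).
rows <- match(df2$county, filled$county)

# For each value column of df2, keep the df1 value where it exists
# and take the df2 value only where df1 has NA.
for (col in setdiff(names(df2), "county")) {
  current <- filled[rows, col]
  filled[rows, col] <- ifelse(is.na(current), df2[[col]], current)
}

print(filled)
```

Unlike the aggregate call, this keeps the original column order of df1 and leaves a cell as NA when neither data frame supplies a value.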