How to match/merge data from two different files in R?
Problem description
I have two files (file1.csv and file2.csv). As shown below, file1 contains two columns, date and a variable x1, with 365 observations (a whole year). file2 contains the same date column as file1 plus many other variables. I'm interested only in variable x45, which has just 24 observations (2 per month).
file1
date x1
1/01/2005 33
2/01/2005 24
3/01/2005 72
31/12/2005 52
file2
date x2 x3 x45
1/01/2005 115
5/02/2005 125
13/04/2005 127
31/12/2005 138
So I'd like to add column x45 to file1.csv so that it looks like this:
date x1 x45
1/01/2005 33 115
2/01/2005 24 NA
3/01/2005 72 NA
31/12/2005 52 138
I tried using:
file1= read.csv("D:/file1.csv")
file2= read.csv("D:/file2.csv")
file3 = merge(file1, file2)
However, file3 has only 24 rows (observations) and omits the remaining observations from file1.
Any help in getting the result described above would be much appreciated.
Recommended answer
You can try left_join from dplyr:
library(dplyr)
left_join(df1, df2[c('date', 'x45')], by='date')
# date x1 x45
#1 1/01/2005 33 115
#2 2/01/2005 24 NA
#3 3/01/2005 72 NA
#4 31/12/2005 52 138
Or use merge with all.x=TRUE, which keeps every row of df1:
merge(df1, df2[c('date', 'x45')], all.x=TRUE)
# date x1 x45
#1 1/01/2005 33 115
#2 2/01/2005 24 NA
#3 3/01/2005 72 NA
#4 31/12/2005 52 138
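The merge(file1, file2) call in the question returned only 24 rows because merge defaults to all = FALSE, which is an inner join: only dates present in both data frames survive. A minimal sketch of the difference follows. The names small1 and small2 are illustrative stand-ins for the two files, not objects from the question.

```r
# Stand-in data: small1 plays the role of file1, small2 of file2 (hypothetical names)
small1 <- data.frame(date = c("1/01/2005", "2/01/2005", "3/01/2005", "31/12/2005"),
                     x1 = c(33L, 24L, 72L, 52L),
                     stringsAsFactors = FALSE)
small2 <- data.frame(date = c("1/01/2005", "5/02/2005", "13/04/2005", "31/12/2005"),
                     x45 = c(115L, 125L, 127L, 138L),
                     stringsAsFactors = FALSE)

inner <- merge(small1, small2)                # default all = FALSE: matching dates only
left  <- merge(small1, small2, all.x = TRUE)  # keep every row of small1, fill NA

nrow(inner)  # 2
nrow(left)   # 4
```

With the real files, the same logic would give 24 rows for the default call and 365 rows with all.x = TRUE.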
Update
left_join from dplyr and join from plyr keep the original row order. If you need to keep the order with merge, one option is to create an "indx" column in "df1" and, after the merge, use "indx" to restore the original order:
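df1$indx <- seq_len(nrow(df1))
res <- merge(df1, df2[c('date', 'x45')], all.x=TRUE)
# order by the indx carried through the merge, then drop that helper column
res[order(res$indx), names(res) != 'indx']
# date x1 x45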
#1 1/01/2005 33 115
#2 2/01/2005 24 NA
#3 3/01/2005 72 NA
#4 31/12/2005 52 138
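Another base R option that never reorders the first data frame is a lookup with match. This is only a sketch: it assumes each date appears at most once in the second data frame (match returns the first hit only), and the names left_df and lookup_df are hypothetical copies of the example data.

```r
# Hypothetical copies of the example data, so the objects above stay untouched
left_df <- data.frame(date = c("1/01/2005", "2/01/2005", "3/01/2005", "31/12/2005"),
                      x1 = c(33L, 24L, 72L, 52L),
                      stringsAsFactors = FALSE)
lookup_df <- data.frame(date = c("1/01/2005", "5/02/2005", "13/04/2005", "31/12/2005"),
                        x45 = c(115L, 125L, 127L, 138L),
                        stringsAsFactors = FALSE)

# For each date in left_df, find its row position in lookup_df (NA if absent)
pos <- match(left_df$date, lookup_df$date)

# Indexing with NA yields NA, so unmatched dates get NA in the new column
left_df$x45 <- lookup_df$x45[pos]
left_df
```

Because the new column is assigned in place, the rows of left_df stay exactly where they were, so no helper index is needed.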
Or use plyr:
library(plyr)
join(df1, df2[c('date', 'x45')], by='date', type='left')
Data
df1 <- structure(list(date = c("1/01/2005", "2/01/2005", "3/01/2005",
"31/12/2005"), x1 = c(33L, 24L, 72L, 52L)), .Names = c("date",
"x1"), class = "data.frame", row.names = c(NA, -4L))
df2 <- structure(list(date = c("1/01/2005", "5/02/2005", "13/04/2005",
"31/12/2005"), x2 = c(NA, NA, NA, NA), x3 = c(NA, NA, NA, NA),
x45 = c(115L, 125L, 127L, 138L)), .Names = c("date", "x2",
"x3", "x45"), class = "data.frame", row.names = c(NA, -4L))
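Note that date is stored as text in this data, so merge sorts the result as text rather than chronologically. If chronological order matters, one possibility is to convert the column to Date before merging. The format string "%d/%m/%Y" is an assumption based on the day/month/year samples shown in the question, and dated1 and dated2 are hypothetical names.

```r
# Hypothetical copies of the example data with the date column still as text
dated1 <- data.frame(date = c("1/01/2005", "2/01/2005", "3/01/2005", "31/12/2005"),
                     x1 = c(33L, 24L, 72L, 52L),
                     stringsAsFactors = FALSE)
dated2 <- data.frame(date = c("1/01/2005", "5/02/2005", "13/04/2005", "31/12/2005"),
                     x45 = c(115L, 125L, 127L, 138L),
                     stringsAsFactors = FALSE)

# Parse day/month/year text into Date objects (format is an assumption)
dated1$date <- as.Date(dated1$date, format = "%d/%m/%Y")
dated2$date <- as.Date(dated2$date, format = "%d/%m/%Y")

# Left join on the Date column; merge now sorts the rows chronologically
res_dated <- merge(dated1, dated2[c("date", "x45")], all.x = TRUE)
res_dated
```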