如何有条件地合并R中的两个数据框(公共列,条件) [英] How to merge two dataframes in R conditionally (common column, condition)
问题描述
我已经尝试了2-3天,但是仍然找不到答案.我想做的是,我有两个数据框x,y(在下面的示例中给出)
I have been trying to do this for like 2-3 days but still could not find the answer. What I want to do is I have two dataframes x,y (given below a sample of them)
X
Response.No Tab.No Survey.Date AC.Name Mandal.Name Village.Name
1 9530 1 2015-05-26 NA NA NA
2 6702 1 2015-05-30 NA NA NA
3 26744 1 2015-05-31 NA NA NA
4 8925 1 2015-06-03 NA NA NA
5 20242 1 2015-06-04 NA NA NA
6 21316 1 2015-06-04 NA NA NA
7 28056 1 2015-06-04 NA NA NA
8 12661 1 2015-06-05 NA NA NA
9 17187 1 2015-06-05 NA NA NA
10 28795 1 2015-06-05 NA NA NA
Y
AC.Name Mandal.Name Village.Name Tab.No Survey.Start.Date Survey.End.Date
1 Nandigama Chanderlapadu Punnavalli 1 2015-05-23 2015-05-27
2 Nandigama Chanderlapadu Kasarabada 1 2015-05-30 2015-06-07
3 Nandigama Chanderlapadu Kodavatikallu 1 2015-06-09 2015-06-28
4 Nandigama Chanderlapadu Thurlapadu 1 2015-06-29 2015-07-13
5 Nandigama Chanderlapadu Chanderlapadu 1 2015-07-14 2015-07-25
6 Nandigama Chanderlapadu Popuru 2 2015-05-23 2015-05-27
7 Nandigama Chanderlapadu Kandrapadu 2 2015-05-30 2015-06-08
8 Nandigama Chanderlapadu Vibhareethalapadu 3 2015-05-30 2015-06-04
9 Nandigama Chanderlapadu Eturu 3 2015-06-10 2015-06-23
10 Nandigama Chanderlapadu Bobbillapadu 3 2015-06-26 2015-07-03
即我要按Tab.No列匹配x和y,但还要确保x $ Survey.Date位于y $ Survey.Start.Date和y $ Survey.End.Date之间.如果两个条件都不满足,则该行必须具有N.A值.我已经尝试并搜索了Google Stackoverflow和R-Studio帮助,但无法获得所需的结果.
i.e I want to match the x and y by column Tab.No but also make sure that x$Survey.Date lies between y$Survey.Start.Date and y$Survey.End.Date. And if both conditions are not satisfied the row must have N.A values . I've tried and searched google stackoverflow and R-Studio help but was unable to get the desired result.
Z
Response.No Tab.No Survey.Date AC.Name Mandal.Name Village.Name
1 9530 1 2015-05-26 Nandigama Chanderlapadu Punnavalli
2 6702 1 2015-05-30 Nandigama Chanderlapadu Kasarabada
3 26744 1 2015-05-31 Nandigama Chanderlapadu Kasarabada
4 8925 1 2015-06-03 Nandigama Chanderlapadu Kasarabada
5 20242 1 2015-06-04 Nandigama Chanderlapadu Kasarabada
6 21316 1 2015-06-04 Nandigama Chanderlapadu Kasarabada
7 28056 1 2015-06-04 Nandigama Chanderlapadu Kasarabada
8 12661 1 2015-06-05 Nandigama Chanderlapadu Kasarabada
9 17187 1 2015-06-05 Nandigama Chanderlapadu Kasarabada
10 28795 1 2015-06-05 Nandigama Chanderlapadu Kasarabada
我已经检查过: 1. 如何基于匹配列和范围内的两个条件在R中合并两个数据框? 2. 通过开始/结束窗口进行滚动加入 3. R中的条件合并/替换
I've already checked : 1. How to merge two dataframes in R based on two conditions, matching column and within a range? 2. roll join with start/end window 3. Conditional merge/replacement in R
我一直在尝试使用merge(),cbind()和match()解决此问题,但无济于事. 我可以只使用序列但不使用日期条件进行合并.
I've been trying to solve this using merge(), cbind() and match() to no avail. I am able to merge using only serial but without the date condition.
感谢您的帮助
推荐答案
以下是使用dplyr
的方法.
inner_join(X[,1:3],Y, by=c("Tab.No"))%>%
mutate(AC.Name = ifelse(Survey.Date>=Survey.Start.Date & Survey.Date<=Survey.End.Date, AC.Name ,NA),
Mandal.Name = ifelse(Survey.Date>=Survey.Start.Date & Survey.Date<=Survey.End.Date, Mandal.Name ,NA),
Village.Name = ifelse(Survey.Date>=Survey.Start.Date & Survey.Date<=Survey.End.Date, Village.Name ,NA))%>%
group_by(Tab.No)%>%
filter(!is.na(AC.Name)|n()==1)%>%
select(Response.No,Tab.No,Survey.Date,AC.Name,Mandal.Name,Village.Name)
结果
Response.No Tab.No Survey.Date AC.Name Mandal.Name Village.Name
(int) (int) (date) (chr) (chr) (chr)
1 9530 1 2015-05-26 Nandigama Chanderlapadu Punnavalli
2 6702 1 2015-05-30 Nandigama Chanderlapadu Kasarabada
3 26744 1 2015-05-31 Nandigama Chanderlapadu Kasarabada
4 8925 1 2015-06-03 Nandigama Chanderlapadu Kasarabada
5 20242 1 2015-06-04 Nandigama Chanderlapadu Kasarabada
6 21316 1 2015-06-04 Nandigama Chanderlapadu Kasarabada
7 28056 1 2015-06-04 Nandigama Chanderlapadu Kasarabada
8 12661 1 2015-06-05 Nandigama Chanderlapadu Kasarabada
9 17187 1 2015-06-05 Nandigama Chanderlapadu Kasarabada
10 28795 1 2015-06-05 Nandigama Chanderlapadu Kasarabada
数据
X<-read.table(text=" Response.No Tab.No Survey.Date AC.Name Mandal.Name Village.Name
9530 1 2015-05-26 NA NA NA
6702 1 2015-05-30 NA NA NA
26744 1 2015-05-31 NA NA NA
8925 1 2015-06-03 NA NA NA
20242 1 2015-06-04 NA NA NA
21316 1 2015-06-04 NA NA NA
28056 1 2015-06-04 NA NA NA
12661 1 2015-06-05 NA NA NA
17187 1 2015-06-05 NA NA NA
28795 1 2015-06-05 NA NA NA
", header=T,stringsAsFactors =F)
Y<-read.table(text="AC.Name Mandal.Name Village.Name Tab.No Survey.Start.Date Survey.End.Date
Nandigama Chanderlapadu Punnavalli 1 2015-05-23 2015-05-27
Nandigama Chanderlapadu Kasarabada 1 2015-05-30 2015-06-07
Nandigama Chanderlapadu Kodavatikallu 1 2015-06-09 2015-06-28
Nandigama Chanderlapadu Thurlapadu 1 2015-06-29 2015-07-13
Nandigama Chanderlapadu Chanderlapadu 1 2015-07-14 2015-07-25
Nandigama Chanderlapadu Popuru 2 2015-05-23 2015-05-27
Nandigama Chanderlapadu Kandrapadu 2 2015-05-30 2015-06-08
Nandigama Chanderlapadu Vibhareethalapadu 3 2015-05-30 2015-06-04
Nandigama Chanderlapadu Eturu 3 2015-06-10 2015-06-23
Nandigama Chanderlapadu Bobbillapadu 3 2015-06-26 2015-07-03
", header=T,stringsAsFactors =F)
X$Survey.Date <-as.Date(X$Survey.Date)
Y$Survey.Start.Date <-as.Date(Y$Survey.Start.Date)
Y$Survey.End.Date <-as.Date(Y$Survey.End.Date)
这篇关于如何有条件地合并R中的两个数据框(公共列,条件)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!