如何有条件地合并R中的两个数据框(公共列,条件) [英] How to merge two dataframes in R conditionally (common column, condition)

查看:139
本文介绍了如何有条件地合并R中的两个数据框(公共列,条件)的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我已经尝试了2-3天,但是仍然找不到答案.我想做的是,我有两个数据框x,y(在下面的示例中给出)

I have been trying to do this for like 2-3 days but still could not find the answer. What I want to do is I have two dataframes x,y (given below a sample of them)

X
     Response.No Tab.No Survey.Date AC.Name Mandal.Name Village.Name
1         9530      1  2015-05-26      NA          NA           NA
2         6702      1  2015-05-30      NA          NA           NA
3        26744      1  2015-05-31      NA          NA           NA
4         8925      1  2015-06-03      NA          NA           NA
5        20242      1  2015-06-04      NA          NA           NA
6        21316      1  2015-06-04      NA          NA           NA
7        28056      1  2015-06-04      NA          NA           NA
8        12661      1  2015-06-05      NA          NA           NA
9        17187      1  2015-06-05      NA          NA           NA
10       28795      1  2015-06-05      NA          NA           NA

Y
     AC.Name   Mandal.Name      Village.Name Tab.No Survey.Start.Date Survey.End.Date
1  Nandigama Chanderlapadu        Punnavalli      1        2015-05-23      2015-05-27
2  Nandigama Chanderlapadu        Kasarabada      1        2015-05-30      2015-06-07
3  Nandigama Chanderlapadu     Kodavatikallu      1        2015-06-09      2015-06-28
4  Nandigama Chanderlapadu        Thurlapadu      1        2015-06-29      2015-07-13
5  Nandigama Chanderlapadu     Chanderlapadu      1        2015-07-14      2015-07-25
6  Nandigama Chanderlapadu            Popuru      2        2015-05-23      2015-05-27
7  Nandigama Chanderlapadu        Kandrapadu      2        2015-05-30      2015-06-08
8  Nandigama Chanderlapadu Vibhareethalapadu      3        2015-05-30      2015-06-04
9  Nandigama Chanderlapadu             Eturu      3        2015-06-10      2015-06-23
10 Nandigama Chanderlapadu      Bobbillapadu      3        2015-06-26      2015-07-03

即我要按Tab.No列匹配x和y,但还要确保x $ Survey.Date位于y $ Survey.Start.Date和y $ Survey.End.Date之间.如果两个条件都不满足,则该行必须具有N.A值.我已经尝试并搜索了Google Stackoverflow和R-Studio帮助,但无法获得所需的结果.

i.e I want to match the x and y by column Tab.No but also make sure that x$Survey.Date lies between y$Survey.Start.Date and y$Survey.End.Date. And if both conditions are not satisfied the row must have N.A values . I've tried and searched google stackoverflow and R-Studio help but was unable to get the desired result.

Z
     Response.No Tab.No Survey.Date AC.Name      Mandal.Name   Village.Name
1         9530      1  2015-05-26      Nandigama Chanderlapadu Punnavalli
2         6702      1  2015-05-30      Nandigama Chanderlapadu Kasarabada
3        26744      1  2015-05-31      Nandigama Chanderlapadu Kasarabada
4         8925      1  2015-06-03      Nandigama Chanderlapadu Kasarabada
5        20242      1  2015-06-04      Nandigama Chanderlapadu Kasarabada
6        21316      1  2015-06-04      Nandigama Chanderlapadu Kasarabada
7        28056      1  2015-06-04      Nandigama Chanderlapadu Kasarabada
8        12661      1  2015-06-05      Nandigama Chanderlapadu Kasarabada
9        17187      1  2015-06-05      Nandigama Chanderlapadu Kasarabada
10       28795      1  2015-06-05      Nandigama Chanderlapadu Kasarabada

我已经检查过: 1. 如何基于匹配列和范围内的两个条件在R中合并两个数据框? 2. 通过开始/结束窗口进行滚动加入 3. R中的条件合并/替换

I've already checked : 1. How to merge two dataframes in R based on two conditions, matching column and within a range? 2. roll join with start/end window 3. Conditional merge/replacement in R

我一直在尝试使用merge(),cbind()和match()解决此问题,但无济于事. 我可以只使用序列但不使用日期条件进行合并.

I've been trying to solve this using merge(), cbind() and match() to no avail. I am able to merge using only serial but without the date condition.

感谢您的帮助

推荐答案

以下是使用dplyr的方法.

inner_join(X[,1:3],Y, by=c("Tab.No"))%>%
mutate(AC.Name = ifelse(Survey.Date>=Survey.Start.Date & Survey.Date<=Survey.End.Date, AC.Name ,NA),
Mandal.Name = ifelse(Survey.Date>=Survey.Start.Date & Survey.Date<=Survey.End.Date, Mandal.Name ,NA),
Village.Name = ifelse(Survey.Date>=Survey.Start.Date & Survey.Date<=Survey.End.Date, Village.Name ,NA))%>%
group_by(Tab.No)%>%
filter(!is.na(AC.Name)|n()==1)%>%
select(Response.No,Tab.No,Survey.Date,AC.Name,Mandal.Name,Village.Name)

结果

   Response.No Tab.No Survey.Date   AC.Name   Mandal.Name Village.Name
         (int)  (int)      (date)     (chr)         (chr)        (chr)
1         9530      1  2015-05-26 Nandigama Chanderlapadu   Punnavalli
2         6702      1  2015-05-30 Nandigama Chanderlapadu   Kasarabada
3        26744      1  2015-05-31 Nandigama Chanderlapadu   Kasarabada
4         8925      1  2015-06-03 Nandigama Chanderlapadu   Kasarabada
5        20242      1  2015-06-04 Nandigama Chanderlapadu   Kasarabada
6        21316      1  2015-06-04 Nandigama Chanderlapadu   Kasarabada
7        28056      1  2015-06-04 Nandigama Chanderlapadu   Kasarabada
8        12661      1  2015-06-05 Nandigama Chanderlapadu   Kasarabada
9        17187      1  2015-06-05 Nandigama Chanderlapadu   Kasarabada
10       28795      1  2015-06-05 Nandigama Chanderlapadu   Kasarabada

数据

X<-read.table(text="     Response.No Tab.No Survey.Date AC.Name Mandal.Name Village.Name
9530      1  2015-05-26      NA          NA           NA
6702      1  2015-05-30      NA          NA           NA
26744      1  2015-05-31      NA          NA           NA
8925      1  2015-06-03      NA          NA           NA
20242      1  2015-06-04      NA          NA           NA
21316      1  2015-06-04      NA          NA           NA
28056      1  2015-06-04      NA          NA           NA
12661      1  2015-06-05      NA          NA           NA
17187      1  2015-06-05      NA          NA           NA
28795      1  2015-06-05      NA          NA           NA
", header=T,stringsAsFactors =F)

Y<-read.table(text="AC.Name   Mandal.Name      Village.Name Tab.No Survey.Start.Date Survey.End.Date
Nandigama Chanderlapadu        Punnavalli      1        2015-05-23      2015-05-27
Nandigama Chanderlapadu        Kasarabada      1        2015-05-30      2015-06-07
Nandigama Chanderlapadu     Kodavatikallu      1        2015-06-09      2015-06-28
Nandigama Chanderlapadu        Thurlapadu      1        2015-06-29      2015-07-13
Nandigama Chanderlapadu     Chanderlapadu      1        2015-07-14      2015-07-25
Nandigama Chanderlapadu            Popuru      2        2015-05-23      2015-05-27
Nandigama Chanderlapadu        Kandrapadu      2        2015-05-30      2015-06-08
Nandigama Chanderlapadu Vibhareethalapadu      3        2015-05-30      2015-06-04
Nandigama Chanderlapadu             Eturu      3        2015-06-10      2015-06-23
Nandigama Chanderlapadu      Bobbillapadu      3        2015-06-26      2015-07-03
", header=T,stringsAsFactors =F)

X$Survey.Date <-as.Date(X$Survey.Date)
Y$Survey.Start.Date <-as.Date(Y$Survey.Start.Date)
Y$Survey.End.Date <-as.Date(Y$Survey.End.Date)

这篇关于如何有条件地合并R中的两个数据框(公共列,条件)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆