Conditional joining data frames in R

Problem description

I have a fairly simple problem that I can't quite get right.

I have two data frames. The first contains just dates (every month over several years). The second also contains dates plus some other data, but only for the months in which the second variable changed. Like below:

df1 <- data.frame(Dates.1 = seq.Date(as.Date('1999/1/1'), as.Date('2001/5/1'), 'month'))

Dates.2 <- c(seq.Date(as.Date('1999/1/1'), as.Date('2001/5/1'), by = '5 months'))

Vals <- c(10, 20, 15, 44, 70, 50)

df2 <- data.frame(Dates.2, Vals)

What I need to do is join df1 and df2 so that each date in df1 gets the value of "Vals" from the most recent date in df2 that is less than or equal to it. The output should look like the one below (I would like to find a vectorized way to do it):

df3 <- cbind(df1,Vals3. = c(10,10,10,10,10,20,20,20,20,20,15,15,15,15,15,
                        44,44,44,44,44,70,70,70,70,70,50,50,50,50))

I've tried dplyr's joins and the fuzzyjoin package, but I couldn't get it to work properly (I'm a beginner in R). Of course, if anyone can come up with a solution using these packages, I'll be more than glad. Thanks!

Recommended answer

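A combination of dplyr and tidyr. The left join keeps every row of df1 and leaves Vals as NA for the months that have no match in df2; fill then carries the last non-missing value downward: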

library(dplyr)  # provides the %>% pipe; tidyr must be installed as well
dplyr::left_join(df1, df2, by = c(Dates.1 = "Dates.2")) %>%
  tidyr::fill(Vals, .direction = "down")

Result:

      Dates.1 Vals
1  1999-01-01   10
2  1999-02-01   10
3  1999-03-01   10
4  1999-04-01   10
5  1999-05-01   10
6  1999-06-01   20
7  1999-07-01   20
8  1999-08-01   20
9  1999-09-01   20
10 1999-10-01   20
(...)
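
To make the two steps of the join-then-fill idea visible, here is a minimal base R sketch of the same logic. It is not the answer's code: `merge` stands in for `left_join`, the cumulative-count trick stands in for `fill`, and the names `joined`, `known` and `grp` are only illustrative.

```r
# Rebuild the question's data so the sketch is self-contained
df1 <- data.frame(Dates.1 = seq.Date(as.Date("1999-01-01"), as.Date("2001-05-01"), by = "month"))
df2 <- data.frame(
  Dates.2 = seq.Date(as.Date("1999-01-01"), as.Date("2001-05-01"), by = "5 months"),
  Vals    = c(10, 20, 15, 44, 70, 50)
)

# Step 1: left join. Months that do not appear in df2 get NA in Vals.
joined <- merge(df1, df2, by.x = "Dates.1", by.y = "Dates.2", all.x = TRUE)
joined <- joined[order(joined$Dates.1), ]

# Step 2: carry the last non-missing value forward.
# grp counts how many non-missing values have been seen so far (0 before the first one).
known <- joined$Vals[!is.na(joined$Vals)]
grp   <- cumsum(!is.na(joined$Vals))
joined$Vals <- c(NA, known)[grp + 1]  # leading NA covers rows before the first known value

head(joined, 10)
```

The padding with a leading `NA` is there so that dates earlier than the first entry in df2 would stay missing instead of breaking the indexing; with the question's data that case does not occur.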

An alternative, by the way, is to skip creating df1 in the first place by using complete (from tidyr):

library(dplyr)  # provides the %>% pipe
tidyr::complete(df2, Dates.2 = seq.Date(as.Date('1999/1/1'), as.Date('2001/5/1'), by = 'month')) %>%
  tidyr::fill(Vals, .direction = "down")

This yields the same values, although the date column keeps the name Dates.2 instead of Dates.1.
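
Since the question asks for a vectorized approach, a package-free variant may also be worth sketching. The version below swaps the join for a direct lookup with base R's `findInterval`, assuming the dates in df2 are sorted in increasing order (as they are here). The names `idx` and `df3` are illustrative, and the result column is called Vals rather than the question's Vals3.

```r
# Rebuild the question's data so the sketch is self-contained
df1 <- data.frame(Dates.1 = seq.Date(as.Date("1999-01-01"), as.Date("2001-05-01"), by = "month"))
df2 <- data.frame(
  Dates.2 = seq.Date(as.Date("1999-01-01"), as.Date("2001-05-01"), by = "5 months"),
  Vals    = c(10, 20, 15, 44, 70, 50)
)

# For each date in df1, find the position of the latest df2 date that is <= it.
# findInterval requires its second argument to be sorted in non-decreasing order.
idx <- findInterval(as.numeric(df1$Dates.1), as.numeric(df2$Dates.2))

# A position of 0 means the date precedes every date in df2; leave those as NA.
idx[idx == 0] <- NA

df3 <- data.frame(Dates.1 = df1$Dates.1, Vals = df2$Vals[idx])
head(df3, 10)
```

This does the whole lookup in one vectorized call and avoids creating the intermediate NA rows, at the cost of relying on df2 being sorted by date.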
