Take dates from one dataframe and filter data in another dataframe - R


Question

I have two data frames:

user <- c(rep('A', 7), rep('B', 8))
data <- 1:15
date <- seq(as.Date('2016-01-01'), by = 'day', length.out = 15)
df <- data.frame(user, date, data)

> df
   user       date data
1     A 2016-01-01    1
2     A 2016-01-02    2
3     A 2016-01-03    3
4     A 2016-01-04    4
5     A 2016-01-05    5
6     A 2016-01-06    6
7     A 2016-01-07    7
8     B 2016-01-08    8
9     B 2016-01-09    9
10    B 2016-01-10   10
11    B 2016-01-11   11
12    B 2016-01-12   12
13    B 2016-01-13   13
14    B 2016-01-14   14
15    B 2016-01-15   15

df1 <- data.frame(user = c('A', 'B'), start_date = as.Date(c('2016-01-02', '2016-01-10')), end_date = as.Date(c('2016-01-06', '2016-01-14')))
> df1
  user start_date   end_date
1    A 2016-01-02 2016-01-06
2    B 2016-01-10 2016-01-14

I want to take the start date and end date from df1 and use them to filter the records in df by its date column. For each user, only the rows whose date falls between that user's start_date and end_date in df1 (both bounds included) should be kept. The resulting data frame should look like this:

user       date data
   A 2016-01-02    2
   A 2016-01-03    3
   A 2016-01-04    4
   A 2016-01-05    5
   A 2016-01-06    6
   B 2016-01-10   10
   B 2016-01-11   11
   B 2016-01-12   12
   B 2016-01-13   13
   B 2016-01-14   14

I tried the following:

Looping over each user and subsetting df to that user's rows, then filtering the subset with the start_date and end_date of the corresponding entry in df1, and finally appending it to a new data frame. This takes a very long time because the data is very large. Is there a more efficient way to do this?

Thanks

Recommended answer

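library(dplyr)
# Join each user's window onto df, keep dates inside it (inclusive), drop the helper columns
result <- df %>%
  left_join(df1, by = "user") %>%
  filter(date >= start_date & date <= end_date) %>%
  select(user, date, data)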
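
For readers who would rather not depend on dplyr, the same join-then-filter idea can be sketched in base R. This is an illustrative alternative, not the answer's own code. The names merged, in_window and result_base are made up for the example, and the sample data is rebuilt from the question so that the block runs on its own.

```r
# Sample data, rebuilt from the question so the block is self-contained.
df <- data.frame(
  user = c(rep("A", 7), rep("B", 8)),
  date = seq(as.Date("2016-01-01"), by = "day", length.out = 15),
  data = 1:15
)
df1 <- data.frame(
  user       = c("A", "B"),
  start_date = as.Date(c("2016-01-02", "2016-01-10")),
  end_date   = as.Date(c("2016-01-06", "2016-01-14"))
)

# Attach each user's window to that user's rows (inner join on user).
# "merged" is a hypothetical name chosen for this sketch.
merged <- merge(df, df1, by = "user")

# Keep the rows whose date falls inside the window, bounds included.
in_window <- merged$date >= merged$start_date & merged$date <= merged$end_date
result_base <- merged[in_window, c("user", "date", "data")]

# merge() does not promise to keep the original row order, so sort explicitly
# and reset the row names left over from the subsetting step.
result_base <- result_base[order(result_base$user, result_base$date), ]
rownames(result_base) <- NULL

print(result_base)
```

On the sample data this should leave the ten rows listed in the question: data values 2 to 6 for user A and 10 to 14 for user B. Both this sketch and the dplyr version replace the per-user loop with a single join followed by one vectorised comparison, which is usually where the speed-up comes from.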
