R: Create a New Column in R to Determine Semester Based on Two Dates

Question

I have some data with an ID and a date, and I'm trying to create a new field for the semester.

df:

id  date
1   20160822
2   20170109
3   20170828
4   20170925
5   20180108
6   20180402
7   20160711
8   20150831
9   20160111
10  20160502
11  20160829
12  20170109
13  20170501

I also have a semester table:

start       end         season_year
20120801    20121222    Fall-2012
20121223    20130123    Winter-2013
20130124    20130523    Spring-2013
20130524    20130805    Summer-2013
20130806    20131228    Fall-2013
20131229    20140122    Winter-2014
20140123    20140522    Spring-2014
20140523    20140804    Summer-2014
20140805    20141227    Fall-2014
20141228    20150128    Winter-2015
20150129    20150528    Spring-2015
20150529    20150803    Summer-2015
20150804    20151226    Fall-2015
20151227    20160127    Winter-2016
20160128    20160526    Spring-2016
20160527    20160801    Summer-2016
20160802    20161224    Fall-2016
20161225    20170125    Winter-2017
20170126    20170525    Spring-2017
20170526    20170807    Summer-2017
20170808    20171230    Fall-2017
20171231    20180124    Winter-2018
20180125    20180524    Spring-2018
20180525    20180806    Summer-2018
20180807    20181222    Fall-2018
20181223    20190123    Winter-2019
20190124    20190523    Spring-2019
20190524    20180804    Summer-2019

I'd like to create a new field in df: if df$date is between semester$start and semester$end, then place the respective value of semester$season_year in df.

I tried to see if the lubridate package could help, but that seems to be more for calculations.

I saw this question and it seems to be the closest to what I want, but, to make things more complicated, not all of our semesters are six months long.

Recommended answer

One solution is a non-equi update join with data.table. The dates are parsed with data.table's own as.IDate, so lubridate is not actually needed:

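library(data.table)

# Convert both data frames to data.tables by reference
setDT(df)
setDT(semester)

# Parse the integer yyyymmdd values into IDate
df[, date := as.IDate(as.character(date), format = "%Y%m%d")]
semester[, `:=`(start = as.IDate(as.character(start), format = "%Y%m%d"),
                end   = as.IDate(as.character(end),   format = "%Y%m%d"))]

# Non-equi update join: each df row whose date falls within a semester's
# [start, end] range receives that semester's season_year
df[semester, on = .(date >= start, date <= end), season_year := i.season_year]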

df
#     id       date season_year
#  1:  1 2016-08-22   Fall-2016
#  2:  2 2017-01-09 Winter-2017
#  3:  3 2017-08-28   Fall-2017
#  4:  4 2017-09-25   Fall-2017
#  5:  5 2018-01-08 Winter-2018
#  6:  6 2018-04-02 Spring-2018
#  7:  7 2016-07-11 Summer-2016
#  8:  8 2015-08-31   Fall-2015
#  9:  9 2016-01-11 Winter-2016
# 10: 10 2016-05-02 Spring-2016
# 11: 11 2016-08-29   Fall-2016
# 12: 12 2017-01-09 Winter-2017
# 13: 13 2017-05-01 Spring-2017
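
If you would rather avoid an extra package, the same range lookup can be sketched in base R with findInterval. This is an alternative, not part of the answer above. It assumes the semester ranges do not overlap, and it compares the raw yyyymmdd integers directly, which works because that format sorts in date order. The names df_small, sem_small and lookup_semester are hypothetical and exist only for this illustration.

```r
# Small subset of the question's data, kept as yyyymmdd integers
df_small <- data.frame(id = 1:3, date = c(20160822L, 20170109L, 20160711L))
sem_small <- data.frame(
  start = c(20160527L, 20160802L, 20161225L),
  end = c(20160801L, 20161224L, 20170125L),
  season_year = c("Summer-2016", "Fall-2016", "Winter-2017"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for each date, return the season_year of the range
# [start, end] that contains it, or NA if no range does
lookup_semester <- function(date, sem) {
  sem <- sem[order(sem$start), ]           # findInterval needs sorted breaks
  idx <- findInterval(date, sem$start)     # last row with start <= date; 0 if none
  idx[idx == 0] <- NA
  out <- sem$season_year[idx]
  out[!is.na(idx) & date > sem$end[idx]] <- NA  # date falls after that row's end
  out
}

df_small$season_year <- lookup_semester(df_small$date, sem_small)
print(df_small)
```

Dates outside every range come back as NA, which matches what the update join does for unmatched rows.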

Data:

df <- read.table(text="
id  date
1   20160822
2   20170109
3   20170828
4   20170925
5   20180108
6   20180402
7   20160711
8   20150831
9   20160111
10  20160502
11  20160829
12  20170109
13  20170501",
header = TRUE, stringsAsFactors = FALSE)


semester <- read.table(text="
start       end         season_year
20120801    20121222    Fall-2012
20121223    20130123    Winter-2013
20130124    20130523    Spring-2013
20130524    20130805    Summer-2013
20130806    20131228    Fall-2013
20131229    20140122    Winter-2014
20140123    20140522    Spring-2014
20140523    20140804    Summer-2014
20140805    20141227    Fall-2014
20141228    20150128    Winter-2015
20150129    20150528    Spring-2015
20150529    20150803    Summer-2015
20150804    20151226    Fall-2015
20151227    20160127    Winter-2016
20160128    20160526    Spring-2016
20160527    20160801    Summer-2016
20160802    20161224    Fall-2016
20161225    20170125    Winter-2017
20170126    20170525    Spring-2017
20170526    20170807    Summer-2017
20170808    20171230    Fall-2017
20171231    20180124    Winter-2018
20180125    20180524    Spring-2018
20180525    20180806    Summer-2018
20180807    20181222    Fall-2018
20181223    20190123    Winter-2019
20190124    20190523    Spring-2019
20190524    20180804    Summer-2019",
header = TRUE, stringsAsFactors = FALSE)
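
Note that, as listed, the Summer-2019 row has an end value (20180804) that is earlier than its start (20190524), so no date can ever match it in the join. The original post does not say what the intended end date is, so the value is left untouched here. A quick check along these lines would flag such rows before joining. The name sem_check is hypothetical, and only the last three rows of the table are used to keep the sketch short.

```r
# Last three rows of the semester table, exactly as posted
sem_check <- read.table(text = "
start       end         season_year
20181223    20190123    Winter-2019
20190124    20190523    Spring-2019
20190524    20180804    Summer-2019",
header = TRUE, stringsAsFactors = FALSE)

# Rows whose end precedes their start can never satisfy start <= date <= end
bad <- sem_check[sem_check$start > sem_check$end, ]
print(bad)
```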
