Matching multiple date values in R
Question
I have the following data frame DF describing people who worked on a project starting on certain dates:
ID ProjectName StartDate
1 Health 3/1/06 18:20
2 Education 2/1/07 15:30
1 Education 5/3/09 9:00
3 Wellness 4/1/10 12:00
2 Health 6/1/11 14:20
The goal is to find the first project for each ID. For example, the expected output would be:
ID ProjectName StartDate
1 Health 3/1/06 18:20
2 Education 2/1/07 15:30
3 Wellness 4/1/10 12:00
So far I have done the following to get the first StartDate for each ID:
library(plyr)
sub <- ddply(DF, .(ID), summarise, st = min(as.POSIXct(StartDate, format = "%m/%d/%y %H:%M")))
After this, I need to match each row in sub with the original DF and extract the projects corresponding to that ID and StartDate. This can be done in a loop for each row in sub. However, my dataset is very large and I would like to know if there is an efficient way to do this matching and extract this subset from DF.
Recommended answer
This is fairly straightforward using match, because match returns:
a vector of the positions of first matches of its first argument in its second
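To make that behaviour concrete, here is a minimal sketch on a plain vector. The vector ids is a made-up stand-in for the ID column of DF, not data from the answer:

```r
# Hypothetical ID column, in the same order as the rows in the question
ids <- c(1, 2, 1, 3, 2)

# For each distinct ID, match() reports where it FIRST occurs in ids;
# later repeats of the same ID are ignored
first_pos <- match(unique(ids), ids)
first_pos
# [1] 1 2 4
```

Because match only ever reports the first hit, whatever order the rows are in decides which row is picked for each ID.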
So all you need to do is sort by date, then use unique to get one instance of each ID and match to find the position of its first occurrence. Thanks to @MatthewLunberg for providing a reproducible example of your data:
DF <- DF[ order(as.POSIXct(DF$StartDate, format="%m/%d/%y %H:%M")) , ]
DF[ match( unique( DF$ID ) , DF$ID ) , ]
# ID ProjectName StartDate
#6 1 Health 1/1/06 11:10
#2 2 Education 2/1/07 15:30
#4 3 Wellness 4/1/10 12:00
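The output above appears to come from the reproducible example, whose rows differ slightly from the table in the question. As a self-contained sketch, the same two steps can be run on the five rows shown in the question. The data.frame construction, the tz = "UTC" argument and the variable names start, sorted and first_projects are my own assumptions, and the dates are assumed to be month/day/year as in the format string above:

```r
# Rebuild the table from the question (assumed to be character columns)
DF <- data.frame(
  ID          = c(1, 2, 1, 3, 2),
  ProjectName = c("Health", "Education", "Education", "Wellness", "Health"),
  StartDate   = c("3/1/06 18:20", "2/1/07 15:30", "5/3/09 9:00",
                  "4/1/10 12:00", "6/1/11 14:20"),
  stringsAsFactors = FALSE
)

# Parse the dates explicitly; without a format these strings are not
# in a layout that as.POSIXct recognises
start <- as.POSIXct(DF$StartDate, format = "%m/%d/%y %H:%M", tz = "UTC")

# Sort by start date, then keep the first row seen for each ID
sorted <- DF[order(start), ]
first_projects <- sorted[match(unique(sorted$ID), sorted$ID), ]
first_projects$ProjectName
# [1] "Health"    "Education" "Wellness"
```

This avoids both the per-row loop and the separate summary table, since the sort and the lookup are each a single vectorised call.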
One advantage is that the result keeps the row numbers that the rows had in the original data frame before it was sorted. I do not know whether that is useful to you.
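As a rough illustration of how those retained row numbers might be used, the sketch below starts from a small, deliberately unsorted data frame (made up for this example) and maps the result back to positions in the original. It assumes DF has the default integer row names; with custom row names the as.integer conversion would not apply:

```r
# Hypothetical unsorted input: the earliest project is in the last row
DF <- data.frame(
  ID          = c(2, 1, 1),
  ProjectName = c("Health", "Education", "Health"),
  StartDate   = c("6/1/11 14:20", "5/3/09 9:00", "3/1/06 18:20"),
  stringsAsFactors = FALSE
)

start  <- as.POSIXct(DF$StartDate, format = "%m/%d/%y %H:%M", tz = "UTC")
sorted <- DF[order(start), ]
first_projects <- sorted[match(unique(sorted$ID), sorted$ID), ]

# Row names survive the sort, so they still point into the original DF
orig_rows <- as.integer(rownames(first_projects))
orig_rows
# [1] 3 1
```

Indexing the untouched DF with orig_rows returns the same projects, which can be handy if other objects are aligned with the original row order.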