Vectorizing "expansion of date range" per row in dplyr of R
Problem description
I have a dataset stored as a tibble in R, like the one below:
# A tibble: 50,045 x 5
ref_key start_date end_date
<chr> <date> <date>
1 123 2010-01-08 2010-01-13
2 123 2010-01-21 2010-01-23
3 123 2010-03-10 2010-04-14
I need to create another tibble in which each row stores only one date, like the one below:
ref_key date
<chr> <date>
1 123 2010-01-08
2 123 2010-01-09
3 123 2010-01-10
4 123 2010-01-11
5 123 2010-01-12
6 123 2010-01-13
7 123 2010-01-21
8 123 2010-01-22
9 123 2010-01-23
Currently I am using an explicit loop for this, like the one below:
for (loop in (1:nrow(input.df))) {
  if (loop %% 100 == 0) {
    print(paste(loop, '/', nrow(input.df)))
  }
  temp.df.st00 <- input.df[loop, ] %>% data.frame
  temp.df.st01 <- tibble(ref_key = temp.df.st00[, 'ref_key'],
                         date = seq(temp.df.st00[, 'start_date'],
                                    temp.df.st00[, 'end_date'], 1))
  if (loop == 1) {
    output.df <- temp.df.st01
  } else {
    output.df <- output.df %>%
      bind_rows(temp.df.st01)
  }
}
It works, but slowly: with more than 50k rows to process, the loop takes a few minutes to finish.
I wonder if this step can be vectorized. Is it something related to row_wise in dplyr?
Recommended answer
We create a row name column (rownames_to_column), then nest by 'rn' and 'ref_key', mutate by taking the sequence from 'start_date' to 'end_date' within map, and unnest after using select to drop the unwanted columns.
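library(tidyverse)

res <- df1 %>%
  rownames_to_column('rn') %>%          # keep each input row distinct
  nest(data = -c(rn, ref_key)) %>%      # tidyr >= 1.0 form of nest(-rn, -ref_key)
  mutate(date = map(data, ~ seq(.x$start_date, .x$end_date, by = "1 day"))) %>%
  select(-data, -rn) %>%
  unnest(cols = date)                   # tidyr >= 1.0 expects cols to be named

head(res, 9)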
# ref_key date
#1 123 2010-01-08
#2 123 2010-01-09
#3 123 2010-01-10
#4 123 2010-01-11
#5 123 2010-01-12
#6 123 2010-01-13
#7 123 2010-01-21
#8 123 2010-01-22
#9 123 2010-01-23
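
As a possible variation on the answer above, the nesting step can be skipped by mapping over the two date columns in parallel with map2. This is only a sketch under assumptions: the three-row df1 below is a stand-in built from the rows shown in the question, res2 is a name chosen here for illustration, and no timing was measured on the real 50k-row data. It should still avoid the repeated bind_rows calls that make the original loop slow.

```r
library(dplyr)
library(purrr)
library(tidyr)

# Stand-in data copied from the three rows shown in the question (assumption:
# the real tibble has the same three columns with Date-typed start/end).
df1 <- tibble(
  ref_key    = c("123", "123", "123"),
  start_date = as.Date(c("2010-01-08", "2010-01-21", "2010-03-10")),
  end_date   = as.Date(c("2010-01-13", "2010-01-23", "2010-04-14"))
)

res2 <- df1 %>%
  # Build one daily sequence per row and store it in a list column
  mutate(date = map2(start_date, end_date, ~ seq(.x, .y, by = "1 day"))) %>%
  # Keep only the key and the list column of dates
  select(ref_key, date) %>%
  # Expand the list column so that each date gets its own row
  unnest(cols = date)

head(res2, 9)
```

The result has one row per day in each range, in the same order as the input rows, so duplicate ref_key values are handled without needing the extra row name column.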