“嵌套" R中的一个数据框 [英] "Unnesting" a dataframe in R

查看:147
本文介绍了“嵌套" R中的一个数据框的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有以下data.frame:

df <- data.frame(id=c(1,2,3), 
                 first.date=as.Date(c("2014-01-01", "2014-03-01", "2014-06-01")), 
                 second.date=as.Date(c("2015-01-01", "2015-03-01", "2015-06-1")),
                 third.date=as.Date(c("2016-01-01", "2017-03-01", "2018-06-1")),
                 fourth.date=as.Date(c("2017-01-01", "2018-03-01", "2019-06-1")))

> df

  id first.date second.date third.date fourth.date
1  1 2014-01-01  2015-01-01 2016-01-01  2017-01-01
2  2 2014-03-01  2015-03-01 2017-03-01  2018-03-01
3  3 2014-06-01  2015-06-01 2018-06-01  2019-06-01

每一行代表三个时间跨度;即时间跨度分别在first.datesecond.datesecond.datethird.datethird.datefourth.date之间.

Each row represents three timespans; i.e. the time spans between first.date and second.date, second.date and third.date, and third.date and fourth.date respectively.

我想在缺少一个更好的词的情况下,使数据框嵌套以获取此信息:

I would like to, in lack of a better word, unnest the dataframe to obtain this instead:

  id  StartDate    EndDate
1  1 2014-01-01 2015-01-01
2  1 2015-01-01 2016-01-01
3  1 2016-01-01 2017-01-01
4  2 2014-03-01 2015-03-01
5  2 2015-03-01 2017-03-01
6  2 2017-03-01 2018-03-01
7  3 2014-06-01 2015-06-01
8  3 2015-06-01 2018-06-01
9  3 2018-06-01 2019-06-01

我一直在使用tidyr包中的unnest函数,但是得出的结论是我认为这不是我真正想要的.

I have been playing around with the unnest function from the tidyr package, but I came to the conclusion that I don't think it's what I'm really looking for.

有什么建议吗?

推荐答案

您可以按以下方式尝试tidyr/dplyr:

You can try tidyr/dplyr as follows:

library(tidyr)
library(dplyr)
df %>% gather(DateType, StartDate, -id) %>% select(-DateType) %>% arrange(id) %>% group_by(id) %>% mutate(EndDate = lead(StartDate))

您可以通过添加以下内容来消除每个id组中的最后一行:

You can eliminate the last row in each id group by adding:

%>% slice(-4)

到上述管道.

这篇关于“嵌套" R中的一个数据框的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆