Spread with duplicate identifiers (using tidyverse and %>%)
Question
I would like to spread the following data frame from long to wide format in tidyverse, using %>%-chaining.
df <-
structure(list(id = c(2L, 2L, 4L, 5L, 5L, 5L, 5L), start_end = structure(c(2L,
1L, 2L, 2L, 1L, 2L, 1L), .Label = c("end", "start"), class = "factor"),
date = structure(c(6L, 7L, 3L, 8L, 9L, 10L, 11L), .Label = c("1979-01-03",
"1979-06-21", "1979-07-18", "1989-09-12", "1991-01-04", "1994-05-01",
"1996-11-04", "2005-02-01", "2009-09-17", "2010-10-01", "2012-10-06"
), class = "factor")), .Names = c("id", "start_end", "date"
), row.names = c(3L, 4L, 7L, 8L, 9L, 10L, 11L), class = "data.frame")
What I have tried:
data.table::dcast( df, formula = id ~ start_end, value.var = "date", drop = FALSE ) # does not work because it summarises the data
tidyr::spread( df, start_end, date ) # does not work because of duplicate values
df$id2 <- 1:nrow(df)
tidyr::spread( df, start_end, date ) # does not work because the dataset now has too many rows.
These questions do not answer my question:
Using spread with duplicate identifiers for rows (because they summarise)
R: spread function on data frame with duplicates (because they paste the values together)
Reshaping data in R with "login" "logout" times (because not specifically asking for/answered using tidyverse and chaining)
Recommended answer
We can use tidyverse. After grouping by 'start_end' and 'id', create a sequence column 'ind', then spread from 'long' to 'wide' format. The 'ind' column gives every row a unique identifier, so spread no longer fails on the duplicates.
library(dplyr)
library(tidyr)
df %>%
  group_by(start_end, id) %>%
  mutate(ind = row_number()) %>%  # running index within each id/start_end pair
  spread(start_end, date) %>%     # 'ind' keeps repeated ids on separate rows
  ungroup() %>%
  select(id, start, end)
#     id      start        end
#* <int>     <fctr>     <fctr>
#1     2 1994-05-01 1996-11-04
#2     4 1979-07-18         NA
#3     5 2005-02-01 2009-09-17
#4     5 2010-10-01 2012-10-06
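The answer above uses spread(), which tidyr has since superseded with pivot_wider(). As a rough sketch only, assuming tidyr 1.0.0 or later is installed, the same idea can be written as below. The data frame is rebuilt compactly from the question's dput() output with character columns instead of factors, and the name `wide` is introduced here purely for illustration.

```r
library(dplyr)
library(tidyr)

# Same values as the question's data, rebuilt compactly.
# Assumption: columns are stored as character rather than factor.
df <- data.frame(
  id        = c(2L, 2L, 4L, 5L, 5L, 5L, 5L),
  start_end = c("start", "end", "start", "start", "end", "start", "end"),
  date      = c("1994-05-01", "1996-11-04", "1979-07-18", "2005-02-01",
                "2009-09-17", "2010-10-01", "2012-10-06"),
  stringsAsFactors = FALSE
)

wide <- df %>%
  group_by(id, start_end) %>%
  mutate(ind = row_number()) %>%  # 1, 2, ... within each id/start_end pair
  ungroup() %>%
  # id and ind together identify each output row, so no duplicates remain
  pivot_wider(names_from = start_end, values_from = date) %>%
  select(id, start, end)

print(wide)
```

This should give the same four rows as the spread() version, with NA in the end column for id 4. The helper column ind is still needed: without it, pivot_wider() would find two values for id 5 in each of start and end, and would warn and return list-columns instead of one row per start/end pair.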