Wide data frame with 4 columns to long data frame with 3 columns
Question
I have a data frame (sample below):
df = structure(list(
  Stage1yBefore = c("3.1", "1", "4", "2", "NA"),
  Stage2yBefore = c("NA", "2", "3.2", "2", "NA"),
  ClinicalActivity1yBefore = c(TRUE, TRUE, TRUE, TRUE, FALSE),
  ClinicalActivity2yBefore = c(FALSE, TRUE, TRUE, TRUE, FALSE)
), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -5L))
I would like to convert it to long format using dplyr, but for some reason I get an error.
The output should look like this (converting the first row of df):
Output = tibble(TimeFrame = c("1y", "2y"), Stage = c(3.1, NA), Clinical = c(TRUE, FALSE))
So each row of df becomes 2 rows in the output.
What I tried doesn't work (and I'm actually not sure exactly how to do this):
Output = gather(df, TimeFrame, Stage, Clinical, Stage1yBefore:ClinicalActivity2yBefore)
I get:
Error in .f(.x[[i]],...): Object 'Clinical' not found.
Any ideas?
Recommended answer
library(dplyr)
library(stringr)
library(tidyr)
library(tibble)  # provides rownames_to_column()
df %>% rownames_to_column() %>%
  gather(TimeFrame, Stage, Stage1yBefore:ClinicalActivity2yBefore) %>%
  # From TimeFrame, extract a digit followed by "y", and either "Stage" or "Clinical"
  mutate(Time = str_extract(TimeFrame, '\\dy'), Key = str_extract(TimeFrame, 'Stage|Clinical')) %>%
  dplyr::select(-TimeFrame) %>%
  spread(Key, Stage)
# A tibble: 10 x 4
   rowname Time  Clinical Stage
   <chr>   <chr> <chr>    <chr>
 1 1       1y    TRUE     3.1
 2 1       2y    FALSE    NA
 3 2       1y    TRUE     1
 4 2       2y    TRUE     2
 5 3       1y    TRUE     4
 6 3       2y    TRUE     3.2
 7 4       1y    TRUE     2
 8 4       2y    TRUE     2
 9 5       1y    FALSE    NA
10 5       2y    FALSE    NA
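Note that in the output above both Clinical and Stage come back as character columns, because gather() stacks the logical and character values into a single column before spread() separates them again. As a possible alternative (not part of the original answer), here is a minimal sketch that assumes tidyr 1.0.0 or later, where pivot_longer() with the ".value" sentinel in names_to keeps each measure in its own column and so preserves its type. The id column and the name long are made up for illustration, and converting Stage to numeric is an assumption based on the desired output in the question.

```r
library(dplyr)
library(tidyr)

# Same sample data as in the question, rebuilt with tibble()
df <- tibble(
  Stage1yBefore = c("3.1", "1", "4", "2", "NA"),
  Stage2yBefore = c("NA", "2", "3.2", "2", "NA"),
  ClinicalActivity1yBefore = c(TRUE, TRUE, TRUE, TRUE, FALSE),
  ClinicalActivity2yBefore = c(FALSE, TRUE, TRUE, TRUE, FALSE)
)

long <- df %>%
  # Hypothetical id column so each original row stays identifiable
  mutate(id = row_number()) %>%
  pivot_longer(
    cols = -id,
    # First regex group becomes the output column name (".value"),
    # second group becomes the TimeFrame value ("1y" or "2y")
    names_to = c(".value", "TimeFrame"),
    names_pattern = "^(Stage|ClinicalActivity)(\\dy)Before$"
  ) %>%
  rename(Clinical = ClinicalActivity) %>%
  # Turn the literal string "NA" into a real NA, then convert to numeric
  mutate(Stage = as.numeric(na_if(Stage, "NA")))

print(long)
```

With this approach Clinical stays logical and Stage becomes numeric, so no conversion back from character is needed afterwards.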