Using R tidyr pivot_wider to get wide-form data from multiple column names and values
Question
How can I use tidyr pivot_wider to convert this data frame from long form to wide form? I tried applying the examples on the docs page, but I must be missing something.
Data frame
id <- c(1, 1, 2, 2, 3, 3)
filename <- c('file1a.txt', 'file1b.txt',
              'file2a.txt', 'file2b.txt',
              'file3a.txt', 'file3b.txt')
val <- c(832, 834, 221, 878, 2, 19)
df1 <- data.frame(id, filename, val)
View(df1)
id | filename | val |
---|---|---|
1 | file1a.txt | 832 |
1 | file1b.txt | 834 |
2 | file2a.txt | 221 |
2 | file2b.txt | 878 |
3 | file3a.txt | 2 |
3 | file3b.txt | 19 |
Desired output
id | filename1 | filename2 | val1 | val2 |
---|---|---|---|---|
1 | file1a.txt | file1b.txt | 832 | 834 |
2 | file2a.txt | file2b.txt | 221 | 878 |
3 | file3a.txt | file3b.txt | 2 | 19 |
Failed attempts
df_wide <- pivot_wider(data = df1,
                       id_cols = id,
                       values_from = c("filename", "val"))
View(df_wide)
id | filename_ | val_ |
---|---|---|
1 | 1:2 | c(832,834) |
2 | 3:4 | c(221,878) |
3 | 5:6 | c(2,19) |
df_wide <- pivot_wider(data = df1,
                       id_cols = id,
                       names_from = c("filename", "val"),
                       values_from = c("filename", "val"))
View(df_wide)
id | filename_file1a.txt_832 | filename_file1b.txt_834 | filename_file2a.txt_221 | ... etc. |
---|---|---|---|---|
1 | file1a.txt | file1b.txt | NA | ... etc. |
2 | NA | NA | file2a.txt | ... etc. |
3 | NA | NA | NA | ... etc. |
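Recommended answer
We need a row sequence column, numbering the rows within each 'id', to supply the new column names.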
library(dplyr)
library(tidyr)
library(data.table)  # provides rowid()
df1 %>%
  # cn numbers the rows within each id: 1, 2, ...
  mutate(cn = rowid(id)) %>%
  pivot_wider(names_from = cn, values_from = c(filename, val), names_sep = "")
-output
# A tibble: 3 x 5
# id filename1 filename2 val1 val2
# <dbl> <chr> <chr> <dbl> <dbl>
#1 1 file1a.txt file1b.txt 832 834
#2 2 file2a.txt file2b.txt 221 878
#3 3 file3a.txt file3b.txt 2 19
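To make the role of the sequence column concrete, here is a minimal sketch that builds only the intermediate 'cn' column on the question's data, before any pivoting. The name `with_cn` is introduced here for illustration and is not part of the original answer.

```r
library(dplyr)
library(data.table)  # provides rowid()

# Same data as in the question
df1 <- data.frame(
  id = c(1, 1, 2, 2, 3, 3),
  filename = c("file1a.txt", "file1b.txt",
               "file2a.txt", "file2b.txt",
               "file3a.txt", "file3b.txt"),
  val = c(832, 834, 221, 878, 2, 19)
)

# with_cn is a hypothetical name; cn counts the rows within each id
with_cn <- df1 %>% mutate(cn = rowid(id))
with_cn$cn
```

Each 'id' now has rows labelled 1 and 2, and pivot_wider appends those labels to 'filename' and 'val' to form the wide column names.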
Or with row_number()
df1 %>%
  group_by(id) %>%
  mutate(cn = row_number()) %>%
  ungroup() %>%  # drop the grouping before reshaping
  pivot_wider(names_from = cn, values_from = c(filename, val), names_sep = "")
If we don't want to use %>%, pass the mutated original dataset as the data argument, that is, df1 with an added column 'cn' holding the row sequence within each 'id'.
pivot_wider(mutate(df1, cn = rowid(id)),
names_from = cn, values_from = c(filename, val), names_sep="")
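If loading data.table only for rowid() is undesirable, the same sequence can probably be built with base R's ave(). This is a sketch of an alternative, not part of the original answer; the names `df1_cn` and `wide` are introduced here for illustration.

```r
library(tidyr)

# Same data as in the question
df1 <- data.frame(
  id = c(1, 1, 2, 2, 3, 3),
  filename = c("file1a.txt", "file1b.txt",
               "file2a.txt", "file2b.txt",
               "file3a.txt", "file3b.txt"),
  val = c(832, 834, 221, 878, 2, 19)
)

# Assumption: ave() with seq_along stands in for data.table::rowid(),
# numbering the rows within each id
df1_cn <- transform(df1, cn = ave(seq_along(id), id, FUN = seq_along))

wide <- pivot_wider(df1_cn,
                    names_from = cn,
                    values_from = c(filename, val),
                    names_sep = "")
names(wide)
```

This keeps the dependencies to tidyr alone; the pivot_wider call is unchanged from the answer.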