Reshaping a table in R while parsing information from column names and using it to collect information from specific columns
Problem description
I was given a badly organized data table with hundreds of columns (a subset is shown below).
Column names are dot-delimited. The first field holds the type of object (e.g. Item123, object_AB) and follows no naming convention. Columns that share the same first field describe the same object, and their second field names one of its properties (e.g. color, manufacturer). The columns are in no particular order.
Item123.type.value  Item123.mass.value  Item123.color.value  object_AB.type.value  object_AB.mass.value  object_AB.color.value
Desk                11.2                blue                 Chair                 2.3                   orange
Desk                14.2                red                  Sofa                  22                    grey
Armchair            23.3                black                Monitor               2.2                   white
Edit: adding the dput() structure:
df <- structure(list(Item123.type.value = structure(c(2L, 2L, 1L),
levels = c("Armchair", "Desk"), class = "factor"), Item123.mass.value = structure(1:3,
levels = c("11.2", "14.2", "23.3"), class = "factor"), Item123.color.value = structure(c(2L,
3L, 1L), levels = c("black", "blue", "red"), class = "factor"),
object_AB.type.value = structure(c(1L, 3L, 2L), levels = c("Chair",
"Monitor", "Sofa"), class = "factor"), object_AB.mass.value = structure(c(2L,
3L, 1L), levels = c("2.2", "2.3", "22"), class = "factor"),
object_AB.color.value = structure(c(2L, 1L, 3L), levels = c("grey",
"orange", "white"), class = "factor")), row.names = c(NA_integer_,
-3L), class = "data.frame")
I need to convert the table into something like this (the order of rows does not matter):
type       name      mass  color
Item123    Desk      11.2  blue
Item123    Desk      14.2  red
object_AB  Chair     2.3   orange
object_AB  Sofa      22    grey
Item123    Armchair  23.3  black
object_AB  Monitor   2.2   white
I would really appreciate any help!
Recommended answer
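You can use pivot_longer here, specifying names_pattern to extract the data from the column names. The special '.value' entry in names_to tells pivot_longer to use the second captured group (type, mass, color) as the output column names, while the first captured group goes into the name column.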
tidyr::pivot_longer(df,
cols = everything(),
names_to = c('name', '.value'),
names_pattern = '(\\w+)\\.(\\w+)\\.')
# A tibble: 6 x 4
# name type mass color
# <chr> <fct> <fct> <fct>
#1 Item123 Desk 11.2 blue
#2 object_AB Chair 2.3 orange
#3 Item123 Desk 14.2 red
#4 object_AB Sofa 22 grey
#5 Item123 Armchair 23.3 black
#6 object_AB Monitor 2.2 white
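Note that the result above puts the object identifier in a column called name and the item in type, which is the reverse of the layout requested in the question, and that mass is still a factor. The following is a minimal sketch of one way to close that gap, not a definitive implementation. It assumes a small character-column stand-in for the question's data, and the temporary column name object and the variable names parts and long are hypothetical, chosen only for illustration.

```r
library(tidyr)

# Stand-in for the question's data. Assumption: character columns are used
# here instead of the factors in the dput() output, for brevity.
df <- data.frame(
  Item123.type.value    = c("Desk", "Desk", "Armchair"),
  Item123.mass.value    = c("11.2", "14.2", "23.3"),
  Item123.color.value   = c("blue", "red", "black"),
  object_AB.type.value  = c("Chair", "Sofa", "Monitor"),
  object_AB.mass.value  = c("2.3", "22", "2.2"),
  object_AB.color.value = c("orange", "grey", "white"),
  stringsAsFactors = FALSE
)

# Inspect how the pattern splits each column name before pivoting.
# Each list element holds the full match, then group 1 (the object),
# then group 2 (the property).
pattern <- "(\\w+)\\.(\\w+)\\."
parts <- regmatches(names(df), regexec(pattern, names(df)))

# Pivot with a temporary "object" column (hypothetical name) so that it does
# not collide with the "type" column created through '.value'.
long <- pivot_longer(df,
                     cols = everything(),
                     names_to = c("object", ".value"),
                     names_pattern = pattern)

# Rename to the layout asked for in the question: the item goes to "name",
# the object identifier goes to "type". The order of these two lines matters.
names(long)[names(long) == "type"] <- "name"
names(long)[names(long) == "object"] <- "type"

# Convert mass to numeric; as.character() guards against factor input,
# where as.numeric() alone would return the integer level codes.
long$mass <- as.numeric(as.character(long$mass))
```

The temporary column is needed because naming the first group type directly would clash with the type column that '.value' generates from the second group.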