Reshaping a table in R while parsing information from column names and using it to collect information from specific columns


Question

I was given a badly organized data table with hundreds of columns (a subset is shown below).

Column names are dot-delimited. The first field identifies a type of object (e.g. Item123, object_AB) and follows no naming convention. Columns for the same object share that first field, and the next field names one of the object's properties (e.g. color, manufacturer). The columns are in no particular order.

Item123.type.value  Item123.mass.value  Item123.color.value  object_AB.type.value  object_AB.mass.value  object_AB.color.value
Desk                11.2                blue                 Chair                 2.3                   orange
Desk                14.2                red                  Sofa                  22                    grey
Armchair            23.3                black                Monitor               2.2                   white

Edit: adding the dput() structure:

structure(list(Item123.type.value = structure(c(2L, 2L, 1L),
levels = c("Armchair", "Desk"), class = "factor"), Item123.mass.value = structure(1:3,
levels = c("11.2", "14.2", "23.3"), class = "factor"), Item123.color.value = structure(c(2L,
3L, 1L), levels = c("black", "blue", "red"), class = "factor"),
object_AB.type.value = structure(c(1L, 3L, 2L), levels = c("Chair",
"Monitor", "Sofa"), class = "factor"), object_AB.mass.value = structure(c(2L,
3L, 1L), levels = c("2.2", "2.3", "22"), class = "factor"),
object_AB.color.value = structure(c(2L, 1L, 3L), levels = c("grey",
"orange", "white"), class = "factor")), row.names = c(NA_integer_,
-3L), class = "data.frame")

I need to convert the table into something like this (the order of rows does not matter):

type       name      mass  color
Item123    Desk      11.2  blue
Item123    Desk      14.2  red
object_AB  Chair     2.3   orange
object_AB  Sofa      22    grey
Item123    Armchair  23.3  black
object_AB  Monitor   2.2   white

I would really appreciate any help!

Recommended answer

You can use pivot_longer here, specifying names_pattern to extract the data encoded in the column names.
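Before reshaping, it can help to check what the pattern captures from each column name. The following is a minimal base R sketch, not part of the original answer; the vector `cols` and the labels `object` and `property` are made up for illustration, and only a few of the question's column names are used.

```r
# A few of the column names from the question.
cols <- c("Item123.type.value", "Item123.mass.value", "object_AB.color.value")

# regexec() returns the full match followed by each capture group.
m <- regmatches(cols, regexec("(\\w+)\\.(\\w+)\\.", cols))

# Drop the full match and keep the two capture groups as matrix columns.
parts <- do.call(rbind, lapply(m, function(x) x[-1]))
colnames(parts) <- c("object", "property")
parts
```

The first group is the object prefix and the second is the property; `\\w` also matches underscores, which is why a prefix such as object_AB is captured whole.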

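# The first capture group (the object prefix) is stored in `name`; the second
# (type, mass, color) supplies the output column names through '.value'.
tidyr::pivot_longer(df,
                    cols = everything(),
                    names_to = c('name', '.value'),
                    names_pattern = '(\\w+)\\.(\\w+)\\.')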

# A tibble: 6 x 4
#  name      type     mass  color 
#  <chr>     <fct>    <fct> <fct> 
#1 Item123   Desk     11.2  blue  
#2 object_AB Chair    2.3   orange
#3 Item123   Desk     14.2  red   
#4 object_AB Sofa     22    grey  
#5 Item123   Armchair 23.3  black 
#6 object_AB Monitor  2.2   white 
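The output above labels the object prefix `name` and the item `type`, whereas the question asks for the headers the other way round, and mass is still a factor. One possible follow-up is sketched below, assuming tidyr is installed; the intermediate column name `object` and the variable `long` are illustrative choices, and `df` is rebuilt here as a small stand-in for the question's data so the snippet runs on its own.

```r
library(tidyr)

# Stand-in for the question's data: same columns, stored as factors.
df <- data.frame(
  Item123.type.value    = c("Desk", "Desk", "Armchair"),
  Item123.mass.value    = c("11.2", "14.2", "23.3"),
  Item123.color.value   = c("blue", "red", "black"),
  object_AB.type.value  = c("Chair", "Sofa", "Monitor"),
  object_AB.mass.value  = c("2.3", "22", "2.2"),
  object_AB.color.value = c("orange", "grey", "white"),
  stringsAsFactors = TRUE
)

# Use a temporary name for the prefix column so it does not clash
# with the `type` column created by '.value'.
long <- pivot_longer(df,
                     cols = everything(),
                     names_to = c("object", ".value"),
                     names_pattern = "(\\w+)\\.(\\w+)\\.")

# Rename to the headers requested in the question: the former `type`
# property becomes `name`, then the object prefix becomes `type`.
names(long)[names(long) == "type"] <- "name"
names(long)[names(long) == "object"] <- "type"

# Convert via character first; as.numeric() applied directly to a factor
# would return the internal level codes rather than the values.
long$mass <- as.numeric(as.character(long$mass))
```

The renaming is done in two sequential steps so that the existing `type` column is moved out of the way before the prefix column takes its name.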
