How to get this data structure in R?


Question

I am trying to get the wanted data structure from the current data structure. I only partially know the schematics of the expected structure: compared with the current one, it includes one more list(...) and a factor class.

Current data structure

> print(dat.m)

         [,1] [,2]
ave_max  150   61
ave       60    0
lepo      41    0

dat.m <- structure(c(150L, 60L, 41L, 61L, 0L, 0L), .Dim = c(3L, 2L), .Dimnames = list(
    c("ave_max", "ave", "lepo"), NULL))

Wanted data structure

> print(dat.m)

     Vars    M1    M2 
1 ave_max   150    61 
2 ave        60     0 
3 lepo       41     0 

I know it is schematically close to the following, where the contents of structure(c(...) and row.names = c(...) are unknown:

structure(list(Vars = structure(c(...), .Label = c("ave_max", 
"ave", "lepo"), class = "factor"), M1 = c(150, 60, 
41), M2 = c(61, 0, 0)), .Names = c("Vars", "ave_max", "ave", 
"lepo"), class = "data.frame", row.names = c(...))

R: 3.4.0 (backports)
OS: Debian 8.7

Recommended answer

If you don't insist on M1, M2, etc. as column names, there is a short data.table solution:

library(data.table)   # CRAN version 1.10.4 used
as.data.table(dat.m, keep.rownames = "Vars")
#      Vars  V1 V2
#1: ave_max 150 61
#2:     ave  60  0
#3:    lepo  41  0
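
If data.table is not an option, a similar result can be sketched in base R. This is a minimal sketch rather than part of the original answer; the name df is an assumption chosen for illustration, and the value columns hold integers because dat.m is an integer matrix.

```r
# Current structure: a 3 x 2 integer matrix with row names only
dat.m <- structure(c(150L, 60L, 41L, 61L, 0L, 0L), .Dim = c(3L, 2L),
                   .Dimnames = list(c("ave_max", "ave", "lepo"), NULL))

# Build the data frame column by column (df is a hypothetical name).
# Setting levels explicitly keeps the factor in the matrix's row order
# instead of alphabetical order.
df <- data.frame(
  Vars = factor(rownames(dat.m), levels = rownames(dat.m)),
  M1   = unname(dat.m[, 1]),
  M2   = unname(dat.m[, 2])
)

str(df)    # shows a factor column and two integer columns
dput(df)   # prints the full structure(...) call for inspection
```

Calling dput(df) is one way to see how R itself writes out the parts that are elided in the schematic in the question.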






If you do insist on M1, M2, etc. as column names and your matrix dat.m has many columns, the columns can be renamed in one step:

DT <- as.data.table(dat.m, keep.rownames = "Vars")
setnames(DT, stringr::str_replace(names(DT), "^V(?=\\d+$)", "M"))
DT
#      Vars  M1 M2
#1: ave_max 150 61
#2:     ave  60  0
#3:    lepo  41  0

The regular expression uses a look-ahead assertion to ensure that only column names consisting of V followed by one or more digits, and nothing else, are changed. Names like Vars, V, V17b, or VV3 are left untouched.
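
The effect of the look-ahead can be checked in isolation. The sketch below uses base R's sub() with perl = TRUE instead of stringr::str_replace(), which is a substitution made here to avoid the extra dependency; the vector nms is a made-up set of test names.

```r
# Column names to test (hypothetical); only the first two match the pattern
nms <- c("V1", "V22", "Vars", "V", "V17b", "VV3")

# perl = TRUE enables the look-ahead (?=...) in base R's sub().
# The look-ahead checks for trailing digits without consuming them,
# so only the leading "V" is replaced.
renamed <- sub("^V(?=\\d+$)", "M", nms, perl = TRUE)
print(renamed)
# [1] "M1"   "M22"  "Vars" "V"    "V17b" "VV3"
```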

If your matrix has many columns and the purpose of your operation is not just to have nice column headers for printing, you may consider reshaping your data from wide to long form. The long form is preferred by ggplot2, for instance.

DT_long <- melt(as.data.table(dat.m, keep.rownames = "Vars"), id.vars = "Vars")
DT_long
#      Vars variable value
#1: ave_max       V1   150
#2:     ave       V1    60
#3:    lepo       V1    41
#4: ave_max       V2    61
#5:     ave       V2     0
#6:    lepo       V2     0

In long form, it is often easier to manipulate your data, for instance, to rename the former column names, which are now values of the variable column:

DT_long[, variable := stringr::str_replace(variable, "^V", "M")]
DT_long
#      Vars variable value
#1: ave_max       M1   150
#2:     ave       M1    60
#3:    lepo       M1    41
#4: ave_max       M2    61
#5:     ave       M2     0
#6:    lepo       M2     0

Finally, you can reshape from long to wide form again:

dcast(DT_long, Vars ~ ...)
#      Vars  M1 M2
#1:     ave  60  0
#2: ave_max 150 61
#3:    lepo  41  0

Note that the cast formula recognizes two special variables: . and ... Here, . represents no variable, and ... represents all variables not otherwise mentioned in the formula. (See ?data.table::dcast for details.)
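
To illustrate the . placeholder, here is a hedged sketch that collapses all values into a single total per Vars. Aggregating with sum and the name totals are assumptions made for this example; they are not part of the original answer.

```r
library(data.table)

# Same input matrix as in the question
dat.m <- structure(c(150L, 60L, 41L, 61L, 0L, 0L), .Dim = c(3L, 2L),
                   .Dimnames = list(c("ave_max", "ave", "lepo"), NULL))
DT_long <- melt(as.data.table(dat.m, keep.rownames = "Vars"), id.vars = "Vars")

# "." on the right-hand side means no column variable, so all values
# belonging to one Vars are aggregated into a single cell by sum().
# totals is a hypothetical name.
totals <- dcast(DT_long, Vars ~ ., value.var = "value", fun.aggregate = sum)
totals
```

With ... on the right-hand side instead, as in the call above, every remaining variable (here only variable) is spread into columns.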
