Reshaping wide to long with multiple value columns

Question

I need to reshape my wide table into long format while keeping multiple fields for each record. For example:

dw <- read.table(header=TRUE, text='
 sbj f1.avg f1.sd f2.avg f2.sd  blabla
   A   10    6     50     10      bA
   B   12    5     70     11      bB
   C   20    7     20     8       bC
   D   22    8     22     9       bD
 ')

# Now I want to melt this table, keeping both AVG and SD as separate fields for each measurement, to get something like this:

 #    sbj var avg  sd  blabla
 #     A   f1  10  6     bA
 #     A   f2  50  10    bA
 #     B   f1  12  5     bB
 #     B   f2  70  11    bB
 #     C   f1  20  7     bC
 #     C   f2  20  8     bC
 #     D   f1  22  8     bD
 #     D   f2  22  9     bD

I have basic knowledge of using melt and reshape, but it is not obvious to me how to apply such reshaping in my case.

Recommended answer

reshape does this with the appropriate arguments.

varying lists the columns that exist in the wide format but are split into multiple rows in the long format. v.names gives their long-format equivalents. Together, the two define a mapping.

From ?reshape:

Also, guessing is not attempted if v.names is given explicitly. Notice that the order of variables in varying is like x.1,y.1,x.2,y.2.

Given these varying and v.names arguments, reshape takes the varying columns in the order f1.avg, f1.sd, f2.avg, f2.sd, that is, with v.names cycling fastest within each time value. The original data has its columns in this order, so varying=2:5 would also work for this example data, but relying on column positions is not safe in general.
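One way to avoid depending on column order is to pass varying as a list, with one vector of wide column names per long column. The following is a minimal sketch of that variant, assuming the dw data frame from the question; the result name long is only illustrative.

```r
# Example data from the question
dw <- read.table(header = TRUE, text = '
 sbj f1.avg f1.sd f2.avg f2.sd  blabla
   A   10    6     50     10      bA
   B   12    5     70     11      bB
   C   20    7     20     8       bC
   D   22    8     22     9       bD
 ')

# varying as a list: the first vector feeds "avg", the second feeds "sd".
# Within each vector, the columns are listed in the same order as times.
long <- reshape(dw, direction = "long",
                varying = list(c("f1.avg", "f2.avg"),
                               c("f1.sd",  "f2.sd")),
                v.names = c("avg", "sd"),
                timevar = "var",
                times   = c("f1", "f2"),
                idvar   = "sbj")

print(long)
```

Written this way, each long column names its source columns explicitly, so the call does not depend on how the columns happen to be ordered in dw.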

Because times and v.names are both given explicitly, reshape does not need to parse the column names: the . separator (the default sep argument) is only used when reshape has to guess v.names and times from the names in varying.

times specifies the values to be used in the created var column (named by timevar), and v.names gives the names of the value columns in the long result.

Finally, idvar is specified to be the sbj column, which identifies individual records in the wide format (thanks @thelatemail).

reshape(dw, direction='long', 
        varying=c('f1.avg', 'f1.sd', 'f2.avg', 'f2.sd'), 
        timevar='var',
        times=c('f1', 'f2'),
        v.names=c('avg', 'sd'),
        idvar='sbj')

##      sbj blabla var avg sd
## A.f1   A     bA  f1  10  6
## B.f1   B     bB  f1  12  5
## C.f1   C     bC  f1  20  7
## D.f1   D     bD  f1  22  8
## A.f2   A     bA  f2  50 10
## B.f2   B     bB  f2  70 11
## C.f2   C     bC  f2  20  8
## D.f2   D     bD  f2  22  9
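The output above is grouped by var and carries compound row names, whereas the layout asked for in the question is grouped by sbj. As a sketch of one possible follow-up step (the variable name long is an assumption, not part of the original answer), the result can be sorted and its columns reordered with base R:

```r
# Example data from the question
dw <- read.table(header = TRUE, text = '
 sbj f1.avg f1.sd f2.avg f2.sd  blabla
   A   10    6     50     10      bA
   B   12    5     70     11      bB
   C   20    7     20     8       bC
   D   22    8     22     9       bD
 ')

# Same reshape call as in the answer
long <- reshape(dw, direction = "long",
                varying = c("f1.avg", "f1.sd", "f2.avg", "f2.sd"),
                timevar = "var",
                times   = c("f1", "f2"),
                v.names = c("avg", "sd"),
                idvar   = "sbj")

# Sort by subject, then measurement, and put the columns in the
# order shown in the question
long <- long[order(long$sbj, long$var),
             c("sbj", "var", "avg", "sd", "blabla")]

# Drop the compound row names such as "A.f1"
rownames(long) <- NULL

print(long)
```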
