Wide format with dcast data.table

Question

I would like to transform a table like this (*):

set.seed(1)
mydata <- data.frame(ID=rep(1:4, each=3), R=rep(1:3, times=4), FIXED=rep(runif(4), each=3), AAA=rnorm(12), BBB=rbinom(12,12,0.5), CCC=runif(12))

ID R    FIXED    AAA  BBB   CCC
 1 1    0.26   -0.83   8   0.82
 1 2    0.26    1.59   5   0.64
 1 3    0.26    0.32   6   0.78
 2 1    0.37   -0.82   6   0.55
 2 2    0.37    0.48   6   0.52
 2 3    0.37    0.73   4   0.78
 3 1    0.57    0.57   8   0.02
 3 2    0.57   -0.30   7   0.47
 3 3    0.57    1.51   7   0.73
 4 1    0.90    0.38   4   0.69
 4 2    0.90   -0.62   7   0.47
 4 3    0.90   -2.21   6   0.86    

into a wide format, something like this:

ID  FIXED   AAA1    BBB1    CCC1    FIXED2  AAA2    BBB2    CCC2    FIXED3  AAA3    BBB3    CCC3
1   0.27    0.49       7    0.73     0.37   0.74       4    0.69      0.57  0.58       7    0.48
2   0.91    -0.31      6    0.86     0.20   1.51       8    0.44      0.90  0.39       7    0.24
3   0.94    -0.62      7    0.07     0.66  -2.21       6    0.10      0.63  1.12       6    0.32
4   0.06    -0.04      7    0.52     0.21  -0.02       3    0.66      0.18  0.94       6    0.41

How can I do it?
I've tried with

dcast(mydata, ID + FIXED ~ R, value.var=names(mydata)[3:5])

or even writing the column names, "AAA", "BBB", "CCC", but it produces an error and I can't get the wide format I need. I've also tried other options, with no luck.

(*) In reality the table has many more columns, but the story is the same.

The error is:

Error in .subset2(x, i, exact = exact) : 
  recursive indexing failed at level 2
In addition: Warning message:
In if (!(value.var %in% names(data))) { :
  the condition has length > 1 and only the first element will be used


Recommended answer

You are referencing the wrong value variables (the AAA, BBB and CCC columns have index numbers 4 to 6), and you should use setDT() to convert the data frame to a data.table. Using:

dcast(setDT(mydata), ID + FIXED ~ R, value.var = names(mydata)[4:6])

gives:

   ID     FIXED      AAA_1      AAA_2      AAA_3 BBB_1 BBB_2 BBB_3     CCC_1     CCC_2     CCC_3
1:  1 0.2655087 -0.8356286  1.5952808  0.3295078     8     5     6 0.8209463 0.6470602 0.7829328
2:  2 0.3721239 -0.8204684  0.4874291  0.7383247     6     6     4 0.5530363 0.5297196 0.7893562
3:  3 0.5728534  0.5757814 -0.3053884  1.5117812     8     7     7 0.0233312 0.4772301 0.7323137
4:  4 0.9082078  0.3898432 -0.6212406 -2.2146999     4     7     6 0.6927316 0.4776196 0.8612095
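
This result groups the columns by variable (AAA_1, AAA_2, ...), whereas the layout sketched in the question groups them by R. If that ordering matters, one possible approach is to reorder the columns afterwards with setcolorder(). The following is only a minimal sketch; the names vars, wide and new_order are introduced here for illustration and are not part of the original answer.

```r
library(data.table)

# Same example data as in the question
set.seed(1)
mydata <- data.frame(ID = rep(1:4, each = 3), R = rep(1:3, times = 4),
                     FIXED = rep(runif(4), each = 3), AAA = rnorm(12),
                     BBB = rbinom(12, 12, 0.5), CCC = runif(12))

vars <- c("AAA", "BBB", "CCC")   # value columns to spread (illustrative name)
wide <- dcast(setDT(mydata), ID + FIXED ~ R, value.var = vars)

# outer() builds a vars-by-R matrix of column names; as.vector() reads it
# column by column, which yields AAA_1, BBB_1, CCC_1, AAA_2, ...
new_order <- c("ID", "FIXED", as.vector(outer(vars, 1:3, paste, sep = "_")))
setcolorder(wide, new_order)     # reorders by reference, without copying
names(wide)
```

With many more value columns, as mentioned in the question, only vars would need to change.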

If you don't convert to a data.table, the data.table package will fall back to the dcast implementation from reshape2, which is not able to handle multiple value.var columns, hence the error message.
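
Note that setDT() converts mydata in place. If the original data frame should stay untouched, as.data.table() can be used instead, since it returns a converted copy. This is a hedged sketch of that variant, not part of the original answer; mydt is a name chosen here for illustration.

```r
library(data.table)

# Same example data as in the question
set.seed(1)
mydata <- data.frame(ID = rep(1:4, each = 3), R = rep(1:3, times = 4),
                     FIXED = rep(runif(4), each = 3), AAA = rnorm(12),
                     BBB = rbinom(12, 12, 0.5), CCC = runif(12))

mydt <- as.data.table(mydata)    # a copy; mydata itself is not modified
wide <- dcast(mydt, ID + FIXED ~ R, value.var = c("AAA", "BBB", "CCC"))

is.data.table(mydata)            # mydata is still a plain data.frame
is.data.table(wide)              # the reshaped result is a data.table
```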

If you want a different separator, you can pass, for example, sep = '.' to dcast.
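
As a short sketch of the separator option, the call from the answer can be repeated with sep = '.'; the variable name wide_dot is only illustrative.

```r
library(data.table)

# Same example data as in the question
set.seed(1)
mydata <- data.frame(ID = rep(1:4, each = 3), R = rep(1:3, times = 4),
                     FIXED = rep(runif(4), each = 3), AAA = rnorm(12),
                     BBB = rbinom(12, 12, 0.5), CCC = runif(12))

# sep controls the string placed between the value column name and the R value
wide_dot <- dcast(setDT(mydata), ID + FIXED ~ R,
                  value.var = c("AAA", "BBB", "CCC"), sep = '.')
names(wide_dot)                  # AAA.1, AAA.2, ... instead of AAA_1, AAA_2, ...
```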
