Reshape three column data frame to matrix ("long" to "wide" format)

Problem description

I have a data.frame that looks like this.

x a 1 
x b 2 
x c 3 
y a 3 
y b 3 
y c 2 

I want this in matrix form so I can feed it to heatmap to make a plot. The result should look something like:

    a    b    c
x   1    2    3
y   3    3    2

I have tried cast from the reshape package and I have tried writing a manual function to do this but I do not seem to be able to get it right.

Solution

There are many ways to do this. This answer starts with my favorite ways, but also collects various ways from answers to similar questions scattered around this site.

tmp <- data.frame(x=gl(2,3, labels=letters[24:25]),
                  y=gl(3,1,6, labels=letters[1:3]), 
                  z=c(1,2,3,3,3,2))

Using reshape2:

library(reshape2)
acast(tmp, x~y, value.var="z")

Using matrix indexing:

with(tmp, {
  out <- matrix(nrow=nlevels(x), ncol=nlevels(y),
                dimnames=list(levels(x), levels(y)))
  out[cbind(x, y)] <- z
  out
})
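The matrix-indexing approach works because cbind() on two factors gives a two-column integer matrix of level codes, and each row of that matrix is read as a (row, column) position. A minimal sketch that makes this explicit and passes the result to heatmap, as the question intended; the names idx and out are only illustrative, and the heatmap arguments are one reasonable choice rather than the only one:

```r
tmp <- data.frame(x = gl(2, 3, labels = letters[24:25]),
                  y = gl(3, 1, 6, labels = letters[1:3]),
                  z = c(1, 2, 3, 3, 3, 2))

# cbind() on two factors yields a two-column integer matrix of level codes;
# each row is a (row, column) position in the output matrix.
idx <- with(tmp, cbind(x, y))

# Start from an all-NA matrix so any (x, y) pair missing from tmp stays NA.
out <- matrix(NA_real_, nrow = nlevels(tmp$x), ncol = nlevels(tmp$y),
              dimnames = list(levels(tmp$x), levels(tmp$y)))
out[idx] <- tmp$z

# Rowv = NA and Colv = NA turn off clustering, so rows and columns keep
# their original order; scale = "none" plots the raw values.
heatmap(out, Rowv = NA, Colv = NA, scale = "none")
```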

Using xtabs:

xtabs(z~x+y, data=tmp)
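Some caveats with xtabs: the result has class "xtabs"/"table" rather than being a bare matrix, z is summed when an (x, y) pair occurs more than once, and pairs that never occur are filled with 0 rather than NA. If a plain matrix is wanted, one possible way to strip the table class is sketched below; the names xt and m are only illustrative:

```r
tmp <- data.frame(x = gl(2, 3, labels = letters[24:25]),
                  y = gl(3, 1, 6, labels = letters[1:3]),
                  z = c(1, 2, 3, 3, 3, 2))

xt <- xtabs(z ~ x + y, data = tmp)

# Rebuild a bare numeric matrix: keep the level labels, but drop the
# "x"/"y" dimension headers and the table class.
m <- matrix(as.vector(xt), nrow = nrow(xt), dimnames = unname(dimnames(xt)))
m
```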

You can also use reshape, as suggested in "Convert table into matrix by column names", though a little manipulation is needed afterwards to remove the extra column and get the names right (not shown).

> reshape(tmp, idvar="x", timevar="y", direction="wide")
  x z.a z.b z.c
1 x   1   2   3
4 y   3   3   2
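The cleanup that the answer leaves out might look roughly like this. It is a sketch, not the original author's code; it assumes the value column is named z, so reshape prefixes the new columns with "z.":

```r
tmp <- data.frame(x = gl(2, 3, labels = letters[24:25]),
                  y = gl(3, 1, 6, labels = letters[1:3]),
                  z = c(1, 2, 3, 3, 3, 2))

wide <- reshape(tmp, idvar = "x", timevar = "y", direction = "wide")

# Drop the id column, then reuse its values as the row names.
m <- as.matrix(wide[, -1])
rownames(m) <- as.character(wide$x)

# Strip the "z." prefix that reshape() adds to the value columns.
colnames(m) <- sub("^z\\.", "", colnames(m))
m
```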

There's also sparseMatrix within the Matrix package, as seen here: R - convert BIG table into matrix by column names

> library(Matrix)
> with(tmp, sparseMatrix(i = as.numeric(x), j=as.numeric(y), x=z,
+                        dimnames=list(levels(x), levels(y))))
2 x 3 sparse Matrix of class "dgCMatrix"
  a b c
x 1 2 3
y 3 3 2

The daply function from the plyr library could also be used, as here: http://stackoverflow.com/a/7020101/210673

> library(plyr)
> daply(tmp, .(x, y), function(x) x$z)
   y
x   a b c
  x 1 2 3
  y 3 3 2

dcast from reshape2 also works, as in "Reshape data for values in one column", but it returns a data.frame with a column for the x values rather than a matrix.

> dcast(tmp, x~y, value.var="z")
  x a b c
1 x 1 2 3
2 y 3 3 2

Similarly, spread from "tidyr" would also work for such a transformation:

library(tidyr)
spread(tmp, y, z)
#   x a b c
# 1 x 1 2 3
# 2 y 3 3 2
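Note that spread has been superseded in tidyr 1.0.0 and later by pivot_wider. A sketch of the equivalent call, assuming a tidyr version that provides pivot_wider, followed by one possible conversion to a matrix (the names wide and m are only illustrative):

```r
library(tidyr)

tmp <- data.frame(x = gl(2, 3, labels = letters[24:25]),
                  y = gl(3, 1, 6, labels = letters[1:3]),
                  z = c(1, 2, 3, 3, 3, 2))

# One output column per level of y, filled with the matching z values.
wide <- pivot_wider(tmp, names_from = y, values_from = z)

# Like spread(), this returns a data frame with an x column, so drop that
# column and reuse its values as row names to get a matrix.
m <- as.matrix(wide[, -1])
rownames(m) <- as.character(wide$x)
m
```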
