Reshape three column data frame to matrix ("long" to "wide" format)

Problem description

I have a data.frame that looks like this:

x a 1 
x b 2 
x c 3 
y a 3 
y b 3 
y c 2 

I want this in matrix form so that I can feed it to heatmap to make a plot. The result should look something like:

    a    b    c
x   1    2    3
y   3    3    2

I have tried cast from the reshape package, and I have tried writing a function to do this manually, but I cannot seem to get it right.

Recommended answer

There are many ways to do this. This answer starts with what is quickly becoming the standard method, but also covers older methods and various alternatives drawn from answers to similar questions scattered around this site. The examples all use this data frame:

tmp <- data.frame(x=gl(2,3, labels=letters[24:25]),
                  y=gl(3,1,6, labels=letters[1:3]), 
                  z=c(1,2,3,3,3,2))

Using the tidyverse:

The new cool new way to do this is with pivot_wider from tidyr 1.0.0. It returns a data frame, which is probably what most readers of this answer will want. For a heatmap, though, you would need to convert this to a true matrix.

library(tidyr)
pivot_wider(tmp, names_from = y, values_from = z)
## # A tibble: 2 x 4
## x         a     b     c
## <fct> <dbl> <dbl> <dbl>
## 1 x       1     2     3
## 2 y       3     3     2
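
For the heatmap use case from the question, the wide data frame still has to become a numeric matrix with row names. A minimal base R sketch, assuming the identifier sits in the first column; `wide` is a hypothetical stand-in for the result above, rebuilt by hand so the snippet runs without tidyr:

```r
# Hypothetical stand-in for the wide result above (id column first, then values).
wide <- data.frame(x = c("x", "y"), a = c(1, 3), b = c(2, 3), c = c(3, 2))

# Drop the id column, convert the remaining numeric columns to a matrix,
# and carry the ids over as row names.
m <- as.matrix(wide[, -1])
rownames(m) <- as.character(wide$x)

m
# heatmap(m)  # a numeric matrix with dimnames can be passed to heatmap()
```

The same two steps should apply to the tibble returned by pivot_wider, since as.matrix also accepts tibbles.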

The old cool new way to do this is with spread from tidyr, which has since been superseded by pivot_wider. It similarly returns a data frame.

library(tidyr)
spread(tmp, y, z)
##   x a b c
## 1 x 1 2 3
## 2 y 3 3 2

Using reshape2:

One of the first steps toward the tidyverse was the reshape2 package.

To get a matrix, use acast:

library(reshape2)
acast(tmp, x~y, value.var="z")
##   a b c
## x 1 2 3
## y 3 3 2

Or, to get a data frame, use dcast, as shown in the answer to "Reshape data for values in one column".

dcast(tmp, x~y, value.var="z")
##   x a b c
## 1 x 1 2 3
## 2 y 3 3 2

Using plyr:

In between reshape2 and the tidyverse came plyr, with the daply function, as shown here: https://stackoverflow.com/a/7020101/210673

library(plyr)
daply(tmp, .(x, y), function(x) x$z)
##    y
## x   a b c
##   x 1 2 3
##   y 3 3 2

Using matrix indexing:

This is kind of old school, but it is a nice demonstration of matrix indexing, which can be really useful in certain situations.

with(tmp, {
  out <- matrix(nrow=nlevels(x), ncol=nlevels(y),
                dimnames=list(levels(x), levels(y)))
  out[cbind(x, y)] <- z
  out
})
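
The key step above is `out[cbind(x, y)] <- z`. When a matrix is indexed by a two-column numeric matrix, each row of the index is read as one (row, column) pair, and cbind on two factors yields their integer codes, which is why the factor columns can be used directly. A small sketch with the same data; the name `idx` is introduced here only for illustration:

```r
tmp <- data.frame(x = gl(2, 3, labels = letters[24:25]),
                  y = gl(3, 1, 6, labels = letters[1:3]),
                  z = c(1, 2, 3, 3, 3, 2))

# cbind() on factors gives their integer codes: one (row, column) pair per row.
idx <- cbind(tmp$x, tmp$y)

# Start from an all-NA matrix so that missing combinations would stay NA.
out <- matrix(NA_real_, nrow = nlevels(tmp$x), ncol = nlevels(tmp$y),
              dimnames = list(levels(tmp$x), levels(tmp$y)))

# Assign each z value to the cell named by the matching row of idx.
out[idx] <- tmp$z
out
```

Any x/y combination missing from the data is left as NA, which is easy to spot before plotting.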

Using xtabs:

xtabs(z~x+y, data=tmp)
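
Note that xtabs returns a contingency table (classes "xtabs" and "table") in which each cell holds the sum of z for that x/y combination. It therefore matches the other results only when each combination occurs at most once, and combinations absent from the data come out as 0 rather than NA. A sketch of turning the table into a plain matrix, assuming that behavior is acceptable; the names `tab` and `m` are illustrative:

```r
tmp <- data.frame(x = gl(2, 3, labels = letters[24:25]),
                  y = gl(3, 1, 6, labels = letters[1:3]),
                  z = c(1, 2, 3, 3, 3, 2))

tab <- xtabs(z ~ x + y, data = tmp)

# Rebuild as a plain numeric matrix, dropping the table classes and the
# "call" attribute but keeping the level names as row and column names.
m <- matrix(as.vector(tab), nrow = nrow(tab),
            dimnames = unname(dimnames(tab)))
m
```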

Using a sparse matrix:

There's also sparseMatrix in the Matrix package, as seen in "R - convert BIG table into matrix by column names".

library(Matrix)
with(tmp, sparseMatrix(i = as.numeric(x), j=as.numeric(y), x=z,
                       dimnames=list(levels(x), levels(y))))
## 2 x 3 sparse Matrix of class "dgCMatrix"
##   a b c
## x 1 2 3
## y 3 3 2

Using reshape:

You can also use the base R function reshape, as suggested in "Convert table into matrix by column names", though you have to do a little manipulation afterwards to remove the extra column and get the names right.

reshape(tmp, idvar="x", timevar="y", direction="wide")
##   x z.a z.b z.c
## 1 x   1   2   3
## 4 y   3   3   2
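
The cleanup after reshape could look something like the following. This is one possible sketch rather than the original answer's code, and the names `wide` and `m` are introduced here for illustration:

```r
tmp <- data.frame(x = gl(2, 3, labels = letters[24:25]),
                  y = gl(3, 1, 6, labels = letters[1:3]),
                  z = c(1, 2, 3, 3, 3, 2))

wide <- reshape(tmp, idvar = "x", timevar = "y", direction = "wide")

# Drop the id column and convert the value columns to a matrix.
m <- as.matrix(wide[, -1])

# Use the id values as row names and strip the "z." prefix that reshape adds.
rownames(m) <- as.character(wide$x)
colnames(m) <- sub("^z\\.", "", colnames(m))
m
```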
