Reshape long structured data.table into a wide structure using data.table functionality?


Question

> library(data.table)
> A <- data.table(x = c(1,1,2,2), y = c(1,2,1,2), v = c(0.1,0.2,0.3,0.4))
> A
   x y   v
1: 1 1 0.1
2: 1 2 0.2
3: 2 1 0.3
4: 2 2 0.4
> B <- dcast(A, x~y)
Using v as value column: use value.var to override.
> B
  x   1   2
1 1 0.1 0.2
2 2 0.3 0.4

Apparently I can reshape a data.table from long to wide using, for example, dcast from the reshape2 package. But data.table comes with an overloaded bracket operator offering parameters like 'by' and 'group', which makes me wonder: is it possible to achieve this using functionality specific to data.table?

Just one random example from the manual:

DT[,lapply(.SD,sum),by=x]

That looks awesome, but I don't fully understand the usage yet.
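To make the manual's expression concrete, here is a minimal sketch on a small made-up table. The table DT and its columns v1 and v2 are assumptions for illustration only; they do not come from the question.

```r
library(data.table)

# Hypothetical table for illustration; DT is not defined in the question.
DT <- data.table(x = c("a", "a", "b"), v1 = 1:3, v2 = c(10, 20, 30))

# .SD is the "Subset of Data" for the current group: every column except
# the ones named in `by`. lapply() applies sum() to each of those columns,
# and the resulting list becomes one output row per value of x.
res <- DT[, lapply(.SD, sum), by = x]
print(res)
```

The key point is that j returns a list, and each list element becomes a column of the result.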

I found neither a way nor an example for this, so maybe it is just not possible, or maybe it isn't even supposed to be. A definite "no, it is not possible because ..." is then of course also a valid answer.

Recommended answer

I'll pick an example with unequal groups so that it's easier to illustrate the general case:

A <- data.table(x=c(1,1,1,2,2), y=c(1,2,3,1,2), v=(1:5)/5)
> A
   x y   v
1: 1 1 0.2
2: 1 2 0.4
3: 1 3 0.6
4: 2 1 0.8
5: 2 2 1.0

The first step is to give every group of "x" the same number of entries. Here, x=1 has 3 values of y, but x=2 has only 2. So we first have to fix that by adding a row with v = NA for x=2, y=3.

setkey(A, x, y)
A[CJ(unique(x), unique(y))]
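As a self-contained sketch of what this join produces, the snippet below rebuilds A and stores the result. The variable name full is an assumption added here for illustration.

```r
library(data.table)

A <- data.table(x = c(1, 1, 1, 2, 2), y = c(1, 2, 3, 1, 2), v = (1:5) / 5)
setkey(A, x, y)

# CJ() builds every (x, y) combination. Joining A on it keeps all six
# combinations and leaves v as NA where A has no matching row (x=2, y=3).
full <- A[CJ(unique(x), unique(y))]
print(full)
```

The result has six rows, three per value of x, which is what the next step relies on.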

Now, to get it into wide format, we group by "x" and use as.list on v as follows:

out <- A[CJ(unique(x), unique(y))][, as.list(v), by=x]
   x  V1  V2  V3
1: 1 0.2 0.4 0.6
2: 2 0.8 1.0  NA
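The as.list step can be seen in isolation in the following sketch. The table grp is a made-up example with two entries per group, not data from the answer.

```r
library(data.table)

# Hypothetical table with the same number of entries in each group.
grp <- data.table(x = c(1, 1, 2, 2), v = c(0.2, 0.4, 0.8, 1.0))

# Within each group, v is a plain vector. as.list() turns it into a list,
# and data.table treats each list element as a separate column, which it
# names V1, V2, ... because the elements are unnamed.
res <- grp[, as.list(v), by = x]
print(res)
```

Because each group has to contribute the same number of columns, the groups need equal length first, which is why the cross join with CJ comes before this step.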

Now you can set the names of the reshaped columns by reference with setnames as follows:

setnames(out, c("x", as.character(unique(A$y))))

   x   1   2   3
1: 1 0.2 0.4 0.6
2: 2 0.8 1.0  NA
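For comparison, here is a sketch of the same reshape using dcast directly. It assumes a data.table version that ships its own dcast method for data.table objects, which the original answer does not mention. The variable name wide is also an assumption.

```r
library(data.table)

A <- data.table(x = c(1, 1, 1, 2, 2), y = c(1, 2, 3, 1, 2), v = (1:5) / 5)

# With a data.table input, dcast() dispatches to data.table's own method.
# Naming value.var explicitly avoids the "Using v as value column" message
# shown in the question.
wide <- dcast(A, x ~ y, value.var = "v")
print(wide)
```

The missing combination x=2, y=3 is filled with NA here as well, so the explicit CJ step is not needed in this variant.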
