Reshape multiple values at once
Question
I have a long data set that I would like to make wide, and I'm curious whether there is a way to do this in one step using the reshape2 or tidyr packages in R.
The data frame df looks like this:
id  type     transactions  amount
20  income   20            100
20  expense  25            95
30  income   50            300
30  expense  45            250
I would like to get to the following:
id  income_transactions  expense_transactions  income_amount  expense_amount
20  20                   25                    100            95
30  50                   45                    300            250
I know I can get part of the way there with reshape2 via, for example:
dcast(df, id ~ type, value.var="transactions")
But is there a way to reshape the entire df in one shot, handling both the "transactions" and "amount" variables at once? Ideally the result would also have new, more appropriate column names.
Recommended answer
In "reshape2", you can use recast (though in my experience, this isn't a widely known function).
library(reshape2)
recast(mydf, id ~ variable + type, id.var = c("id", "type"))
# id transactions_expense transactions_income amount_expense amount_income
# 1 20 25 20 95 100
# 2 30 45 50 250 300
You can also use base R's reshape:
reshape(mydf, direction = "wide", idvar = "id", timevar = "type")
# id transactions.income amount.income transactions.expense amount.expense
# 1 20 20 100 25 95
# 3 30 50 300 45 250
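If underscore-separated names are preferred, base R's reshape also accepts a sep argument. The following is a minimal sketch, assuming the data are rebuilt by hand as mydf; the construction of mydf and the final renaming step with sub are illustrative additions, not part of the original answer.

```r
# Rebuild the example data (assumed to match the question's df).
mydf <- data.frame(
  id = c(20, 20, 30, 30),
  type = c("income", "expense", "income", "expense"),
  transactions = c(20, 25, 50, 45),
  amount = c(100, 95, 300, 250)
)

# sep = "_" joins the variable name and the level of `type` with an underscore.
wide <- reshape(mydf, direction = "wide", idvar = "id",
                timevar = "type", sep = "_")

# Optional: swap the two parts so names read type_variable,
# e.g. transactions_income becomes income_transactions.
# "id" contains no underscore and is left unchanged.
names(wide) <- sub("^(.*)_(.*)$", "\\2_\\1", names(wide))
```

The renaming step only matters if the type should come first in the column names, as in the layout requested in the question.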
Or, you can melt and dcast, like this (here with "data.table"):
library(data.table)
library(reshape2)
dcast.data.table(melt(as.data.table(mydf), id.vars = c("id", "type")),
id ~ variable + type, value.var = "value")
# id transactions_expense transactions_income amount_expense amount_income
# 1: 20 25 20 95 100
# 2: 30 45 50 250 300
In later versions of "data.table" (1.9.8), dcast.data.table will be able to do this directly. If I understand correctly, what @Arun is trying to implement is reshaping without first having to melt the data. That melting step is what currently happens with recast, which is essentially a wrapper for a melt + dcast sequence of operations.
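For reference, the direct form would look roughly like the sketch below. It assumes an installed "data.table" release in which dcast accepts several value.var columns; the object names mydt and wide are illustrative.

```r
library(data.table)

# Rebuild the example data as a data.table (assumed to match the question's df).
mydt <- data.table(
  id = c(20, 20, 30, 30),
  type = c("income", "expense", "income", "expense"),
  transactions = c(20, 25, 50, 45),
  amount = c(100, 95, 300, 250)
)

# Passing both value columns at once removes the need for an explicit melt step.
wide <- dcast(mydt, id ~ type, value.var = c("transactions", "amount"))
```

The resulting columns should be named variable_type (for example transactions_income), the same naming as in the recast output above.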
And, for thoroughness, here's the tidyr approach:
library(dplyr)
library(tidyr)
mydf %>%
gather(var, val, transactions:amount) %>%
unite(var2, type, var) %>%
spread(var2, val)
# id expense_amount expense_transactions income_amount income_transactions
# 1 20 95 25 100 20
# 2 30 250 45 300 50
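In tidyr 1.0.0 and later, gather and spread are superseded by pivot_longer and pivot_wider, and pivot_wider can spread several value columns in a single call. A minimal sketch, assuming such a tidyr version is installed; the names_glue pattern is an illustrative choice that puts the type first in each column name.

```r
library(tidyr)

# Rebuild the example data (assumed to match the question's df).
mydf <- data.frame(
  id = c(20, 20, 30, 30),
  type = c("income", "expense", "income", "expense"),
  transactions = c(20, 25, 50, 45),
  amount = c(100, 95, 300, 250)
)

# Spread both value columns at once; names_glue builds names such as
# income_transactions from the `type` level and the value column name.
wide <- pivot_wider(
  mydf,
  names_from = type,
  values_from = c(transactions, amount),
  names_glue = "{type}_{.value}"
)
```

This yields one row per id with the type_variable column names requested in the question, without an intermediate long format.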