Reshape multiple values at once in R
Question
I have a long data set I would like to make wide and I'm curious if there is a way to do this all in one step using the reshape2 or tidyr packages in R.
The data frame df looks like this:

    id    type transactions amount
    20  income           20    100
    20 expense           25     95
    30  income           50    300
    30 expense           45    250
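For anyone who wants to reproduce this, the data can be rebuilt roughly as follows. This is only a sketch based on the printed values: the column types (numeric and character) are an assumption, and the copy named `mydf` is there because the answer refers to the data under that name.

```r
# Rebuild the example data from the printed table (column types are assumed).
df <- data.frame(
  id           = c(20, 20, 30, 30),
  type         = c("income", "expense", "income", "expense"),
  transactions = c(20, 25, 50, 45),
  amount       = c(100, 95, 300, 250),
  stringsAsFactors = FALSE
)

# The answer uses the name `mydf` for the same data.
mydf <- df

str(df)
```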
I'd like to get to this:
    id income_transactions expense_transactions income_amount expense_amount
    20                  20                   25           100             95
    30                  50                   45           300            250
I know I can get part of the way there with reshape2 via for example:
dcast(df, id ~ type, value.var="transactions")
But is there a way to reshape the entire df in one shot addressing both the "transactions" and "amount" variables at once? And ideally with new more appropriate column names?
Solution

In "reshape2", you can use recast (though in my experience, this isn't a widely known function).

    library(reshape2)
    recast(mydf, id ~ variable + type, id.var = c("id", "type"))
    #   id transactions_expense transactions_income amount_expense amount_income
    # 1 20                   25                  20             95           100
    # 2 30                   45                  50            250           300
You can also use base R's reshape:

    reshape(mydf, direction = "wide", idvar = "id", timevar = "type")
    #   id transactions.income amount.income transactions.expense amount.expense
    # 1 20                  20           100                   25             95
    # 3 30                  50           300                   45            250
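The base R result uses names such as transactions.income, whereas the question asks for income_transactions. One possible way to get there, sketched here with base R only, is to swap the two halves of each name with a regular expression and then reorder the columns. The data frame is rebuilt from the question's printed values, and the object name `wide` is just an illustrative choice.

```r
# Example data rebuilt from the question (column types are assumed).
mydf <- data.frame(
  id           = c(20, 20, 30, 30),
  type         = c("income", "expense", "income", "expense"),
  transactions = c(20, 25, 50, 45),
  amount       = c(100, 95, 300, 250),
  stringsAsFactors = FALSE
)

wide <- reshape(mydf, direction = "wide", idvar = "id", timevar = "type")

# Turn "transactions.income" into "income_transactions".
# "id" contains no dot, so the pattern does not match and it is left alone.
names(wide) <- sub("^(.+)\\.(.+)$", "\\2_\\1", names(wide))

# Put the columns in the order requested in the question and reset row names.
wide <- wide[, c("id", "income_transactions", "expense_transactions",
                 "income_amount", "expense_amount")]
rownames(wide) <- NULL

wide
```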
Or, you can melt and dcast, like this (here with "data.table"):

    library(data.table)
    library(reshape2)
    dcast.data.table(melt(as.data.table(mydf), id.vars = c("id", "type")),
                     id ~ variable + type, value.var = "value")
    #    id transactions_expense transactions_income amount_expense amount_income
    # 1: 20                   25                  20             95           100
    # 2: 30                   45                  50            250           300
In later versions of dcast.data.table from "data.table" (1.9.8) you will be able to do this directly. If I understand correctly, what @Arun is trying to implement is reshaping without first having to melt the data, which is what presently happens with recast, which is essentially a wrapper for a melt + dcast sequence of operations.
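As a sketch of what "directly" might look like, the call below assumes an installed data.table version whose dcast method accepts several value.var columns at once, so no melt step is needed. The table is rebuilt from the question's printed values, and the names `mydt` and `wide` are illustrative only.

```r
library(data.table)

# Example data rebuilt from the question (column types are assumed).
mydt <- data.table(
  id           = c(20, 20, 30, 30),
  type         = c("income", "expense", "income", "expense"),
  transactions = c(20, 25, 50, 45),
  amount       = c(100, 95, 300, 250)
)

# Cast both value columns in one call; the new column names combine
# the value column and the level of `type`, e.g. transactions_income.
wide <- dcast(mydt, id ~ type, value.var = c("transactions", "amount"))

wide
```

The resulting names follow the variable_type pattern, the same as the recast output above, rather than the type_variable pattern asked for in the question.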
And, for thoroughness, here's the tidyr approach:

    library(dplyr)
    library(tidyr)
    mydf %>%
      gather(var, val, transactions:amount) %>%
      unite(var2, type, var) %>%
      spread(var2, val)
    #   id expense_amount expense_transactions income_amount income_transactions
    # 1 20             95                   25           100                  20
    # 2 30            250                   45           300                  50
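In tidyr 1.0.0 and later, gather and spread are superseded by pivot_longer and pivot_wider, and the same reshaping can probably be written as a single pivot_wider call. The sketch below assumes such a tidyr version is installed. The names_glue template is one way to produce the type_variable names from the question, the data frame is rebuilt from the printed values, and `wide` is an illustrative name.

```r
library(tidyr)

# Example data rebuilt from the question (column types are assumed).
mydf <- data.frame(
  id           = c(20, 20, 30, 30),
  type         = c("income", "expense", "income", "expense"),
  transactions = c(20, 25, 50, 45),
  amount       = c(100, 95, 300, 250),
  stringsAsFactors = FALSE
)

# Spread both value columns at once; names_glue builds names such as
# income_transactions from the `type` level and the value column name.
wide <- pivot_wider(
  mydf,
  names_from  = type,
  values_from = c(transactions, amount),
  names_glue  = "{type}_{.value}"
)

wide
```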