Reshape multiple values at once in R


Problem description

I have a long data set I would like to make wide and I'm curious if there is a way to do this all in one step using the reshape2 or tidyr packages in R.

The data frame df looks like this:

id  type    transactions    amount
20  income       20          100
20  expense      25          95
30  income       50          300
30  expense      45          250

I'd like to get to this:

id  income_transactions expense_transactions    income_amount   expense_amount
20       20                           25                 100             95
30       50                           45                 300             250

I know I can get part of the way there with reshape2 via for example:

dcast(df, id ~  type, value.var="transactions")

But is there a way to reshape the entire df in one shot addressing both the "transactions" and "amount" variables at once? And ideally with new more appropriate column names?

Solution

In "reshape2", you can use recast (though in my experience, this isn't a widely known function).

library(reshape2)
recast(mydf, id ~ variable + type, id.var = c("id", "type"))
#   id transactions_expense transactions_income amount_expense amount_income
# 1 20                   25                  20             95           100
# 2 30                   45                  50            250           300

You can also use base R's reshape:

reshape(mydf, direction = "wide", idvar = "id", timevar = "type")
#   id transactions.income amount.income transactions.expense amount.expense
# 1 20                  20           100                   25             95
# 3 30                  50           300                   45            250
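
If underscore-separated names are preferred over the dotted ones, base R's reshape takes a sep argument that is used when building the wide column names. A minimal sketch, assuming mydf holds the example data from the question (it is rebuilt here so the snippet runs on its own):

```r
# Rebuild the example data from the question (assumed to match mydf)
mydf <- data.frame(
  id = c(20, 20, 30, 30),
  type = c("income", "expense", "income", "expense"),
  transactions = c(20, 25, 50, 45),
  amount = c(100, 95, 300, 250)
)

# sep = "_" joins each value column name and the type with an underscore
wide <- reshape(mydf, direction = "wide", idvar = "id",
                timevar = "type", sep = "_")
names(wide)
```

The names still come out as variable_type (for example transactions_income), so getting type_variable order would need a renaming step afterwards.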

Or, you can melt and dcast, like this (here with "data.table"):

library(data.table)
library(reshape2)
dcast.data.table(melt(as.data.table(mydf), id.vars = c("id", "type")), 
                 id ~ variable + type, value.var = "value")
#    id transactions_expense transactions_income amount_expense amount_income
# 1: 20                   25                  20             95           100
# 2: 30                   45                  50            250           300

Later versions of "data.table" (1.9.8) should let dcast.data.table do this directly. If I understand correctly, what @Arun is implementing would reshape the data without first having to melt it. That melting step is what currently happens with recast, which is essentially a wrapper for a melt + dcast sequence of operations.
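
As a rough illustration of that direct form, here is a sketch that assumes a data.table release recent enough for dcast to accept several value.var columns at once. The mydf construction is an assumption that mirrors the question's data.

```r
library(data.table)

# Rebuild the example data from the question (assumed to match mydf)
mydf <- data.frame(
  id = c(20, 20, 30, 30),
  type = c("income", "expense", "income", "expense"),
  transactions = c(20, 25, 50, 45),
  amount = c(100, 95, 300, 250)
)

# Cast both value columns in one call, with no explicit melt step
wide <- dcast(as.data.table(mydf), id ~ type,
              value.var = c("transactions", "amount"))
names(wide)
```

Each new column is named from the value column and the type, such as transactions_income and amount_expense.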


And, for thoroughness, here's the tidyr approach:

library(dplyr)
library(tidyr)
mydf %>% 
  gather(var, val, transactions:amount) %>% 
  unite(var2, type, var) %>% 
  spread(var2, val)
#   id expense_amount expense_transactions income_amount income_transactions
# 1 20             95                   25           100                  20
# 2 30            250                   45           300                  50
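
Newer tidyr releases (1.0.0 and later, as far as I know) supersede gather and spread with pivot_wider, which can handle both value columns in one call. A minimal sketch under that assumption, with mydf rebuilt from the question's data and names_glue used to put the type first in each name:

```r
library(tidyr)

# Rebuild the example data from the question (assumed to match mydf)
mydf <- data.frame(
  id = c(20, 20, 30, 30),
  type = c("income", "expense", "income", "expense"),
  transactions = c(20, 25, 50, 45),
  amount = c(100, 95, 300, 250)
)

# Spread both value columns at once; names_glue builds type_variable names
wide <- pivot_wider(mydf,
                    names_from = type,
                    values_from = c(transactions, amount),
                    names_glue = "{type}_{.value}")
names(wide)
```

This yields names in the income_transactions style that the question asked for, without a separate unite step.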
