Using multiple variables in plyr
Question
I am trying to use plyr but am having difficulty using several variables at once. Here is an example.
df <- read.table(header=TRUE, text="
Firm Foreign SME Turnover
A1 N Y 200
A2 N N 1000
A3 Y Y 100
A1 N N 500
A2 Y Y 200
A3 Y Y 1000
A1 Y N 200
A2 N N 1000
A2 N Y 100
A2 N Y 200 ")
I am trying to create a table that summarizes Turnover by the two variables, essentially combining the results of the following two calls:
t1 <- ddply(df, c('Firm', 'Foreign'), summarise,
BudgetForeign = sum(Turnover, na.rm = TRUE))
t2 <- ddply(df, c('Firm', 'SME'), summarise,
BudgetSME = sum(Turnover, na.rm = TRUE))
into the following result:
res <- read.table(header=TRUE, text="
Firm A1 A2 A3
BudgetForeign 200 200 1100
BudgetSME 200 500 1100")
res
How can I achieve this without running several separate operations and then subsetting and combining the results afterwards?
Thanks in advance.
Recommended answer
I think you only want the values where Foreign or SME is 'Y'. If that's the case, I would use melt and dcast from the reshape2 package rather than plyr.
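library(reshape2)
# Reshape to long format: Foreign and SME become rows of variable/value
df.m <- melt(df, id.vars=c('Firm', 'Turnover'))
# Keep only the 'Y' rows, then sum Turnover for each variable and Firm
dcast(df.m[df.m$value=='Y',], variable ~ Firm, value.var='Turnover', fun.aggregate=sum)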
variable A1 A2 A3
1 Foreign 200 200 1100
2 SME 200 500 1100
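For readers who would rather stay within plyr, as the question asks, the same totals can probably be obtained in a single ddply call by subsetting Turnover inside summarise. This is a minimal sketch and not part of the original answer; the names res and out are illustrative, and the final transpose is only there to match the layout requested in the question.

```r
library(plyr)

df <- read.table(header = TRUE, text = "
Firm Foreign SME Turnover
A1 N Y 200
A2 N N 1000
A3 Y Y 100
A1 N N 500
A2 Y Y 200
A3 Y Y 1000
A1 Y N 200
A2 N N 1000
A2 N Y 100
A2 N Y 200")

# One row per Firm; each column sums Turnover only where the flag is 'Y'
res <- ddply(df, "Firm", summarise,
             BudgetForeign = sum(Turnover[Foreign == "Y"], na.rm = TRUE),
             BudgetSME     = sum(Turnover[SME == "Y"], na.rm = TRUE))

# Transpose so that firms become columns, as in the requested table
out <- t(res[, -1])
colnames(out) <- res$Firm
out
```

Because both sums are computed within the same grouping by Firm, no later subsetting or merging of intermediate tables is needed.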
If you also want to see the difference between Y and N, you can add the value column to the formula in dcast:
dcast(df.m, variable + value ~ Firm, value.var='Turnover', fun.aggregate=sum)
variable value A1 A2 A3
1 Foreign N 700 2300 0
2 Foreign Y 200 200 1100
3 SME N 700 2000 0
4 SME Y 200 500 1100
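If neither plyr nor reshape2 is available, base R's xtabs should give the same Y/N breakdown, since it sums the left-hand side of the formula over each combination of the right-hand factors. The following is only a sketch under that assumption; foreign_tab, sme_tab and breakdown are illustrative names that do not appear in the original answer.

```r
df <- read.table(header = TRUE, text = "
Firm Foreign SME Turnover
A1 N Y 200
A2 N N 1000
A3 Y Y 100
A1 N N 500
A2 Y Y 200
A3 Y Y 1000
A1 Y N 200
A2 N N 1000
A2 N Y 100
A2 N Y 200")

# Sum Turnover for every (flag value, Firm) combination
foreign_tab <- xtabs(Turnover ~ Foreign + Firm, data = df)
sme_tab     <- xtabs(Turnover ~ SME + Firm, data = df)

# Stack the two tables and label each row with its variable and value
breakdown <- rbind(foreign_tab, sme_tab)
rownames(breakdown) <- paste(rep(c("Foreign", "SME"), each = 2),
                             rownames(breakdown))
breakdown
```

Combinations that never occur in the data, such as firm A3 with Foreign equal to N, appear as 0 here, just as in the dcast output.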