Casting multiple value.var controlled by fun.aggregate
Problem description
I have the following dataset
client_id <- c("A", "A", "B", "B", "B", "B", "B", "A", "A", "B", "B")
value <- c(10, 35, 20, 30, 50, 40, 30, 40, 30, 40, 10)
period_30 <- c(1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0)
period_60 <- c(1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
sign <- c("D", "D", "D", "D", "C", "C", "C", "D", "D", "D", "D")
data <- data.frame(client_id, value, period_30, period_60, sign)
I can count the rows in each split for a given period with the code below:
library(data.table)
test <- dcast(setDT(data), client_id ~ paste0("period_30", sign), value.var = "period_30", fun.aggregate = sum)
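To make clear what that call aggregates, here is a minimal sketch that rebuilds the same toy data and computes the counts in both long and wide form. The names dt, counts_long, counts_wide and n_period_30 are illustrative and not part of the original post.

```r
library(data.table)

# Same toy data as above, restricted to the columns the count needs.
dt <- data.table(
  client_id = c("A", "A", "B", "B", "B", "B", "B", "A", "A", "B", "B"),
  period_30 = c(1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0),
  sign      = c("D", "D", "D", "D", "C", "C", "C", "D", "D", "D", "D")
)

# Long form: one row per (client_id, sign) group, summing the 0/1 flag.
counts_long <- dt[, .(n_period_30 = sum(period_30)), by = .(client_id, sign)]

# Wide form: the same sums, with one column per sign value.
counts_wide <- dcast(dt, client_id ~ paste0("period_30", sign),
                     value.var = "period_30", fun.aggregate = sum)
```

Because the flag is 0 or 1, summing it counts the rows that fall inside the period; dcast only spreads those per-group sums across columns.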
But I would also like to calculate the average value for each split.
The expected outcome would look like this:
client_id  av.value_period_30_sign_D  av.value_period_60_sign_D  av.value_period_30_sign_C  av.value_period_60_sign_C
A          34.16667                   NaN                        NaN                        NaN
B          30.00000                   34.16667                   NaN                        27.50000
It should also be extendable to additional splits, such as the average value for sign X and type X in period 1.
I am not sure whether the desired output is doable with this approach, but I was looking at the fun.aggregate argument. Perhaps it could be used in combination with multiple value.var arguments?
Update: Joel's code answers the first part of the question.
client_id sign period_30 period_60
A D 34.16667 34.16667
B D 30.00000 34.16667
B C NaN 27.50000
But how do I transpose the variables and assign the names as per the splits automatically?
Solution
Another method (which would be faster) uses data.table.
Based on the edit made to the question (I hope the code is self-explanatory now):
library(data.table)
# For each (client_id, sign) group, average `value` over the rows where the period flag is 1
data1 <- setDT(data)[, lapply(.SD, function(x) mean(value[x == 1])),
                     .SDcols = period_30:period_60,
                     by = .(client_id, sign)]
# `dcast` is also from the `data.table` package
dcast(data1, client_id ~ sign, drop = FALSE, value.var = c("period_30", "period_60"))
# client_id period_30_C period_30_D period_60_C period_60_D
#1: A NA 34.16667 NA 34.16667
#2: B NaN 30.00000 27.5 34.16667
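As a possible alternative for the "assign the names automatically" part, here is a minimal sketch (not the answerer's method) that first melts the period flags into long form and then lets dcast build the column names from the formula. The names dt, long, wide, period and flag are illustrative assumptions.

```r
library(data.table)

# Toy data from the question, rebuilt so the sketch is self-contained.
dt <- data.table(
  client_id = c("A", "A", "B", "B", "B", "B", "B", "A", "A", "B", "B"),
  value     = c(10, 35, 20, 30, 50, 40, 30, 40, 30, 40, 10),
  period_30 = c(1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0),
  period_60 = c(1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0),
  sign      = c("D", "D", "D", "D", "C", "C", "C", "D", "D", "D", "D")
)

# Stack the period flags: one row per original row and period column.
long <- melt(dt,
             id.vars = c("client_id", "value", "sign"),
             measure.vars = c("period_30", "period_60"),
             variable.name = "period", value.name = "flag")

# Keep only rows that fall inside the period, then average `value`
# for every period/sign combination; dcast names the columns itself.
wide <- dcast(long[flag == 1], client_id ~ period + sign,
              value.var = "value", fun.aggregate = mean)
```

Under this layout, a further split (for example a hypothetical type column) would only need to be added to id.vars and to the right-hand side of the formula; combinations that never occur inside a period are dropped unless drop = FALSE is passed.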