R data.table cumulative sum function
Problem description
I have created the following reproducible example:
library(data.table)
Col_1 <- 0.05
Col_2 <- c( "B", "A", "C", "B", "C", "A", "C", "B", "B", "A" )
Col_3 <- 1000
Col_4 <- ""
data <- data.frame( Col_1, Col_2, Col_3, Col_4 )
mydata.table <- as.data.table( data )[ , list( Col_1, Col_2, Col_3, Col_4 ) ]
Col1 <- "Col_1"; Col2 <- "Col_2"; Col3 <- "Col_3"; Col4 <- "Col_4"
mydata.table[, (Col4) := ifelse( get( Col2 ) == "A" , get( Col1 ) * get( Col3 ), "0" ) ]
mydata.table[ , (Col3) := cumsum( c( 1000, head( Col4, -1 )))]
My problem is that Col3 is not calculating the cumsum correctly and remains static at 1000. I have adapted my code from other answers on this site but need a little help please. I would like Col3 to start at 1000 then cumulatively add Col4 (lagging by one row above).
I would like the output to show the following:
Col_1 <- 0.05
Col_2 <- c( "B", "A", "C", "B", "C", "A", "C", "B", "B", "A")
Col_3 <- c( 1000.0, 1000.0, 1050.0, 1050.0, 1050.0, 1050.0, 1102.5, 1102.5, 1102.5, 1102.5 )
Col_4 <- c( 0, 50.0, 0, 0, 0, 52.5, 0, 0, 0, 55.1 )
good_data <- data.frame( Col_1, Col_2, Col_3, Col_4 )
gooddata.table <- as.data.table( good_data )[ , list( Col_1, Col_2, Col_3, Col_4 )]
Would this need to be calculated in a loop as each column relies on the result of another? Thank you.
UPDATE to the example, based on the comments below and including new code thanks to @Frank's answer:
library(data.table)
Col_1 <- 0.05
Col_2 <- c( "B", "A", "C", "B", "C", "A", "C", "B", "B", "A" )
Col_3 <- 1000
Col_4 <- 0
mydata.table <- data.table(Col_1, Col_2, Col_3, Col_4)
Col1 <- "Col_1"; Col2 <- "Col_2"; Col3 <- "Col_3"; Col4 <- "Col_4"
mydata.table[, (Col3) := Col_3*cumprod(1+Col_1*shift(Col_2=="A", type="lag", fill=FALSE))]
mydata.table[, (Col4) := ifelse( get( Col2 ) == "A" , get( Col1 ) * get( Col3 ), "0" ) ]
Recommended answer
To get your desired output, try skipping creation of the intermediate object Col_4 and just doing:
mydata.table[, Col_3*cumprod(1 + Col_1*shift(Col_2 == "A", type = "lag", fill=FALSE))]
To understand how this works, try ?cumprod and ?shift. You can also run it in pieces, e.g.,
mydata.table[, shift(Col_2 == "A", type = "lag", fill = FALSE)]
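As a rough illustration of those pieces, here is a base-R sketch of the same calculation on the question's data. The names rate, start, flag, lagged, growth and balance are made up for this example, and the lag is written with head() as a stand-in for shift(..., type = "lag", fill = FALSE).

```r
# Illustrative names only; values are taken from the question.
rate  <- 0.05
start <- 1000
flag  <- c("B", "A", "C", "B", "C", "A", "C", "B", "B", "A") == "A"

# Lag the indicator by one row, putting FALSE in the first position
# (base-R stand-in for shift(flag, type = "lag", fill = FALSE)).
lagged <- c(FALSE, head(flag, -1))

# Growth factor per row: 1 + rate on the row after an "A", otherwise 1.
growth <- 1 + rate * lagged

# The running product of the growth factors scales the starting value.
balance <- start * cumprod(growth)
print(balance)
```

Each "A" row increases the following rows by a factor of 1 + rate, so the cumulative product reproduces the compounding without Col_3 having to read the previous row's Col_4.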
(I'm ignoring your issues with get mentioned in comments, as well as your overwriting of Col_3.)
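If both columns are wanted in the table, one possible way to put it together is sketched below. It is only a sketch under a few assumptions: the columns are referred to by name rather than through get(), the table is called dt for brevity, and a numeric 0 is used in ifelse() so that Col_4 stays numeric. The loop at the end is not needed for the result; it is there only as a cross-check that the closed form agrees with a row-by-row calculation.

```r
library(data.table)

# dt is an illustrative name; the data are the question's.
dt <- data.table(
  Col_1 = 0.05,
  Col_2 = c("B", "A", "C", "B", "C", "A", "C", "B", "B", "A"),
  Col_3 = 1000
)

# Running balance: grows by Col_1 on the row after each "A".
dt[, Col_3 := Col_3 * cumprod(1 + Col_1 * shift(Col_2 == "A", type = "lag", fill = FALSE))]

# Amount earned on "A" rows; numeric 0 (not "0") keeps the column numeric.
dt[, Col_4 := ifelse(Col_2 == "A", Col_1 * Col_3, 0)]

# Cross-check: the same balance computed with an explicit loop.
balance <- numeric(nrow(dt))
balance[1] <- 1000
for (i in seq_len(nrow(dt))[-1]) {
  prev_is_a <- dt$Col_2[i - 1] == "A"
  balance[i] <- balance[i - 1] * (1 + dt$Col_1[i - 1] * prev_is_a)
}

print(dt)
```

Computing Col_3 first and then deriving Col_4 from it avoids the circular dependency between the two columns.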