Running sum on a column conditional on value


Question

I have a vector of binary flags that state whether a product is on promotion in each period. I'm trying to work out how to calculate the duration of each promotion and the duration between promotions.

promo.flag = c(1,1,0,1,0,0,1,1,1,0,1,1,0)

In other words: if promo.flag is the same as in the previous period, then running.total increases by 1; otherwise running.total is reset to 1.

I've tried playing with the apply functions and cumsum, but I can't get the conditional reset of the running total to work :-(

The output I need is:

promo.flag =  c(1,1,0,1,0,0,1,1,1,0,1,1,0)
rolling.sum = c(1,2,1,1,1,2,1,2,3,1,1,2,0)

Can anybody shed any light on how to achieve this in R?

Recommended answer

It sounds like you need run-length encoding (via the rle function in base R).

unlist(sapply(rle(promo.flag)$lengths,seq))

This gives you the vector 1 2 1 1 1 2 1 2 3 1 1 2 1. I'm not sure what you're going for with the zero at the end, but I assume it's a terminal condition, which is easy to change after the fact.
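As a rough sketch of the same idea, base R's sequence() builds the concatenated 1..n runs directly, and the run lengths themselves answer the original question about how long each promotion and each gap lasts. The names runs, promo.durations, gap.durations and rolling.sum.terminal are illustrative only, and the last step assumes the trailing zero in the desired output really is a special case for the final period.

```r
promo.flag <- c(1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0)

runs <- rle(promo.flag)

# Running count within each run; same values as unlist(sapply(..., seq))
rolling.sum <- sequence(runs$lengths)

# Length of each promotion (runs of 1) and of each non-promotion stretch
# (runs of 0, including any leading or trailing run)
promo.durations <- runs$lengths[runs$values == 1]
gap.durations   <- runs$lengths[runs$values == 0]

# Assumption: the question's trailing 0 marks the final period as a special case
rolling.sum.terminal <- rolling.sum
rolling.sum.terminal[length(rolling.sum.terminal)] <- 0
```

If the trailing zero is not wanted, rolling.sum can be used as is.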

This works because rle() returns a list with two components, one of which is named lengths and holds the length of each run of repeated values. seq, when fed a single integer, gives you a sequence from 1 to that number. sapply then calls seq once for each number in rle()$lengths, generating a list of mini sequences, and unlist turns that list into a single vector.
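The steps described above can be traced one at a time. This is only a sketch: it swaps in lapply and seq_len for sapply and seq, because sapply may simplify its result to a matrix when every run happens to have the same length, whereas lapply always returns a list. The names runs, mini.sequences and result are illustrative.

```r
promo.flag <- c(1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0)

# Step 1: run-length encode the flags
runs <- rle(promo.flag)
runs$lengths   # 2 1 1 2 3 1 2 1
runs$values    # 1 0 1 0 1 0 1 0

# Step 2: one mini sequence 1..n per run, kept as a list
mini.sequences <- lapply(runs$lengths, seq_len)

# Step 3: flatten the list into a single vector
result <- unlist(mini.sequences)
```

For this input the result has the same values as the one-liner in the answer.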
