Counting the number of consecutive occurrences of numbers in a dataframe with a multi-index


Problem description

I have a dataframe with a multi-index (stock and datetime) and a dummy column that contains 1s and 0s. For each stock and each day, I would like to count, row by row, how many consecutive times the current value has occurred in the 'Dummy' column. The count restarts at 1 for every new run, counting up (1, 2, 3, ...) for 1s and down (-1, -2, -3, ...) for 0s. In the example below, the 'Counter' column shows the result I would like to create:

import pandas as pd
df = pd.DataFrame(  {
'stock': ['AAPL', 'AAPL', 'AAPL','AAPL', 'AAPL','AAPL', 'AAPL', 'MSFT', 'MSFT'], 
'datetime': ['2015-01-02 20:57', '2015-01-02 20:58', '2015-01-02 20:59', '2015-01-02 21:00','2015-01-03 20:57', '2015-01-03 20:58', '2015-01-03 20:59','2015-01-02 20:57', '2015-01-02 20:58'],
'Dummy': [0, 0, 1, 1, 1,1, 0, 1, 1],
'Counter': [-1, -2, 1, 2, 1, 2, -1, 1, 2]})
df['datetime'] = pd.to_datetime(df['datetime'])
df.set_index(['stock', 'datetime'], inplace =True)

A simpler version of this problem, which ignores the tickers and dates, was answered here:

Counting the number of consecutive occurrences of numbers in a dataframe
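
For context, the ungrouped version of the problem is commonly handled by labelling each run of identical values and then numbering the rows within each run. The following is a minimal sketch of that idea, not a quotation of the linked answer; the Series `s` is made-up illustration data.

```python
import numpy as np
import pandas as pd

# Hypothetical data: a single column with no stock or date information.
s = pd.Series([0, 0, 1, 1, 1, 0], name="Dummy")

# A new run id starts whenever the value differs from the previous row.
run_id = s.diff().ne(0).cumsum()

# Number the rows inside each run, starting at 1.
position = s.groupby(run_id).cumcount() + 1

# Count up for 1s and down for 0s.
counter = position * np.where(s == 0, -1, 1)
print(counter.tolist())  # [-1, -2, 1, 2, 3, -1]
```

Because this looks only at the values, a run that continues across a change of day or stock would keep counting, which is the gap this question is about.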

Recommended answer

Just slightly modify your previous solution by adding the stock and the calendar date to the groupby keys:

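import numpy as np

# Run id: increments whenever Dummy differs from the previous row
m = df.Dummy.diff().ne(0).cumsum()
# Number the rows within each (stock, calendar day, run) group, starting at 1
counters = df.groupby([df.index.get_level_values(0), 
                       df.index.get_level_values(1).date, 
                       m]).cumcount()+1
# Negate the count for runs of 0s
df['Counter'] = np.where(df['Dummy']==0, -1, 1) * counters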

Out[95]:
                           Dummy  Counter
stock datetime
AAPL  2015-01-02 20:57:00      0       -1
      2015-01-02 20:58:00      0       -2
      2015-01-02 20:59:00      1        1
      2015-01-02 21:00:00      1        2
      2015-01-03 20:57:00      1        1
      2015-01-03 20:58:00      1        2
      2015-01-03 20:59:00      0       -1
MSFT  2015-01-02 20:57:00      1        1
      2015-01-02 20:58:00      1        2
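
To see why this works, it can help to look at the intermediate run ids. The following is a self-contained sketch that rebuilds the example frame, exposes the run id as a column, and computes the counter. The helper name `signed_run_counter` and the extra `run_id` column are illustrative assumptions, not part of the answer above.

```python
import numpy as np
import pandas as pd


def signed_run_counter(df, col="Dummy"):
    """Signed run length of `col`, restarting for every (stock, day) pair.

    Assumes a two-level index: level 0 is the stock, level 1 holds datetimes.
    (Hypothetical helper, written for illustration only.)
    """
    # A new run id starts whenever the value differs from the previous row.
    run_id = df[col].diff().ne(0).cumsum()
    stock = df.index.get_level_values(0)
    day = df.index.get_level_values(1).date
    # Position within each (stock, day, run) group, starting at 1.
    position = df.groupby([stock, day, run_id]).cumcount() + 1
    # +1 for rows holding a 1, -1 for rows holding a 0.
    sign = np.where(df[col] == 0, -1, 1)
    return position * sign


df = pd.DataFrame({
    "stock": ["AAPL"] * 7 + ["MSFT"] * 2,
    "datetime": pd.to_datetime([
        "2015-01-02 20:57", "2015-01-02 20:58", "2015-01-02 20:59",
        "2015-01-02 21:00", "2015-01-03 20:57", "2015-01-03 20:58",
        "2015-01-03 20:59", "2015-01-02 20:57", "2015-01-02 20:58",
    ]),
    "Dummy": [0, 0, 1, 1, 1, 1, 0, 1, 1],
}).set_index(["stock", "datetime"])

# Shown only to make the grouping visible; the helper recomputes it itself.
df["run_id"] = df["Dummy"].diff().ne(0).cumsum()
df["Counter"] = signed_run_counter(df)
print(df)
```

The run id is computed over the whole column, so the block of four 1s for AAPL shares a single id across 2015-01-02 and 2015-01-03. Including the stock and the date in the groupby keys is what splits that block into two runs of length 2.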
