Count consecutive days in a Python (pandas) DataFrame


Question

I'm trying to group rows by runs of consecutive dates within each ID and count the length of each run.

ID     Date   
abc    2017-01-07  
abc    2017-01-08  
abc    2017-01-09  
abc    2017-12-09  
xyz    2017-01-05  
xyz    2017-01-06 
xyz    2017-04-15  
xyz    2017-04-16 

Expected result:

ID     Count
abc    3
abc    1
xyz    2
xyz    2

What I tried:

import pandas as pd

d = {'ID': ['abc', 'abc', 'abc', 'abc', 'xyz', 'xyz', 'xyz', 'xyz'], 'Date': ['2017-01-07','2017-01-08', '2017-01-09', '2017-12-09', '2017-01-05', '2017-01-06', '2017-04-15', '2017-04-16']}
df = pd.DataFrame(data=d)
df['Date'] = pd.to_datetime(df['Date'])

# Attempt: keep rows whose distance from a fixed "today" equals their rank within the ID
today = pd.to_datetime('2018-10-23')
x = df.sort_values('Date', ascending=False)
g = x.groupby(['ID'])
x[(today - x['Date']).dt.days == g.cumcount()].groupby(['ID']).size()

Is there a simple way to get the count for every run of consecutive dates, per ID?

Recommended answer

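Create a Series that holds the difference between consecutive Dates within each ID, and flag every row where that difference is not exactly 1 day. The first row of each ID has no previous date, so it is flagged as well. The cumulative sum of those flags gives each run of consecutive days its own label, so grouping by ID and that label and taking the size yields the counts. This relies on the rows being sorted by Date within each ID, as they are in the sample data.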

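import pandas as pd

# s increases by 1 at every row that starts a new run (gap != 1 day, or first row of an ID)
s = df.groupby('ID').Date.diff().dt.days.ne(1).cumsum()
# Count the rows in each (ID, run) pair, then drop the run label from the index
df.groupby(['ID', s]).size().reset_index(level=1, drop=True)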

Output:

ID
abc    3
abc    1
xyz    2
xyz    2
dtype: int64
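
For reference, here is a minimal self-contained sketch of the same diff / not-equal / cumulative-sum idea, rebuilt from the sample data in the question. The explicit sort, the names new_run, run_id and runs, and the extra Start and End columns are illustrative additions, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": ["abc", "abc", "abc", "abc", "xyz", "xyz", "xyz", "xyz"],
    "Date": pd.to_datetime([
        "2017-01-07", "2017-01-08", "2017-01-09", "2017-12-09",
        "2017-01-05", "2017-01-06", "2017-04-15", "2017-04-16",
    ]),
})

# diff() compares each row with the previous row of the same ID, so sort first.
df = df.sort_values(["ID", "Date"]).reset_index(drop=True)

# True wherever a new run starts: the first row of an ID (diff is NaT)
# or any row whose gap to the previous date is not exactly one day.
new_run = df.groupby("ID")["Date"].diff().ne(pd.Timedelta(days=1))

# The running total of run starts gives every run its own integer label.
run_id = new_run.cumsum().rename("run")

# One row per run: first date, last date, and number of days in the run.
runs = (
    df.groupby(["ID", run_id])["Date"]
      .agg(Start="min", End="max", Count="size")
      .reset_index()
      .drop(columns="run")
)
print(runs)
```

Keeping Start and End next to Count makes it easier to tell which run each count belongs to when an ID appears more than once in the result.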
