Count number of rows for each ID within 1 year
Question
I have a pandas dataframe something like this:
Date ID
01/01/2016 a
05/01/2016 a
10/05/2017 a
05/05/2014 b
07/09/2014 b
12/08/2017 b
What I need to do is add a column showing the number of entries for each ID that occurred within the previous year, and another column showing the number within the following year. I've written some horrible code that iterates through the whole dataframe (millions of rows) and does the computation, but there must be a better way!
Recommended answer
I think you need between with boolean indexing to filter first, and then groupby and aggregate with size. Then concat the two outputs and add reindex to restore the missing IDs, filled with 0:
import pandas as pd

print (df)
Date ID
0 01/01/2016 a
1 05/01/2016 a
2 10/05/2017 a
3 05/05/2018 b
4 07/09/2014 b
5 07/09/2014 c
6 12/08/2018 b
# convert to datetime (the day comes first in these strings, hence dayfirst=True)
df['Date'] = pd.to_datetime(df['Date'], dayfirst=True)
# pd.datetime was removed in pandas 2.0; pd.Timestamp.today() replaces it
# note: the counts printed further down depend on the day the code is run
now = pd.Timestamp.today()
print (now)
one_year_before_now = now - pd.offsets.DateOffset(years=1)
one_year_after_now = now + pd.offsets.DateOffset(years=1)
# filter first, then count the rows per ID
a = df[df['Date'].between(one_year_before_now, now)].groupby('ID').size()
b = df[df['Date'].between(now, one_year_after_now)].groupby('ID').size()
print (a)
ID
a 1
dtype: int64
print (b)
ID
b 2
dtype: int64
df1 = pd.concat([a,b],axis=1).fillna(0).astype(int).reindex(df['ID'].unique(),fill_value=0)
print (df1)
0 1
a 1 0
b 0 2
c 0 0
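The filter, count, and combine steps can be collected into one self-contained sketch. The fixed reference date 2018-01-01 is an assumption made here so that the result is reproducible (the answer itself uses today's date), and the column names last and next are borrowed from the second snippet in this answer.

```python
import pandas as pd

# Sample data copied from the answer
df = pd.DataFrame({
    "Date": ["01/01/2016", "05/01/2016", "10/05/2017",
             "05/05/2018", "07/09/2014", "07/09/2014", "12/08/2018"],
    "ID": ["a", "a", "a", "b", "b", "c", "b"],
})
df["Date"] = pd.to_datetime(df["Date"], dayfirst=True)

# Assumption: a fixed reference date instead of pd.Timestamp.today()
now = pd.Timestamp("2018-01-01")
year = pd.DateOffset(years=1)

# Boolean mask with between (inclusive on both ends), then count rows per ID
last = df[df["Date"].between(now - year, now)].groupby("ID").size()
nxt = df[df["Date"].between(now, now + year)].groupby("ID").size()

# Combine both counts; IDs missing from either window become 0
counts = (
    pd.concat([last, nxt], axis=1, keys=["last", "next"])
    .reindex(df["ID"].unique())
    .fillna(0)
    .astype(int)
)
print(counts)
```

With this reference date, ID a has one row in the previous year, ID b has two rows in the following year, and ID c has none in either window.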
If you need to compare each date against a reference date per group (the code below uses the last date in each group, x.iat[-1]), plus or minus a one-year offset, you need a custom function that applies the condition and sums the True values:
offs = pd.offsets.DateOffset(years=1)
# 'last': dates later than (reference - 1 year); 'next': dates earlier than (reference + 1 year)
f = lambda x: pd.Series([(x > x.iat[-1] - offs).sum(),
                         (x < x.iat[-1] + offs).sum()], index=['last','next'])
df = df.groupby('ID')['Date'].apply(f).unstack(fill_value=0).reset_index()
print (df)
ID last next
0 a 1 3
1 b 2 3
2 c 1 1
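The question asks for per-row columns, which neither snippet in this answer produces directly. One possible sketch, not taken from the original answer, sorts each group's dates and uses numpy.searchsorted to count neighbours in each window. The helper name count_neighbours and the column names prev_year and next_year are hypothetical, and the choice of inclusive windows that exclude the row itself is an assumption.

```python
import numpy as np
import pandas as pd

# Sample data copied from the answer
df = pd.DataFrame({
    "Date": ["01/01/2016", "05/01/2016", "10/05/2017",
             "05/05/2018", "07/09/2014", "07/09/2014", "12/08/2018"],
    "ID": ["a", "a", "a", "b", "b", "c", "b"],
})
df["Date"] = pd.to_datetime(df["Date"], dayfirst=True)

year = pd.DateOffset(years=1)

def count_neighbours(dates: pd.Series) -> pd.DataFrame:
    """Hypothetical helper: for each date, count the other dates of the
    same group within one year before and one year after (inclusive)."""
    ordered = np.sort(dates.to_numpy())
    current = dates.to_numpy()
    lower = (dates - year).to_numpy()
    upper = (dates + year).to_numpy()
    # Rows in [date - 1 year, date], minus the row itself
    prev_year = (np.searchsorted(ordered, current, side="right")
                 - np.searchsorted(ordered, lower, side="left") - 1)
    # Rows in [date, date + 1 year], minus the row itself
    next_year = (np.searchsorted(ordered, upper, side="right")
                 - np.searchsorted(ordered, current, side="left") - 1)
    return pd.DataFrame({"prev_year": prev_year, "next_year": next_year},
                        index=dates.index)

# One small frame per ID, aligned back to the original rows by index
parts = [count_neighbours(group) for _, group in df.groupby("ID")["Date"]]
result = df.join(pd.concat(parts))
print(result)
```

Because each group is sorted once and every lookup is a binary search, this avoids the row-by-row Python loop the question describes. Rows with identical dates in the same group count each other in both columns under this definition.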