Pandas P&L rollup to the next business day

Problem description

I'm having a hard time doing this efficiently. I have some stocks and daily P&L info in a DataFrame. In reality, I have millions of rows of data, so efficiency matters a lot! The DataFrame looks like this:

--------------------------------
| Date       | Security | P&L  |
--------------------------------
| 2016-01-01 | AAPL     | 100  |
--------------------------------
| 2016-01-02 | AAPL     | 200  |
--------------------------------
| 2016-01-03 | AAPL     | 300  |
--------------------------------
| 2016-01-04 | AAPL     | -200 |
--------------------------------

All I want to do is roll the P&L over to the next business day (excluding all US holidays and weekends), so that the resulting DataFrame looks like this:

-------------------------------
| Date       | Security | P&L |
-------------------------------
| 2016-01-04 | AAPL     | 400 |
-------------------------------

I'm looking for an efficient way to achieve this. I have thousands of securities and over 5 years of data to process, so brute force won't work, unfortunately!

Thanks in advance. I'd highly appreciate any pointers on this!

Recommended answer

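We can build a DataFrame of business dates, then use merge_asof with direction='forward' to map each row's Date to the first business date on or after it. Grouping on Security and that business date then gives the sums.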

import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar

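# merge_asof needs 'Date' to be a datetime column sorted in ascending order.
# If that is not already the case:
#df['Date'] = pd.to_datetime(df.Date)
#df = df.sort_values('Date')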
date_min = '2015-01-01'
date_max = '2016-12-31'

cal = USFederalHolidayCalendar()
holidays = cal.holidays(date_min, date_max).tolist()
df2 = pd.DataFrame({'bdate': pd.bdate_range(date_min, date_max, 
                                            holidays=holidays, freq='C')})

res = pd.merge_asof(df, df2, left_on='Date', right_on='bdate', direction='forward')
#        Date Security  P&L      bdate
#0 2016-01-01     AAPL  100 2016-01-04
#1 2016-01-02     AAPL  200 2016-01-04
#2 2016-01-03     AAPL  300 2016-01-04
#3 2016-01-04     AAPL -200 2016-01-04

res.groupby(['Security', 'bdate'])['P&L'].sum()
#Security  bdate     
#AAPL      2016-01-04    400
