Extract Business Days in Time Series using Python/Pandas

Views: 526 Published: 2020/5/24 0:32:59 Tags: python pandas time-series

Problem description

I am working with high-frequency time series data and would like to extract all the business days from it. The observations are one second apart, so there are 86,400 per day, and the data set spans 31 days (2,678,400 observations in total).

Here is part of my data:

In[1]: ts
Out[1]: 
2013-01-01 00:00:00    0.480928
2013-01-01 00:00:01    0.480928
2013-01-01 00:00:02    0.483977
2013-01-01 00:00:03    0.486725
2013-01-01 00:00:04    0.486725
...
2013-01-31 23:59:56    0.451630
2013-01-31 23:59:57    0.451630
2013-01-31 23:59:58    0.451630
2013-01-31 23:59:59    0.454683
Freq: S, Length: 2678400

What I would like to do is create a new time series that contains only the business days of this month, keeping every per-second observation for those days. For example, if 2013-01-02 (Wed) through 2013-01-04 (Fri) are the first business days of the first week of January, then:

2013-01-02 00:00:00    0.507477
2013-01-02 00:00:01    0.501373
...
2013-01-03 00:00:00    0.489778
2013-01-03 00:00:01    0.489778
...
2013-01-04 23:59:58    0.598115
2013-01-04 23:59:59    0.598115
Freq: S, Length: 259200

So it will of course exclude all the data on Sat 2013-01-05 and Sun 2013-01-06, since these are weekend days, and so on.
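To make the desired result concrete, here is a minimal sketch of one way such a selection could be expressed: a boolean mask on the index's day of week. It is not taken from the original post. The series below is synthetic (random values for the first seven days of January 2013), and the names `weekday_mask`, `weekdays_only` and `days_kept` are made up for illustration. Note that a plain weekday mask keeps Tuesday 2013-01-01; dropping holidays would need a holiday list on top of this.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the poster's data: one value per second for the
# first seven days of January 2013 (random numbers, not the real series).
index = pd.date_range("2013-01-01", "2013-01-07 23:59:59", freq="s")
ts = pd.Series(np.random.default_rng(0).random(len(index)), index=index)

# dayofweek is 0 for Monday through 6 for Sunday, so values below 5 are weekdays.
weekday_mask = ts.index.dayofweek < 5
weekdays_only = ts[weekday_mask]

# Every per-second row of each kept day survives; nothing is averaged.
days_kept = weekdays_only.index.normalize().unique()
print(days_kept)
print(len(weekdays_only))
```

The mask is computed on the whole index at once, so no Python-level loop over days or rows is involved.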

I tried some built-in pandas commands but could not find the right one, because they aggregate by day without taking into account that each day contains many sub-observations. That is, there is a value for each second, and the values should not be averaged, just collected into a new series.

For example, I tried:

ts.asfreq(BDay()) --> finds the business days but averages over each day
ts.resample() --> you must define 'how' (mean, max, min...)
ts.groupby(lambda x : x.weekday) --> does not work either!
ts = pd.Series(df, index = pd.bdate_range(start = '2013/01/01 00:00:00', end = '2013/01/31 23:59:59' , freq = 'S')) --> df, since the original data is a DataFrame. Using pd.bdate_range didn't help, since df and index must have the same dimensions.
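For reference, a small sketch of what two of the calls listed above appear to do, run on synthetic data (the values are just a running counter, not the poster's series). As far as I can tell, `asfreq` does not average: it keeps the single observation found at each new timestamp. Grouping by weekday labels the rows but does not drop any. The names `per_bday` and `group_sizes` are illustrative only.

```python
import numpy as np
import pandas as pd
from pandas.tseries.offsets import BDay

# Synthetic per-second series for 2013-01-01 .. 2013-01-07; each value is
# simply the row's position, which makes it easy to see which rows are kept.
index = pd.date_range("2013-01-01", "2013-01-07 23:59:59", freq="s")
ts = pd.Series(np.arange(len(index), dtype=float), index=index)

# asfreq re-indexes at the new frequency: one row per business day, holding
# the value observed at that exact timestamp (midnight here), not a mean.
per_bday = ts.asfreq(BDay())

# Grouping by weekday number gives one group per weekday, each still holding
# all of its per-second rows; it partitions the data rather than filtering it.
group_sizes = ts.groupby(ts.index.weekday).size()

print(per_bday)
print(group_sizes)
```

Either way, neither call alone returns the per-second rows restricted to business days, which matches the difficulty described in the question.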

I searched the pandas documentation and googled, but could not find a clue.
Does anybody have an idea?

Thank you very much for your help!

Thanks!

P.S. I would rather not use loops for this, since my data set is very large (I also have other months to analyse).
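A loop-free selection that can be reused across months might look like the sketch below. It is an assumption-laden illustration, not the accepted answer: `keep_business_days` and its `holidays` parameter are hypothetical names, the data is synthetic, and treating 2013-01-01 as a holiday is inferred from the example in the question, where the first business day is 2013-01-02.

```python
import numpy as np
import pandas as pd


def keep_business_days(series, holidays=()):
    """Return the rows of `series` whose date is a business day.

    Illustrative helper (not from the post). `holidays` is an optional
    iterable of dates to exclude in addition to weekends.
    """
    start = series.index.min().normalize()
    end = series.index.max().normalize()
    # freq="C" (custom business day) lets bdate_range skip the given holidays.
    bdays = pd.bdate_range(start, end, freq="C", holidays=list(holidays))
    # Compare each timestamp's date (normalized to midnight) with the
    # business-day list in one vectorized membership test.
    return series[series.index.normalize().isin(bdays)]


# Synthetic per-second data for 2013-01-01 .. 2013-01-07 (random values).
index = pd.date_range("2013-01-01", "2013-01-07 23:59:59", freq="s")
ts = pd.Series(np.random.default_rng(0).random(len(index)), index=index)

business = keep_business_days(ts, holidays=["2013-01-01"])
print(business.index.min(), business.index.max(), len(business))
```

For other months, the same function could be called on each month's series with the relevant holiday list; the work is done by vectorized index operations rather than a per-row loop.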
