How to split a pandas dataframe or series by day (possibly using an iterator)

Question

I have a long time series, e.g.:

import pandas as pd
index=pd.date_range(start='2012-11-05', end='2012-11-10', freq='1S').tz_localize('Europe/Berlin')
df=pd.DataFrame(range(len(index)), index=index, columns=['Number'])

Now I want to extract the sub-DataFrame for each day, to get the following output:

df_2012-11-05: data frame with all data referring to day 2012-11-05
df_2012-11-06: etc.
df_2012-11-07
df_2012-11-08
df_2012-11-09
df_2012-11-10

What is the most efficient way to do this without checking index.date == given_date for every day, which is very slow? Also, the user does not know a priori the range of days in the frame.

Any hints on how to do this with an iterator?

My current solution is below, but it is not very elegant and has the two issues listed after the code:

import numpy as np

time_zone='Europe/Berlin'
# find all days
a=np.unique(df.index.date) # this can take a lot of time
a.sort()
results=[]
for i in range(len(a)-1):
    day_now=pd.Timestamp(a[i]).tz_localize(time_zone)
    day_next=pd.Timestamp(a[i+1]).tz_localize(time_zone)
    results.append(df[day_now:day_next]) # how to select if I do not want day_next included?

# last day
results.append(df[day_next:])

This approach has the following problems:

  • a = np.unique(df.index.date) can take a lot of time

  • df[day_now:day_next] includes day_next, but I need it excluded from the range

Recommended answer

Perhaps groupby?

DFList = []
for group in df.groupby(df.index.day):
    DFList.append(group[1])

This should give you a list of data frames, where each data frame holds one day of data.

Or in one line:

DFList = [group[1] for group in df.groupby(df.index.day)]
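Since the question asks about an iterator: a groupby object is itself iterable and yields (key, sub-frame) pairs, so the list does not have to be built up front. The following is a minimal sketch of that idea, not a definitive implementation. The name iter_days is made up for illustration, and the index uses a 30-minute frequency instead of one second only to keep the example small. It groups on df.index.normalize() rather than df.index.day, on the assumption that real data may span more than one month, where the day of the month alone would put 5 November and 5 December into the same group.

```python
import pandas as pd


def iter_days(frame):
    """Lazily yield (day, sub_frame) pairs, one calendar day at a time.

    iter_days is a hypothetical helper name, not part of pandas.
    """
    # normalize() floors every timestamp to local midnight and keeps the
    # time zone, so each calendar date gets its own group key.
    for day, sub_frame in frame.groupby(frame.index.normalize()):
        yield day, sub_frame


# Same date range as in the question, with a coarser frequency (assumption)
index = pd.date_range(start='2012-11-05', end='2012-11-10',
                      freq='30min').tz_localize('Europe/Berlin')
df = pd.DataFrame(range(len(index)), index=index, columns=['Number'])

# Consume the generator one day at a time
for day, sub_frame in iter_days(df):
    print(day.date(), len(sub_frame))

# Or collect the days under names like those listed in the question
frames_by_day = {'df_' + day.strftime('%Y-%m-%d'): sub_frame
                 for day, sub_frame in iter_days(df)}
```

The user does not need to know the range of days in advance, because the group keys come from the index itself.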

Gotta love Python!
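On the second problem raised in the question (the slice df[day_now:day_next] includes day_next): label-based slices in pandas include both endpoints, so one option is a boolean mask with an exclusive upper bound. A small sketch, reusing the variable names from the question and again assuming a 30-minute frequency to keep the frame small:

```python
import pandas as pd

time_zone = 'Europe/Berlin'
index = pd.date_range(start='2012-11-05', end='2012-11-10',
                      freq='30min').tz_localize(time_zone)
df = pd.DataFrame(range(len(index)), index=index, columns=['Number'])

day_now = pd.Timestamp('2012-11-05').tz_localize(time_zone)
day_next = pd.Timestamp('2012-11-06').tz_localize(time_zone)

# Label slicing keeps both endpoints, so the first row of day_next sneaks in
inclusive = df.loc[day_now:day_next]

# Boolean mask: lower bound inclusive, upper bound exclusive
one_day = df[(df.index >= day_now) & (df.index < day_next)]

print(len(inclusive), len(one_day))
```

With groupby this bookkeeping is not needed, since each row lands in exactly one group.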
