Splitting a dataframe into multiple 5-second dataframes in Python

Question

I have a relatively large dataset that I want to split into multiple dataframes in Python, based on a column containing datetime objects. The values in that column (the one I want to split the dataframe by) have the following format:

  2015-11-01 00:00:05

You may assume the dataframe looks like this.

How can I split the dataframe into 5-second intervals, as follows:

  1. 1st dataframe: 2015-11-01 00:00:00 - 2015-11-01 00:00:05
  2. 2nd dataframe: 2015-11-01 00:00:05 - 2015-11-01 00:00:10, and so on.

I also need to count the number of observations in each of the resulting dataframes. In other words, it would be nice to get another dataframe with two columns: the first identifying the split group (its values don't matter; they could simply be 1, 2, 3, ... indicating the order of the 5-second intervals), and the second showing the number of observations that fall into the respective interval.

Recommended answer

I think the best way to store multiple DataFrames is a dict, keyed by the start time of each 5-second interval:

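import pandas as pd

# Sample data: 100 rows, one per second ('s' is the alias recent pandas prefers over 'S')
rng = pd.date_range('2015-11-01 00:00:00', periods=100, freq='s')
df = pd.DataFrame({'Date': rng, 'a': range(100)})
print(df.head(10))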
                 Date  a
0 2015-11-01 00:00:00  0
1 2015-11-01 00:00:01  1
2 2015-11-01 00:00:02  2
3 2015-11-01 00:00:03  3
4 2015-11-01 00:00:04  4
5 2015-11-01 00:00:05  5
6 2015-11-01 00:00:06  6
7 2015-11-01 00:00:07  7
8 2015-11-01 00:00:08  8
9 2015-11-01 00:00:09  9

# Group rows into 5-second bins; key each sub-DataFrame by its bin's start time
dfs = {k.strftime('%Y-%m-%d %H:%M:%S'): v
       for k, v in df.groupby(pd.Grouper(key='Date', freq='5s'))}

print(dfs['2015-11-01 00:00:00'])
                 Date  a
0 2015-11-01 00:00:00  0
1 2015-11-01 00:00:01  1
2 2015-11-01 00:00:02  2
3 2015-11-01 00:00:03  3
4 2015-11-01 00:00:04  4

print(dfs['2015-11-01 00:00:05'])
                 Date  a
5 2015-11-01 00:00:05  5
6 2015-11-01 00:00:06  6
7 2015-11-01 00:00:07  7
8 2015-11-01 00:00:08  8
9 2015-11-01 00:00:09  9
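
The question also asks for a second dataframe with the number of observations per interval, which the dict above does not provide directly. A minimal sketch of one way to get it, reusing the same sample data and Grouper; the column names `interval` and `n_obs` are hypothetical, chosen only for illustration:

```python
import pandas as pd

# Same sample data as in the answer: 100 rows, one per second.
rng = pd.date_range('2015-11-01 00:00:00', periods=100, freq='s')
df = pd.DataFrame({'Date': rng, 'a': range(100)})

# Count the rows that fall into each 5-second bin.
# 'n_obs' is a hypothetical column name for the counts.
counts = (df.groupby(pd.Grouper(key='Date', freq='5s'))
            .size()
            .reset_index(name='n_obs'))

# Add a simple 1, 2, 3, ... order number for the intervals
# ('interval' is likewise a hypothetical name).
counts.insert(0, 'interval', range(1, len(counts) + 1))

print(counts.head())
```

Here the `Date` column of `counts` holds the start of each bin, so each interval includes its start time and excludes its end (a row stamped 00:00:05 lands in the second bin, as in the output above). With a time-based Grouper, intervals that contain no rows are typically still listed, with a count of 0.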
