Group pandas time-series data frame using specific time intervals

Views: 111 Posted: 2020/5/24 4:26:37 python csv pandas


Problem description

I have a large CSV file with timestamp data in ISO format, e.g. 2015-04-01 10:26:41. The data span multiple months, with entries ranging from 30 seconds to multiple hours apart. Its columns are id, time, speed.

Ultimately I want to group the data into 15-minute intervals and then calculate the average speed over however many entries fall in each 15-minute slot.

I am trying to use pandas because it seems to have solid time-series tools and this might be easy to do, but I am falling at the first hurdle.

So far I have imported the CSV as a dataframe, and all columns have a dtype of object. I have sorted the data by date and am now trying to group the entries by a time interval, which is where I'm struggling. Based on some Google searching, I tried to resample the data using df.resample('5min', how=sum). Here I get the error TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex. I was also thinking about trying the groupby method, perhaps using a lambda as in df.groupby(lambda x:x.minutes + 5), which produces the error AttributeError: 'str' object has no attribute 'minutes'.
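The TypeError suggests that resample wants a datetime-like index rather than strings. Here is a minimal sketch of one possible way past it, assuming the column names time and speed from the description and a small made-up frame standing in for the real CSV. Note that recent pandas versions no longer accept the how= argument, so the aggregation is chained as .mean() instead.

```python
import pandas as pd

# Hypothetical sample standing in for the real CSV; the values are made up.
# Everything is stored as strings (object dtype), as described in the question.
df = pd.DataFrame({
    "id": ["1", "2", "3"],
    "time": ["2015-01-15 05:01:00", "2015-01-15 05:14:32", "2015-01-15 05:44:57"],
    "speed": ["4.0", "6.0", "3.0"],
})

# Convert the strings into real datetime and numeric values.
df["time"] = pd.to_datetime(df["time"])
df["speed"] = pd.to_numeric(df["speed"])

# resample needs a DatetimeIndex, so the time column becomes the index.
avg_speed = df.set_index("time")["speed"].resample("15min").mean()
print(avg_speed)
```

In this sketch, a 15-minute slot with no entries shows up as NaN; calling .dropna() on the result would discard such slots if they are not wanted.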

Basically I'm a little confused as to a) whether pandas has the time-series data in a format it recognises, given that its dtype is object, and b) if it can recognise it, why I can't seem to get the time intervals down.

Keen to learn if anyone could point me in the right direction.

The DF looks like this:

        0        1                    2          3
0      id  boat_id                 time      speed
1  386226       32  2015-01-15 05:14:32  4.2343243
2  386285       32  2015-01-15 05:44:57    3.45234
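The column labels 0 to 3, with the names id, boat_id, time and speed sitting in row 0, suggest the header line was read in as data, which would also explain why every column is object. Here is a rough sketch of reading the file differently, assuming the CSV starts with that header line. The inline csv_text is a hypothetical stand-in for the real file, and pd.Grouper is shown only as one possible groupby-based alternative to resample.

```python
import io

import pandas as pd

# Hypothetical CSV text mirroring the two data rows shown above.
csv_text = """id,boat_id,time,speed
386226,32,2015-01-15 05:14:32,4.2343243
386285,32,2015-01-15 05:44:57,3.45234
"""

# header=0 uses the first line as column names; parse_dates converts "time".
df = pd.read_csv(io.StringIO(csv_text), header=0, parse_dates=["time"])
print(df.dtypes)

# Bin the rows on the "time" column in 15-minute steps and average the speed.
binned = df.groupby(pd.Grouper(key="time", freq="15min"))["speed"].mean()
print(binned)
```

With the real file, the io.StringIO(csv_text) argument would be replaced by the path to the CSV.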
