Calculating daily average from irregular time series using pandas

Views: 205 | Published: 2017/2/24 23:24:59 | Tags: python csv pandas timestamp

Problem description

I am trying to obtain daily averages from an irregular time series from a csv-file.

The data in the csv-file start at 13:00 on 20 September 2013 and run till 10:57 on 14 January 2014:

Time                    Values
20/09/2013 13:00        5.133540
20/09/2013 13:01        5.144993
20/09/2013 13:02        5.158208
20/09/2013 13:03        5.170542
20/09/2013 13:04        5.167899
20/09/2013 13:25        5.168780
20/09/2013 13:26        5.179351
...

I import them with:

import pandas as pd
data = pd.read_csv('<file name>', parse_dates={'Timestamp': ['Time']}, index_col='Timestamp')

This results in

                           Values
Timestamp                          
2013-09-20 13:00:00        5.133540
2013-09-20 13:01:00        5.144993
2013-09-20 13:02:00        5.158208
2013-09-20 13:03:00        5.170542
2013-09-20 13:04:00        5.167899
2013-09-20 13:25:00        5.168780
2013-09-20 13:26:00        5.179351
...

And then I do

dataDailyAv = data.resample('D', how = 'mean')

This results in

                  Values
Timestamp                 
2013-01-10        8.623744
2013-01-11             NaN
2013-01-12             NaN
2013-01-13             NaN
2013-01-14             NaN
...

In other words, the result contains dates that do not appear in the original data, and for some of these dates (e.g. 10 January 2013), there even appears a value.

Any ideas about what is going wrong?

Thanks.

Edit: apparently something goes wrong with the parsing of the date: 01/10/2013 is interpreted as 10 January 2013 instead of 1 October 2013. This can be solved by editing the date format in the csv-file, but is there a way to specify the date format in read_csv?
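The ambiguity described in the edit can be reproduced on a single string. This is only a minimal sketch: it assumes pandas' default month-first reading of an ambiguous scalar date, and the variable names are illustrative, not taken from the original script.

```python
import pandas as pd

# "01/10/2013" is ambiguous: by default pandas reads it month-first.
month_first = pd.to_datetime("01/10/2013")

# An explicit day-first format pins the intended reading (1 October 2013).
day_first = pd.to_datetime("01/10/2013", format="%d/%m/%Y")

print(month_first.date(), day_first.date())
```

A date such as 20/09/2013 cannot be read month-first (there is no month 20), which is presumably why the first rows of the file looked correct while later ones did not.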

Solution

You want dayfirst=True, one of the many tweaks listed in the read_csv docs.
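As a rough illustration of that suggestion, the sketch below reads a small in-memory CSV with `dayfirst=True` and then takes daily means. The sample rows and their values are made up, and the comma-separated layout is an assumption about the file; the column names `Time` and `Values` follow the question. It also uses `resample('D').mean()`, since the `how=` argument used in the question was deprecated and later removed from pandas.

```python
import io
import pandas as pd

# Hypothetical sample in the question's layout (values are invented).
csv_text = """Time,Values
20/09/2013 13:00,5.0
20/09/2013 13:01,7.0
01/10/2013 09:00,8.0
01/10/2013 09:30,10.0
"""

data = pd.read_csv(
    io.StringIO(csv_text),
    parse_dates=["Time"],   # parse the Time column as datetimes
    dayfirst=True,          # read 01/10/2013 as 1 October 2013
    index_col="Time",
)

# Daily means; days with no observations come out as NaN and are dropped.
daily_mean = data.resample("D").mean().dropna()
print(daily_mean)
```

Dropping the NaN rows is optional: `resample('D')` produces one row for every calendar day between the first and last timestamp, so gaps in an irregular series show up as NaN unless they are removed.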
