Monthly Averages Using Daily Data Using Python Pandas

Question

I'm a Python user but a rookie when it comes to pandas. I'm hoping to use it more as I'm starting to work with a lot of time series, and I've heard they're a whole lot easier to modify with pandas. I've read through some of the tutorials but they have yet to make sense. Hoping you can help me out with an example.

I have a text file with four columns: year, month, day and snow depth. This is daily data for a 30-year period, 1979-2009. I would like to calculate 360 (30 yrs x 12 months) individual monthly averages using pandas techniques (i.e. isolating all the values for Jan-1979, Feb-1979, ... Dec-2009 and averaging each). Can anyone help me out with some example code?

Thanks.

1979    1   1   3
1979    1   2   3
1979    1   3   3
1979    1   4   3
1979    1   5   3
1979    1   6   3
1979    1   7   4
1979    1   8   5
1979    1   9   7
1979    1   10  8
1979    1   11  16
1979    1   12  16
1979    1   13  16
1979    1   14  18
1979    1   15  18
1979    1   16  18
1979    1   17  18
1979    1   18  20
1979    1   19  20
1979    1   20  20
1979    1   21  20
1979    1   22  20
1979    1   23  18
1979    1   24  18
1979    1   25  18
1979    1   26  18
1979    1   27  18
1979    1   28  18
1979    1   29  18
1979    1   30  18
1979    1   31  19
1979    2   1   19
1979    2   2   19
1979    2   3   19
1979    2   4   19
1979    2   5   19
1979    2   6   22
1979    2   7   24
1979    2   8   27
1979    2   9   29
1979    2   10  32
1979    2   11  32
1979    2   12  32
1979    2   13  32
1979    2   14  33
1979    2   15  33
1979    2   16  33
1979    2   17  34
1979    2   18  36
1979    2   19  36
1979    2   20  36
1979    2   21  36
1979    2   22  36
1979    2   23  36
1979    2   24  31
1979    2   25  29
1979    2   26  27
1979    2   27  27
1979    2   28  27

Recommended Answer

You'll want to group your data by year and month, and then calculate the mean of each group. For example:

import pandas as pd

# Read the file into a pandas.DataFrame,
# using one or more whitespace characters as the separator
df = pd.read_csv("snow.txt", sep=r"\s+", names=["year", "month", "day", "snow_depth"])

# Show the first 5 rows of the DataFrame
print(df.head())

# Group data first by year, then by month
g = df.groupby(["year", "month"])

# For each group, calculate the average of only the snow_depth column
monthly_averages = g.aggregate({"snow_depth": "mean"})
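
To try the grouping without the full file, here is a minimal sketch that reads a handful of rows copied from the question out of an in-memory buffer. The `sample` buffer and the `monthly_table` name are only for illustration; the real data would come from the text file.

```python
import io
import pandas as pd

# A few rows copied from the sample data in the question
# (year, month, day, snow depth).
sample = io.StringIO(
    "1979    1   1   3\n"
    "1979    1   2   3\n"
    "1979    1   7   4\n"
    "1979    2   1   19\n"
    "1979    2   6   22\n"
)

df = pd.read_csv(sample, sep=r"\s+", names=["year", "month", "day", "snow_depth"])

# One mean per (year, month) pair; the result is a Series with a two-level index.
monthly_averages = df.groupby(["year", "month"])["snow_depth"].mean()
print(monthly_averages)

# reset_index() turns the (year, month) index back into ordinary columns.
monthly_table = monthly_averages.reset_index()
```

Selecting the `snow_depth` column before calling `mean()` gives a Series rather than a one-column DataFrame; either form holds the same numbers.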

For more about the split-apply-combine approach in pandas, see the "Group by: split-apply-combine" section of the pandas user guide.
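
As a rough illustration of what split-apply-combine means, the sketch below spells out the three steps by hand on a tiny made-up DataFrame and compares the result with the one-line `groupby` form. The values and the names `pieces`, `manual` and `builtin` are invented for the example.

```python
import pandas as pd

# Tiny made-up data set with the same column layout as the question.
df = pd.DataFrame({
    "year": [1979, 1979, 1979, 1979],
    "month": [1, 1, 2, 2],
    "day": [1, 2, 1, 2],
    "snow_depth": [3, 5, 19, 21],
})

# Split: iterate over the (key, sub-DataFrame) pairs that groupby produces.
# Apply: take the mean of snow_depth within each piece.
pieces = {}
for (year, month), group in df.groupby(["year", "month"]):
    pieces[(year, month)] = group["snow_depth"].mean()

# Combine: collect the per-group results into a single Series.
manual = pd.Series(pieces)

# The one-line groupby form performs the same three steps.
builtin = df.groupby(["year", "month"])["snow_depth"].mean()
```

In practice the one-line form is the one to use; the loop is only there to show what happens at each step.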

A DataFrame is a:

"Two-dimensional size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns)."

For your purposes, the difference between a numpy ndarray and a DataFrame is not too significant, but DataFrames have a bunch of functions that will make your life easier, so I'd suggest doing some reading on them.
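
One example of such a convenience for time series, offered as an alternative sketch and not as part of the answer above: if the year, month and day columns are combined into real dates, `resample` can produce the monthly means directly. The data values and the `date` and `by_month` names are made up for illustration.

```python
import pandas as pd

# Made-up rows with the same column layout as the question.
df = pd.DataFrame({
    "year": [1979, 1979, 1979, 1979],
    "month": [1, 1, 2, 2],
    "day": [1, 31, 1, 28],
    "snow_depth": [3, 19, 19, 27],
})

# pd.to_datetime can assemble dates from columns named year, month and day.
df["date"] = pd.to_datetime(df[["year", "month", "day"]])

# With a DatetimeIndex, resample groups rows by calendar month;
# "MS" labels each month by its first day.
by_month = df.set_index("date")["snow_depth"].resample("MS").mean()
print(by_month)
```

Unlike the `groupby` on year and month, `resample` also emits a row (with NaN) for any month inside the date range that has no data, which may or may not be what you want.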
