Python: Plot month-wise normalised histogram


Problem description

I have a CSV file with data that look like this:

Time               Pressure
1/1/2017 0:00       5.8253
...                     ...
3/1/2017 0:10       4.2785
4/1/2017 0:20       5.20041
5/1/2017 0:30       4.40774
6/1/2017 0:40       4.03228
7/1/2017 0:50       5.011924
12/1/2017 1:00      3.9309888

I want to make a month-wise, normalized histogram of the pressure data and finally write the plots to a PDF. I understand that I need to use groupby and numpy.histogram, but I'm not sure how to use them. (I'm new to Python.) Please help!

Code 1:

n = len(df) // 5
for tmp_df in (df[i:i+n] for i in range(0, len(df), n)):
    gb_tmp = tmp_df.groupby(pd.Grouper(freq='M'))
    ax = gb_tmp.hist()
    plt.setp(ax.xaxis.get_ticklabels(),rotation=90)
    plt.show()
    plt.close()

This gives me the following error message:

ValueError: range() arg 3 must not be zero

Code 2:

df1 = df.groupby(pd.Grouper(freq='M'))
np.histogram(df1,bins=10,range=None,normed=True)

This returns another error message:

ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I tried the code above but got these errors. I'm not sure if I'm using it right.

Recommended answer

A few simple steps. First, read your data file into a list of rows. Once you have that list, collect all the observations for each month and take the average of each collection. Here I have implemented a simple buckets class to aggregate the pressures into groups by month and provide the mean for each group. Lastly, I plot the result with matplotlib.

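def readData(fn):
    # Read the file, skip the header row, and split every non-empty row
    # into [timestamp, pressure].
    with open(fn) as fh:
        lines = fh.read().split("\n")
    # Splitting on the last run of whitespace tolerates a varying number
    # of spaces between the two columns.
    ret = [k.rsplit(None, 1) for k in lines[1:] if k.strip()]
    return ret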

class buckets:
    """Collect values under a key and report the mean of each group."""
    def __init__(self):
        self.data = {}
    def add(self, key, value):
        # Create the group on first use, then append the observation
        self.data.setdefault(key, []).append(value)
    def getMean(self, key):
        nums = self.data[key]
        return sum(nums) / float(len(nums))
    def keys(self):
        return self.data.keys()

import numpy as np
import matplotlib.pyplot as plt

data = readData("data.txt")
container = buckets()

# Group each pressure under the first field of its date, taken here to be the month
for k in data:
    container.add(k[0].split("/")[0], float(k[1]))

# Sort the month keys numerically, then convert back to strings for lookup
histoTicks = sorted(int(k) for k in container.keys())
histoTicks = [str(k) for k in histoTicks]
x = np.arange(len(histoTicks))

# One bar per month: the mean pressure of that month
histoBars = [container.getMean(k) for k in histoTicks]

fig, ax = plt.subplots()
ax.bar(x, histoBars)
ax.set_xticks(x)
ax.set_xticklabels(histoTicks)
plt.show()
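
The code above plots the mean pressure per month rather than a histogram. If a normalized histogram per month written to a PDF is what is needed, as the question asks, one possible sketch follows. It uses np.histogram with density=True for the normalization and matplotlib's PdfPages for a multi-page PDF. The function names, the sample values, and the output file name are made up for illustration, and sharing the bin edges across months is a design choice, not something the question requires.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend: figures are only written to file
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.backends.backend_pdf import PdfPages


def monthly_densities(rows, bins=10):
    """Group (month, pressure) pairs by month and return, per month,
    a density-normalized histogram over bin edges shared by all months."""
    groups = {}
    for month, pressure in rows:
        groups.setdefault(month, []).append(pressure)
    all_values = np.concatenate(
        [np.asarray(v, dtype=float) for v in groups.values()]
    )
    edges = np.histogram_bin_edges(all_values, bins=bins)
    # density=True scales the heights so that sum(height * bin_width) == 1
    densities = {
        m: np.histogram(v, bins=edges, density=True)[0]
        for m, v in groups.items()
    }
    return densities, edges


def save_monthly_histograms(densities, edges, pdf_path):
    """Write one histogram page per month to a multi-page PDF."""
    widths = np.diff(edges)
    with PdfPages(pdf_path) as pdf:
        for month in sorted(densities):
            fig, ax = plt.subplots()
            ax.bar(edges[:-1], densities[month], width=widths, align="edge")
            ax.set_title("Month {}".format(month))
            ax.set_xlabel("Pressure")
            ax.set_ylabel("Density")
            pdf.savefig(fig)
            plt.close(fig)


# Illustrative values only, not the asker's data
sample = [(1, 5.8), (1, 4.3), (1, 5.2), (2, 4.4), (2, 4.0), (2, 5.0), (2, 3.9)]
densities, edges = monthly_densities(sample, bins=5)

if __name__ == "__main__":
    save_monthly_histograms(densities, edges, "monthly_histograms.pdf")
```

Using the same bin edges for every month keeps the pages directly comparable; passing bins=10 to np.histogram separately for each month would instead fit the bins to each month's own range.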

A last quick note: I'm not really sure what format your data file is in. It looked like the two columns were separated by 7 spaces, but one of the samples had only 6, so you might have to change the delimiter or clean the table to make sure all the rows are read without error.
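
One way to sidestep the uneven spacing is to split each line on its last run of whitespace instead of on a fixed number of spaces, and to let datetime parse the timestamp. This is only a sketch: parse_row is a hypothetical helper, and the format string assumes the date is written month/day/year, which matches how the code above reads the month but should be checked against the real file.

```python
from datetime import datetime


def parse_row(line, fmt="%m/%d/%Y %H:%M"):
    """Split one data line into (timestamp, pressure).

    rsplit(None, 1) splits on the last run of whitespace, so 6, 7, or any
    other number of spaces between the two columns is handled the same way.
    fmt is an assumption about the date layout; adjust it to the real file.
    """
    stamp, pressure = line.rsplit(None, 1)
    return datetime.strptime(stamp.strip(), fmt), float(pressure)


# Two lines copied from the sample in the question (7 and 6 spaces)
lines = [
    "1/1/2017 0:00       5.8253",
    "12/1/2017 1:00      3.9309888",
]
rows = [parse_row(line) for line in lines if line.strip()]
months = [ts.month for ts, _ in rows]
```

With a real timestamp in hand, the month comes from ts.month rather than from string slicing, so switching to a day-first layout only means changing fmt.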
