Numpy Histogram Representing Floats with Approximate Values as The Same


Question

I have code that maps a probability p in the range [0, 1) to a value between -10 and 10, then appends that value to a list a number of times proportional to p. For example, -10 corresponds to p = 0 and so is added 0 times, while 10 corresponds to p = 1 and would be added 100 times (the normalization).

Here is the code:

#!/usr/bin/env python

import math
import numpy as np
import matplotlib.pyplot as plt

pos = []
ceilingValue = 0.82
numPoints = int(100 * ceilingValue)  # np.linspace needs an integer count
pValues = np.linspace(0.00, ceilingValue, num=numPoints)

for i in range(numPoints):
    p = pValues[i]
    # map p to a value between roughly -10 and 10
    y = -11.63 * math.log(-2.36279 * (p - 1))
    # append y i times, so larger p gives proportionally more entries
    for j in range(i):
        pos.append(y)

avg = np.average(pos)
std = np.std(pos)

# 100 equal-width bins spanning min(pos) to max(pos)
hist, bins = np.histogram(pos, bins=100)
width = 0.7 * (bins[1] - bins[0])
center = (bins[:-1] + bins[1:]) / 2
plt.bar(center, hist, align='center', width=width)
plt.show()

The problem is that the histogram is mostly accurate, but certain values break the trend. For example, the bar at -5.88, which should hold about 30 entries, shows about 70. I think Python sees two nearby values and simply lumps them together, but I'm not sure how to fix it. If it's only the histogram that is doing something wrong, then it doesn't matter, since I don't really need it. I just need the list, provided it is right in the first place.
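One way to check the list on its own, without any histogram, is to tally each distinct value directly. The following is a minimal sketch, assuming the list is built exactly as in the code above; `build_pos` and `list_is_consistent` are hypothetical names introduced only for this illustration.

```python
import math
import numpy as np

def build_pos(ceiling_value=0.82):
    """Rebuild the list from the question (hypothetical helper name)."""
    n = int(100 * ceiling_value)
    p_values = np.linspace(0.0, ceiling_value, num=n)
    pos = []
    for i in range(n):
        y = -11.63 * math.log(-2.36279 * (p_values[i] - 1))
        pos.extend([y] * i)  # the i-th value is appended i times
    return pos

pos = build_pos()

# Tally every distinct float in the list; no binning is involved here.
values, counts = np.unique(pos, return_counts=True)

# If the list is right, the k-th smallest value present appears k times
# (the value for i = 0 is appended zero times, so it is absent).
list_is_consistent = np.array_equal(counts, np.arange(1, len(values) + 1))
print(len(values), "unique values; counts rise by one per step:", list_is_consistent)
```

If the counts rise by exactly one from each value to the next, the list itself follows the intended trend and any irregularity in the plot comes from the binning.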

Recommended answer

I think the underlying issue is that your bin width is uniform, whereas the gaps between the unique values in pos grow exponentially. Because of that, you will always end up either with odd 'spikes' where two nearby unique values fall within the same bin, or with lots of empty bins (especially if you just increase the bin count to get rid of the spikes).
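The mismatch between a fixed bin width and growing gaps can be made visible by counting how many distinct values land in each uniform bin. This is a rough sketch, assuming the same formula and the same 82 points as in the question; the variable names (`uniq`, `per_bin`, `gaps`) are illustrative, and the edges are built with `np.linspace` to approximate what `np.histogram(pos, bins=100)` uses.

```python
import numpy as np

# Distinct values that actually occur in pos (i = 0 contributes no entries).
n = int(100 * 0.82)
p = np.linspace(0.0, 0.82, num=n)[1:]
uniq = -11.63 * np.log(-2.36279 * (p - 1))

# 100 equal-width bins from the smallest to the largest value.
edges = np.linspace(uniq.min(), uniq.max(), 101)
per_bin, _ = np.histogram(uniq, bins=edges)

gaps = np.diff(uniq)            # distance between neighbouring distinct values
bin_width = edges[1] - edges[0]

print("bin width:", bin_width)
print("smallest and largest gap:", gaps.min(), gaps.max())
print("bins holding 0, 1, 2+ distinct values:",
      np.sum(per_bin == 0), np.sum(per_bin == 1), np.sum(per_bin >= 2))
```

Where the gap is smaller than the bin width, two neighbouring values can share a bin and their counts add up into a spike; where the gap is larger, some bins stay empty.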

You could try setting your bin edges from the actual unique values in pos, so that the bin widths are non-uniform:

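# the "+ [10]" forces the rightmost bin edge to be 10
uvals = np.unique(pos + [10])
hist, bins = np.histogram(pos, bins=uvals)
# align='edge' anchors each bar at its left bin edge
# (matplotlib 2.0 and later center bars by default)
plt.bar(bins[:-1], hist, width=np.diff(bins), align='edge')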
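To confirm that bins placed at the unique values hold exactly one distinct value each, the histogram counts can be compared with a direct tally. This is a small sketch under the assumption that pos is built as in the question; it leaves out the plotting and only checks the numbers.

```python
import math
import numpy as np

# Rebuild pos as in the question.
n = int(100 * 0.82)
p_values = np.linspace(0.0, 0.82, num=n)
pos = []
for i in range(n):
    y = -11.63 * math.log(-2.36279 * (p_values[i] - 1))
    pos.extend([y] * i)

# Bin edges at the unique values, plus 10 as the rightmost edge.
uvals = np.unique(pos + [10])
hist, bins = np.histogram(pos, bins=uvals)

# Direct tally of each distinct value for comparison.
values, counts = np.unique(pos, return_counts=True)

print("histogram matches direct tally:", np.array_equal(hist, counts))
```

Each bin starts at one unique value and ends just before the next, so no two values can be lumped together and no bin is left empty.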
