Why does "numpy.mean" return 'inf'?


Question

I need to calculate the mean of the columns of an array with more than 1000 rows.

np.mean(some_array) gives me inf as output,

but I am pretty sure the values are OK. I am loading a csv from here into my Data variable, and the column 'Cement' looks "healthy" from my point of view.

In [254]: np.mean(Data[:230]['Cement'])
Out[254]: 275.75

But if I increase the number of rows, the problem starts:

In [259]: np.mean(Data[:237]['Cement'])
Out[259]: inf

But when I look at the data, nothing seems wrong:

In [261]:Data[230:237]['Cement']
Out[261]:
 array([[ 425. ],
        [ 333.  ],
        [ 250.25],
        [ 491.  ],
        [ 160.  ],
        [ 229.75],
        [ 338.  ]], dtype=float16)

I cannot find a reason for this behaviour. P.S. This happens in Python 3.x using Wakari (cloud-based IPython).

NumPy version: '1.8.1'

I am loading the data with:

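import numpy as np

No_Col = 9
# Convert decimal commas to decimal points before parsing each field
conv = lambda valstr: float(valstr.replace(',', '.'))

# Use the same converter for every column
c = {}
for i in range(0, No_Col, 1):
    c[i] = conv

Data = np.genfromtxt(get_data, dtype=np.float16, delimiter='\t', skip_header=0, names=True, converters=c)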

Recommended answer

My guess is that the problem is precision (as others have also commented). Quoting directly from the documentation for mean():

Notes

The arithmetic mean is the sum of the elements along the axis divided by the number of elements.

Note that for floating-point input, the mean is computed using the same precision the input has. Depending on the input data, this can cause the results to be inaccurate, especially for float32 (see example below). Specifying a higher-precision accumulator using the dtype keyword can alleviate this issue.

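Since your array is of type float16, you have very limited precision and range: the largest finite float16 value is 65504. The first 230 values average 275.75, so their sum is already about 63,400, and the seven rows you list add another 2,227, which pushes the running sum past that limit and turns it into inf. Using dtype=np.float64 will probably alleviate the overflow. Also see the examples in the mean() documentation.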
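As a minimal sketch of this overflow, assuming made-up data in place of the original CSV (the array `values` below is hypothetical): a float16 sum goes to inf once the total exceeds the float16 range, while a float64 accumulator stays finite. Newer NumPy releases may compute float16 means with higher-precision intermediates, so the sketch uses np.sum to show the float16 accumulator itself overflowing.

```python
import numpy as np

# Hypothetical stand-in for the 'Cement' column: 237 values of 280.0 in float16
values = np.full(237, 280.0, dtype=np.float16)

# Largest finite number that float16 can represent
float16_max = float(np.finfo(np.float16).max)

# The true sum (237 * 280 = 66360) is above that limit, so a float16 sum overflows
with np.errstate(over="ignore"):
    sum16 = np.sum(values)

# Requesting a float64 accumulator keeps the sum, and hence the mean, finite
mean64 = np.mean(values, dtype=np.float64)

# Converting the data itself to float64 is an alternative
mean_converted = values.astype(np.float64).mean()

print(float16_max, sum16, mean64, mean_converted)
```

Loading the file with a wider dtype in the first place (for example float64 instead of float16 in the genfromtxt call) would avoid the issue for every later computation, at the cost of more memory.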
