Storing large numbers in a numpy array


Question

I have a dataset to which I'm trying to apply some arithmetic. The problem is that it produces relatively large numbers, and when I compute them with numpy, they are stored as 0.

The weird thing is that when I compute the numbers separately, they have an int value; they only become zeros when I compute them using numpy.

import numpy as np
x = np.array([18,30,31,31,15])
10*150**x[0]/x[0]
Out[1]:36298069767006890

vector = 10*150**x/x
vector
Out[2]: array([0, 0, 0, 0, 0])

I have of course checked their types:

type(10*150**x[0]/x[0]) == type(vector[0])
Out[3]:True

How can I compute these large numbers using numpy without them turning into zeros?

Note that if we remove the factor 10 at the beginning, the problem changes slightly (but I think the cause might be similar):

x = np.array([18,30,31,31,15])
150**x[0]/x[0]
Out[4]:311075541538526549

vector = 150**x/x
vector
Out[5]: array([-329406144173384851, -230584300921369396, 224960293581823801,
   -224960293581823801, -368934881474191033])

The negative numbers indicate that the largest value of the int64 type has been exceeded, don't they?

Recommended answer

As Nils Werner already mentioned, numpy's native fixed-width C types (such as int64) cannot hold numbers that large, but Python itself can, since its int objects use an arbitrary-length implementation. So what you can do is tell numpy not to convert the numbers to C types but to use the Python objects instead. This will be slower, but it will work.

In [14]: x = np.array([18,30,31,31,15], dtype=object)

In [15]: 150**x
Out[15]: 
array([1477891880035400390625000000000000000000L,
       191751059232884086668491363525390625000000000000000000000000000000L,
       28762658884932613000273704528808593750000000000000000000000000000000L,
       28762658884932613000273704528808593750000000000000000000000000000000L,
       437893890380859375000000000000000L], dtype=object)

In this case the numpy array will not store the numbers themselves but references to the corresponding int objects. When you perform arithmetic operations, they won't be performed on the numpy array but on the objects behind the references.
I think you're still able to use most of the numpy functions with this workaround, but they will definitely be a lot slower than usual.
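
As a rough sketch of how this workaround could be applied to the expression from the question, the snippet below assumes Python 3, where `/` is true division, so `//` is used to keep the results as exact integers. The names `int64_max`, `vector` and `exact_first` are only illustrative.

```python
import numpy as np

# Largest value a 64-bit signed integer can hold (2**63 - 1).
int64_max = np.iinfo(np.int64).max

# dtype=object makes the array hold references to Python ints,
# which have arbitrary precision.
x = np.array([18, 30, 31, 31, 15], dtype=object)

# Floor division keeps the results as Python ints
# (assumption: Python 3, where / would return floats).
vector = 10 * 150**x // x

# The same computation for the first element with plain Python ints.
exact_first = 10 * 150**18 // 18

print(vector[0] == exact_first)
print(all(v > int64_max for v in vector))
```

Every element of `vector` is far beyond what int64 can represent, which is why the native integer dtype overflows while the object array does not.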

But that's what you get when you're dealing with numbers that large :D
Maybe there is a library out there that can deal with this issue a little better.

Just for completeness, if precision is not an issue, you can also use floats:

In [19]: x = np.array([18,30,31,31,15], dtype=np.float64)

In [20]: 150**x
Out[20]: 
array([  1.47789188e+39,   1.91751059e+65,   2.87626589e+67,
         2.87626589e+67,   4.37893890e+32])
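
To give a feel for what "precision is not an issue" means here, the following sketch compares the float64 results with the exact Python integers. It relies on float64 having a 53-bit significand, so integers this large cannot be represented exactly; the names `exponents`, `approx` and `exact` are only illustrative.

```python
import numpy as np

exponents = [18, 30, 31, 31, 15]

# float64 results: fast, but only about 15 to 17 significant decimal digits.
approx = 150 ** np.array(exponents, dtype=np.float64)

# Exact results using Python's arbitrary-precision ints.
exact = [150**n for n in exponents]

for a, e in zip(approx, exact):
    # The relative error is tiny, but the float is not the exact integer.
    rel_err = abs(int(a) - e) / e
    print(int(a) == e, rel_err)
```

The relative error stays tiny, which is fine for approximate work, but the low-order digits are lost, so floats are not suitable when the exact integer is needed.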
