Storing large numbers in a numpy array

Problem description

I have a dataset to which I'm trying to apply some arithmetic. The computation produces relatively large numbers, and when I do it with numpy, they are stored as 0.

The weird thing is that when I compute the numbers separately, they have an int value; they only become zeros when I compute them using numpy.

import numpy as np
x = np.array([18,30,31,31,15])
10*150**x[0]/x[0]
Out[1]:36298069767006890

vector = 10*150**x/x
vector
Out[2]: array([0, 0, 0, 0, 0])

I have of course checked their types:

type(10*150**x[0]/x[0]) == type(vector[0])
Out[3]:True

How can I compute these large numbers using numpy without seeing them turn into zeros?

Note that if we remove the factor 10 at the beginning, the problem changes slightly (but I think the cause might be similar):

x = np.array([18,30,31,31,15])
150**x[0]/x[0]
Out[4]:311075541538526549

vector = 150**x/x
vector
Out[5]: array([-329406144173384851, -230584300921369396, 224960293581823801,
   -224960293581823801, -368934881474191033])

The negative numbers indicate that the largest value of the int64 type has been exceeded, don't they?

Recommended answer

As Nils Werner already mentioned, numpy's native C types cannot hold numbers that large, but Python itself can, since its int objects use an arbitrary-length implementation. So what you can do is tell numpy not to convert the numbers to C types but to use the Python objects instead. This will be slower, but it will work.

In [14]: x = np.array([18,30,31,31,15], dtype=object)

In [15]: 150**x
Out[15]: 
array([1477891880035400390625000000000000000000L,
       191751059232884086668491363525390625000000000000000000000000000000L,
       28762658884932613000273704528808593750000000000000000000000000000000L,
       28762658884932613000273704528808593750000000000000000000000000000000L,
       437893890380859375000000000000000L], dtype=object)

In this case the numpy array does not store the numbers themselves but references to the corresponding int objects. When you perform arithmetic operations, they are not performed on the numpy array but on the objects behind the references.
I think you can still use most numpy functions with this workaround, but they will definitely be a lot slower than usual.
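
As a rough illustration of the point above (a sketch added for clarity, not part of the original answer), the snippet below assumes Python 3. It uses `np.iinfo` to check that the results lie outside the int64 range, then repeats the computation with `dtype=object`. The variable names are made up for this example. Under Python 3, `/` on Python ints returns a float, so floor division `//` is used here to keep the results as exact integers.

```python
import numpy as np

x = [18, 30, 31, 31, 15]

# Largest value a 64-bit signed integer can hold (2**63 - 1).
int64_max = np.iinfo(np.int64).max

# Exact results computed with plain Python ints, which never overflow.
exact = [150 ** n for n in x]

# Check whether each exact result would fit into an int64.
fits_in_int64 = [value <= int64_max for value in exact]

# With dtype=object the array holds references to Python int objects,
# so element-wise arithmetic is delegated to Python's own int operations.
obj = np.array(x, dtype=object)
powers = 150 ** obj

# Floor division keeps every element a Python int (no float rounding).
vector = 10 * 150 ** obj // obj

print(fits_in_int64)   # none of the results fit into int64
print(powers.dtype)    # object
print(vector[0])
```

The trade-off is the one described above: each element-wise operation goes through Python-level int arithmetic instead of a compiled loop over machine integers.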

But that's what you get when you're dealing with numbers that large :D
Maybe somewhere out there is a library that can deal with this issue a little better.

Just for completeness, if precision is not an issue, you can also use floats:

In [19]: x = np.array([18,30,31,31,15], dtype=np.float64)

In [20]: 150**x
Out[20]: 
array([  1.47789188e+39,   1.91751059e+65,   2.87626589e+67,
         2.87626589e+67,   4.37893890e+32])
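
To make "if precision is not an issue" a bit more concrete, here is a small sketch (an addition, with illustrative variable names) that compares the float64 results with the exact Python int values. A float64 carries 53 significant bits, so values of this size are only approximations, although the relative error stays tiny.

```python
import numpy as np

x = [18, 30, 31, 31, 15]

# Exact reference values using Python's arbitrary-precision ints.
exact = [150 ** n for n in x]

# Approximate values using 64-bit floats.
approx = 150 ** np.array(x, dtype=np.float64)

# Number of explicitly stored mantissa bits in a float64 (52, plus one implicit bit).
mantissa_bits = np.finfo(np.float64).nmant

# Relative error of each float result compared with the exact integer.
rel_errors = [abs(int(a) - e) / e for a, e in zip(approx, exact)]

print(mantissa_bits)
print(int(approx[0]) == exact[0])  # the float is not exactly equal to the true value
print(max(rel_errors))
```

Whether that error matters depends on the use case: it is harmless for magnitudes and plots, but not for anything that needs the exact integer digits.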
