How to convert a NumPy 2D array with object dtype to a regular 2D array of floats


Problem description

As part of a broader program I am working on, I ended up with object arrays that mix strings, 3D coordinates, and so on. I know object arrays are not well liked compared with structured arrays, but I am hoping to get around this without changing a lot of code.

Let's assume every row of my array obj_array (with N rows) has the format

Single entry/object of obj_array:  ['NAME',[10.0,20.0,30.0],....] 

Now, I am trying to load this object array and slice out the 3D coordinate chunk. Up to this point everything works fine by simply asking for, say:

obj_array[:,[1,2,3]]

However, the result is also an object array, which is a problem because I want to form a 2D array of floats with:

shape [N,3]: N rows and 3 entries for the X, Y, Z coordinates

For now, I am looping over the rows and assigning each one to a row of a destination 2D float array to get around the problem. I am wondering whether there is a better way using NumPy's array conversion tools. I tried a few things and could not get it to work.

Centers = np.zeros([N, 3])
for row in range(obj_array.shape[0]):
    Centers[row, :] = obj_array[row, 1]

Thanks

Solution

Nasty little problem... I have been fooling around with this toy example:

>>> arr = np.array([['one', [1, 2, 3]],['two', [4, 5, 6]]], dtype=object)
>>> arr
array([['one', [1, 2, 3]],
       ['two', [4, 5, 6]]], dtype=object)

My first guess was:

>>> np.array(arr[:, 1])
array([[1, 2, 3], [4, 5, 6]], dtype=object)

But that keeps the object dtype, so perhaps then:

>>> np.array(arr[:, 1], dtype=float)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: setting an array element with a sequence.
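
The error follows from the shape of the slice: `arr[:, 1]` is a 1-D object array whose elements are Python lists, so NumPy tries to fit a whole list into a single float slot. A minimal sketch to inspect this, assuming the toy array is built cell by cell (a choice made here only so that each cell is guaranteed to hold exactly one Python object):

```python
import numpy as np

# Build the toy array explicitly: one Python object per cell.
arr = np.empty((2, 2), dtype=object)
arr[0, 0], arr[0, 1] = 'one', [1, 2, 3]
arr[1, 0], arr[1, 1] = 'two', [4, 5, 6]

col = arr[:, 1]
print(col.shape)     # a single dimension: one list per row
print(type(col[0]))  # the elements are plain Python lists

# Asking for floats directly fails, because each element is a sequence.
try:
    np.array(col, dtype=float)
except (ValueError, TypeError) as exc:
    print("conversion failed:", exc)
```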

You can normally work around this by doing the following:

>>> np.array(arr[:, 1], dtype=[('', float)]*3).view(float).reshape(-1, 3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: expected a readable buffer object

Not here though, which was kind of puzzling. Apparently it is the fact that the objects in your array are lists that throws this off, as replacing the lists with tuples works:

>>> np.array([tuple(j) for j in arr[:, 1]],
...          dtype=[('', float)]*3).view(float).reshape(-1, 3)
array([[ 1.,  2.,  3.],
       [ 4.,  5.,  6.]])

Since there doesn't seem to be any entirely satisfactory solution, the easiest is probably to go with:

>>> np.array(list(arr[:, 1]), dtype=float)
array([[ 1.,  2.,  3.],
       [ 4.,  5.,  6.]])
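
The list-based conversion above could be wrapped in a small helper. This is only a sketch: the name `object_column_to_float` and its `column` parameter are hypothetical, and it assumes every row stores a sequence of the same length in that column.

```python
import numpy as np

def object_column_to_float(obj_array, column=1):
    """Return the sequences stored in one object column as a 2D float array.

    Hypothetical helper; assumes all rows hold equal-length numeric sequences.
    """
    # tolist() hands NumPy a plain nested list, which it converts directly.
    nested = obj_array[:, column].tolist()
    return np.array(nested, dtype=float)

# Toy data in the same layout as the example above.
arr = np.empty((2, 2), dtype=object)
arr[0, 0], arr[0, 1] = 'one', [1, 2, 3]
arr[1, 0], arr[1, 1] = 'two', [4, 5, 6]

centers = object_column_to_float(arr)
print(centers.shape, centers.dtype)
```

This replaces the explicit row loop from the question with a single call, at the cost of building one intermediate nested list.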

That will not be very efficient, though, so it is probably better to go with something like:

>>> np.fromiter((tuple(j) for j in arr[:, 1]), dtype=[('', float)]*3,
...             count=len(arr)).view(float).reshape(-1, 3)
array([[ 1.,  2.,  3.],
       [ 4.,  5.,  6.]])
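
A variation on the same idea, not the approach shown above, is to flatten the coordinates with `itertools.chain.from_iterable` and let `np.fromiter` fill a plain float array, which avoids the structured dtype and the `view` step. The sketch assumes every row holds exactly three coordinates; `n_rows` and `n_coords` are names introduced here for illustration.

```python
import itertools
import numpy as np

# Toy data in the same layout as the example above.
arr = np.empty((2, 2), dtype=object)
arr[0, 0], arr[0, 1] = 'one', [1, 2, 3]
arr[1, 0], arr[1, 1] = 'two', [4, 5, 6]

n_rows = arr.shape[0]
n_coords = 3  # assumed: three coordinates (X, Y, Z) per row

# Stream every coordinate into one flat float array, then restore the rows.
flat = np.fromiter(itertools.chain.from_iterable(arr[:, 1]),
                   dtype=float, count=n_rows * n_coords)
centers = flat.reshape(n_rows, n_coords)
print(centers)
```

Passing `count` lets `np.fromiter` allocate the output once instead of growing it, which is the main reason to prefer it over building a list first.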
