When turning a list of lists of tuples into an array, how can I stop the tuples from creating a 3rd dimension?

Question

I have a list of lists (each sublist of the same length) of tuples (each tuple of the same length, 2). Each sublist represents a sentence, and the tuples are the bigrams of that sentence.

When I use np.asarray to turn this into an array, NumPy seems to interpret the tuples as a request for a 3rd dimension.

Full working code here:

import numpy as np 
from nltk import bigrams  

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

bi_grams = []
for sent in arr:
    bi_grams.append(list(bigrams(sent)))
bi_grams = np.asarray(bi_grams)
print(bi_grams)

So before bi_grams is turned into an array, it looks like this: [[(1, 2), (2, 3)], [(4, 5), (5, 6)], [(7, 8), (8, 9)]]

Output of the above code:

array([[[1, 2],
        [2, 3]],

       [[4, 5],
        [5, 6]],

       [[7, 8],
        [8, 9]]])

Converting a list of lists to an array this way normally works fine and creates a 2D array. Here, though, NumPy seems to interpret the tuples as an added dimension, so the output has shape (3, 2, 2), when I wanted, and was expecting, shape (3, 2).

My desired output is:

array([[(1, 2), (2, 3)],
       [(4, 5), (5, 6)],
       [(7, 8), (8, 9)]])

which is of shape (3, 2).

Why does this happen? How can I get the array in the form and shape that I want?

Recommended answer

To np.array, your list of lists of tuples isn't any different from a list of lists of lists. It's iterables all the way down. np.array tries to create an array with as many dimensions as possible. In this case that is 3d.
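As a quick check of that point, here is a minimal sketch that builds the same data once with tuples and once with lists at the innermost level. The variable names are only illustrative and are not from the question.

```python
import numpy as np

# The same nested data, with tuples vs. lists at the innermost level.
tuples = [[(1, 2), (2, 3)], [(4, 5), (5, 6)], [(7, 8), (8, 9)]]
lists = [[[1, 2], [2, 3]], [[4, 5], [5, 6]], [[7, 8], [8, 9]]]

from_tuples = np.array(tuples)
from_lists = np.array(lists)

# Both inputs are treated as three levels of nesting.
print(from_tuples.shape)
print(np.array_equal(from_tuples, from_lists))
```

Both arrays come out with shape (3, 2, 2) and equal contents, so the tuple type alone does not stop the third dimension from being created.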

There are ways of sidestepping that and making a 2d array that contains objects, where those objects are things like tuples. But as noted in the comments, why would you want that?

In a recent SO question, I came up with this way of turning an n-d array into an object array of (n-m)-d shape:

In [267]: res = np.empty((3,2),object)
In [268]: arr = np.array(alist)
In [269]: for ij in np.ndindex(res.shape):
     ...:     res[ij] = arr[ij]
     ...:     
In [270]: res
Out[270]: 
array([[array([1, 2]), array([2, 3])],
       [array([4, 5]), array([5, 6])],
       [array([7, 8]), array([8, 9])]], dtype=object)

But that's a 2d array of arrays, not of tuples.

In [271]: for ij in np.ndindex(res.shape):
     ...:     res[ij] = tuple(arr[ij].tolist())
     ...:     
     ...:     
In [272]: res
Out[272]: 
array([[(1, 2), (2, 3)],
       [(4, 5), (5, 6)],
       [(7, 8), (8, 9)]], dtype=object)

That's better (or is it?)

Or I could index the nested list directly:

In [274]: for i,j in np.ndindex(res.shape):
     ...:     res[i,j] = alist[i][j]
     ...:     
In [275]: res
Out[275]: 
array([[(1, 2), (2, 3)],
       [(4, 5), (5, 6)],
       [(7, 8), (8, 9)]], dtype=object)

I'm using ndindex to generate all the indices of a (3,2) array.
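The ndindex loop can be wrapped in a small helper. This is only a sketch: to_object_array is a hypothetical name, not a NumPy function, and it assumes the caller passes the outer shape explicitly so that NumPy never has to guess where the nesting stops.

```python
import numpy as np

def to_object_array(nested, shape):
    """Fill an object array of the given shape from a nested list.

    Hypothetical helper: whatever sits below the first len(shape)
    levels of nesting is stored as a single Python object.
    """
    res = np.empty(shape, dtype=object)
    for idx in np.ndindex(*shape):
        item = nested
        for k in idx:          # walk down one nesting level per axis
            item = item[k]
        res[idx] = item        # store the object itself, e.g. a tuple
    return res

alist = [[(1, 2), (2, 3)], [(4, 5), (5, 6)], [(7, 8), (8, 9)]]
res = to_object_array(alist, (3, 2))
print(res.shape, res.dtype)
print(res[0, 1])
```

Preallocating with np.empty and assigning one element at a time is what keeps the tuples intact; passing the nested list straight to np.array would unpack them again.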

The structured array mentioned in the comments works because for a compound dtype, tuples are distinct from lists.

In [277]: np.array(alist, 'i,i')
Out[277]: 
array([[(1, 2), (2, 3)],
       [(4, 5), (5, 6)],
       [(7, 8), (8, 9)]], dtype=[('f0', '<i4'), ('f1', '<i4')])

Technically, though, that isn't an array of tuples. It just represents the elements (or records) of the array as tuples.
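A short sketch of what that means in practice, assuming the same alist as in the session above and spelling the dtype as 'i4,i4' so the field size is explicit:

```python
import numpy as np

alist = [[(1, 2), (2, 3)], [(4, 5), (5, 6)], [(7, 8), (8, 9)]]
s = np.array(alist, dtype='i4,i4')

print(s.shape)             # the outer 2d shape is kept
print(s.dtype.names)       # default field names
print(s['f0'])             # first member of every bigram, as a 2d int array
print(type(s[0, 0]))       # a NumPy record type, not a Python tuple
print(s[0, 0].item())      # converted to a real tuple on request
```

Each element is a record with two named fields. Indexing by field name returns an ordinary numeric array, and .item() is one way to get an actual Python tuple back.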

In the object dtype array, the elements are pointers to the tuples in the list (at least in the Out[275] case). In the structured array case, the numbers are stored the same way as in a 3d array: as bytes in the array's data buffer.
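Both claims can be checked directly. The following is a sketch under the assumption of a 4-byte integer field ('i4'), with illustrative variable names:

```python
import numpy as np

alist = [[(1, 2), (2, 3)], [(4, 5), (5, 6)], [(7, 8), (8, 9)]]

# Object array: each cell holds a reference to the original tuple.
obj = np.empty((3, 2), dtype=object)
for i, j in np.ndindex(obj.shape):
    obj[i, j] = alist[i][j]
same_object = obj[0, 0] is alist[0][0]

# Structured array vs. plain 3d array: compare the raw data bytes.
structured = np.array(alist, dtype='i4,i4')   # shape (3, 2), 8-byte records
cube = np.array(alist, dtype='i4')            # shape (3, 2, 2), 4-byte ints
same_bytes = structured.tobytes() == cube.tobytes()

print(same_object, same_bytes)
```

The identity check shows that the object array stores the very tuples that live in the list, while the byte comparison shows that the structured array and the 3d array lay out the same numbers identically and differ only in how the dtype groups them.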
