How do you create a multidimensional numpy array from an iterable of tuples?
Question
I would like to create a numpy array from an iterable that yields tuples of values, such as the result of a database query.
Like so:
data = db.execute('SELECT col1, col2, col3, col4 FROM data')
A = np.array(list(data))
Is there a faster way of doing so, without converting the iterable to a list first?
Recommended answer
I am not an experienced user of numpy, but here is a possible solution for the general question:
>>> import numpy as np
>>> i = iter([(1, 11), (2, 22)])  # a sample iterable of tuples
>>> i
<listiterator at 0x5b2de30>
>>> rec_array = np.fromiter(i, dtype='i4,i4')  # mind the dtype
>>> rec_array  # rec_array is a record (structured) array
array([(1, 11), (2, 22)],
      dtype=[('f0', '<i4'), ('f1', '<i4')])
>>> rec_array['f0'], rec_array[0]  # each field has a default name
(array([1, 2]), (1, 11))
>>> a = rec_array.view(np.int32).reshape(-1, 2)  # let's create a view
>>> a
array([[ 1, 11],
       [ 2, 22]])
>>> rec_array[0][1] = 23
>>> a  # a is a view, not a copy!
array([[ 1, 23],
       [ 2, 22]])
I assume that all columns are of the same type; otherwise rec_array is already what you want.
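To illustrate the mixed-type case, here is a minimal sketch in which the columns do not share one type, so the structured array is kept as is and columns are read by field name. The field names (id, price, qty) and the sample rows are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical sample rows: an int, a float and an int per tuple.
rows = iter([(1, 1.5, 10), (2, 2.5, 20)])

# Hypothetical field names; each column gets its own type.
row_dtype = np.dtype([("id", "i4"), ("price", "f8"), ("qty", "i4")])

# np.fromiter consumes the iterator directly, without an intermediate list.
rec = np.fromiter(rows, dtype=row_dtype)

# Each column is available as a 1-D array under its field name.
ids = rec["id"]
prices = rec["price"]
```

Because the fields have different sizes and types, the view(...).reshape(...) trick above does not apply here; indexing by field name is the straightforward way to get at individual columns.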
Concerning your particular case, I do not completely understand what db is in your example. If it is a cursor object, you can simply call its fetchall method to get a list of tuples. In most cases the database library does not keep a partially read query result around while it waits for your code to process each row; that is, by the time the execute method returns, all the data is already stored in a list, so there is hardly any cost to using fetchall instead of iterating over the cursor instance.
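As a rough sketch of the fetchall route, the example below assumes that db is a sqlite3 connection, which the question does not state. The in-memory database, the sample rows and the ORDER BY clause (added only to make the row order deterministic) are likewise assumptions made for illustration.

```python
import sqlite3
import numpy as np

# Assumption: an in-memory SQLite database stands in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data (col1 INTEGER, col2 INTEGER, col3 INTEGER, col4 INTEGER)"
)
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?, ?)",
    [(1, 2, 3, 4), (5, 6, 7, 8)],
)

# execute() on a sqlite3 connection returns a cursor object.
cursor = conn.execute("SELECT col1, col2, col3, col4 FROM data ORDER BY col1")

# fetchall() returns a list of tuples, which np.array turns into a 2-D array.
rows = cursor.fetchall()
A = np.array(rows)

conn.close()
```

With other database libraries the cursor API is usually similar, but whether the rows are already buffered when execute returns depends on the driver.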