List of lists to list of tuples without loops or list comprehensions


Question

I have a list of lists, say [[1,2], [2,3], [1,3]], that I want to convert to a list of tuples, [(1,2), (2,3), (1,3)]. This is easy to do with a list comprehension:

[tuple(l) for l in list]

However, this will be slow for large lists, so I would like to do the same thing using pure numpy operations.

Edit 1: I will try to make this clearer.

I have a function, say foo(), which returns a Python list of lists:

def foo(*args):
   # Do something
   return arr

arr has a list-of-lists structure, arr = [[a,b], [c,d],...]. Each inner list (e.g. [a, b]) is 2 elements long, and arr contains a large number of such lists (typically more than 90,000).

However, I need each inner list to be a tuple, for immutability, like

arr = [(a,b), (c, d),...]

This can be done with a list comprehension:

def result(arr):
    return [tuple(l) for l in arr]

However, since the list is large, I would like to avoid this and use pure numpy functions instead. (@hpaulj suggested using arr.view(); see also his other method using dict() and zip() in his answer below.)

I would like to know whether this is feasible and, if so, how to do it.

Recommended answer

Your sample list, and an array made from it:

In [26]: alist = [[1,2], [2,3], [1,3]]
In [27]: arr = np.array(alist)
In [28]: arr
Out[28]: 
array([[1, 2],
       [2, 3],
       [1, 3]])

tolist is a relatively fast way of 'unpacking' an array, but it produces a list of lists, just like the one we started with:

In [29]: arr.tolist()
Out[29]: [[1, 2], [2, 3], [1, 3]]

So converting that to a list of tuples requires the same list comprehension:

In [30]: [tuple(x) for x in arr.tolist()]
Out[30]: [(1, 2), (2, 3), (1, 3)]
In [31]: [tuple(x) for x in alist]
Out[31]: [(1, 2), (2, 3), (1, 3)]
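
As an aside that is not part of the original answer: if the goal is only to avoid writing the comprehension, the built-in map gives the same result. This is a minimal sketch, not a numpy solution. map still calls tuple once per inner list, so the per-row work is unchanged and only the loop moves out of your own code.

```python
# Sample data from the question.
alist = [[1, 2], [2, 3], [1, 3]]

# map applies tuple to each inner list; list() collects the results.
tuples = list(map(tuple, alist))
print(tuples)
```

Whether this is measurably faster than the comprehension depends on the Python version and the data, so it is worth timing on the real input.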

Now, if the array has a compound dtype, tolist does produce a list of tuples. Conversely, to create a structured array from a list, we need a list of tuples:

In [33]: arr1 = np.array([tuple(x) for x in alist], dtype='i,i')
In [34]: arr1
Out[34]: array([(1, 2), (2, 3), (1, 3)], dtype=[('f0', '<i4'), ('f1', '<i4')])
In [35]: arr1.tolist()
Out[35]: [(1, 2), (2, 3), (1, 3)]

Constructing a structured array from a 2d array is kind of tricky:

In [37]: arr.view('i,i')
Out[37]: 
array([[(1, 0), (2, 0)],
       [(2, 0), (3, 0)],
       [(1, 0), (3, 0)]], dtype=[('f0', '<i4'), ('f1', '<i4')])

astype isn't much better. In fact, more than once I've recommended going the tolist route:

np.array([tuple(x) for x in arr.tolist()],'i,i')

In[33] is one case where a list of tuples matters. That's because the numpy developers have chosen to interpret a tuple as a structured array 'marker'.

I can't think of regular Python cases where a list of tuples is required and a list of lists won't do. Usually the significant difference between tuples and lists is that tuples are immutable. OK, that immutability does matter when constructing dictionary keys (or set elements).

In [42]: dict(zip(alist,['a','b','c']))
....
TypeError: unhashable type: 'list'
In [43]: dict(zip([tuple(x) for x in alist],['a','b','c']))
Out[43]: {(1, 2): 'a', (1, 3): 'c', (2, 3): 'b'}

Corrected view conversion to structured array

My earlier attempt at using view was wrong because I used the wrong dtype ('i,i' is a pair of 4-byte integers, but the array holds 8-byte integers):

In [45]: arr.dtype
Out[45]: dtype('int64')
In [46]: arr.view('i8,i8')
Out[46]: 
array([[(1, 2)],
       [(2, 3)],
       [(1, 3)]], dtype=[('f0', '<i8'), ('f1', '<i8')])
In [47]: arr.view('i8,i8').tolist()
Out[47]: [[(1, 2)], [(2, 3)], [(1, 3)]]

Better, though now I have tuples within lists. Reshaping to 1d fixes that:

In [48]: arr.view('i8,i8').reshape(3).tolist()
Out[48]: [(1, 2), (2, 3), (1, 3)]
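
The session above hard-codes 'i8,i8' and a length of 3. A more general version might look like the sketch below. The helper name rows_to_tuples is hypothetical (it is not from the answer), and the sketch assumes a non-empty 2d array with a plain numeric dtype. It builds the structured dtype from arr.dtype, so the field size always matches the element size, which is the mistake the earlier 'i,i' attempt made.

```python
import numpy as np

def rows_to_tuples(arr):
    """Return the rows of a 2d numeric array as a list of tuples.

    Hypothetical helper; assumes a non-empty 2d array of one plain dtype.
    """
    # view needs contiguous rows; this copies only if arr is not C-contiguous.
    arr = np.ascontiguousarray(arr)
    n_rows, n_cols = arr.shape
    # One field per column, each with the array's own element dtype.
    struct_dtype = np.dtype([(f"f{i}", arr.dtype) for i in range(n_cols)])
    # The view has shape (n_rows, 1); flatten it before calling tolist.
    return arr.view(struct_dtype).reshape(n_rows).tolist()

arr = np.array([[1, 2], [2, 3], [1, 3]])
print(rows_to_tuples(arr))
```

The function starts from an array, so when the data begins as a Python list the cost of np.array(alist) has to be counted as well.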

This avoids the list comprehension, but it isn't faster:

In [49]: timeit arr.view('i8,i8').reshape(3).tolist()
21.4 µs ± 51.1 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [50]: timeit [tuple(x) for x in arr]
6.26 µs ± 5.51 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Time tests for creating a dictionary from a list of lists vs. a list of tuples:

In [51]: timeit dict(zip([tuple(x) for x in alist],['a','b','c']))
2.67 µs ± 21.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [52]: timeit dict(zip(Out[48],['a','b','c']))
1.31 µs ± 5.96 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

Obviously you need to run time tests on realistic problems, but this small example suggests how those will go. Despite all the talk about numpy operations being fast(er), list comprehensions aren't that bad, especially if the result is going to be a list of Python objects anyway.
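
To run that kind of test at the size mentioned in the question, a small timeit harness like the one below could be used. It is only a sketch. The data is a made-up stand-in for the output of foo(), the function names are hypothetical, and no timings are quoted here because they depend on the machine and the numpy version. The view route starts from an existing int64 array, so the cost of building that array from the list is not included.

```python
import timeit
import numpy as np

n_rows = 90_000  # size mentioned in the question
alist = [[i, i + 1] for i in range(n_rows)]  # stand-in for foo()'s output
arr = np.array(alist, dtype=np.int64)        # built once, outside the timing

def with_comprehension():
    # Plain Python: one tuple() call per inner list.
    return [tuple(x) for x in alist]

def with_view():
    # Structured view of the int64 array, flattened, then unpacked.
    return arr.view('i8,i8').reshape(n_rows).tolist()

for func in (with_comprehension, with_view):
    repeats = 10
    seconds = timeit.timeit(func, number=repeats)
    print(f"{func.__name__}: {seconds / repeats * 1e3:.2f} ms per call")
```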
