How to extract lists from a list of tuples

Problem description

I have a list of tuples like this:

data = [(4, [1, 2]),
 (10, [3, 13]),
 (9, [14, 6]),
 (10, [7, 5]),
 (19, [2, 7]),
 (15, [15, 5]),
 (21, [9, 12]),
 (250, [11, 11]),
 (25, [5, 5]),
 (100, [2, 10]),
 (120, [8, 11]),
]

How can I get three separate lists from it without using a loop or iteration, like this:

a = [4,10,9,10,19,...]
b = [1,3,14,7,2,...]
c = [2,13,6,5,7,...]

My attempt with a loop:

a = []
b = []
c = []

for t in data:
  a.append(t[0])
  b.append(t[1][0])
  c.append(t[1][1])

Solution

You basically want to flatten and "unzip":

>>> def flatten(row):
...     a,[b,c] = row
...     return a,b,c
...
>>> data = [(4, [1, 2]),
 (10, [3, 13]),
 (9, [14, 6]),
 (10, [7, 5]),
 (19, [2, 7]),
 (15, [15, 5]),
 (21, [9, 12]),
 (250, [11, 11]),
 (25, [5, 5]),
 (100, [2, 10]),
 (120, [8, 11]),
]
>>> a,b,c = zip(*map(flatten, data))
>>> a
(4, 10, 9, 10, 19, 15, 21, 250, 25, 100, 120)
>>> b
(1, 3, 14, 7, 2, 15, 9, 11, 5, 2, 8)
>>> c
(2, 13, 6, 5, 7, 5, 12, 11, 5, 10, 11)

Or as a list comprehension:

a, b, c = zip(*[(a,b,c) for a, [b, c] in data])
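
Note that zip produces tuples, while the question asks for lists. A minimal sketch of converting each result with map(list, ...), assuming a shortened copy of data for illustration:

```python
# Shortened copy of the question's data, for illustration only.
data = [(4, [1, 2]), (10, [3, 13]), (9, [14, 6])]

# zip(*...) yields one tuple per "column"; map(list, ...) turns each into a list.
a, b, c = map(list, zip(*[(x, y, z) for x, [y, z] in data]))

print(a)  # [4, 10, 9]
print(b)  # [1, 3, 14]
print(c)  # [2, 13, 6]
```

With an empty data, the three-way unpacking raises ValueError, because zip() with no arguments yields nothing to unpack.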

But honestly, the loop you have is fine. I would probably just use that, but clean it up like this:

a = []
b = []
c = []
for i, (j, k) in data:
    a.append(i)
    b.append(j)
    c.append(k)

EDIT

Here are some timings. I'll use the comprehension version (written as a generator expression here), because it should be faster than using map (no repeated function-call overhead):

>>> def with_unzip(data):
...     a,b,c = zip(*((a,b,c) for a,[b,c] in data))
...     return a,b,c
...
>>> def with_loop(data):
...     a = []
...     b = []
...     c = []
...     for i, [j, k] in data:
...         a.append(i)
...         b.append(j)
...         c.append(k)
...     return a,b,c
...

Now, let's set up a reasonably large dataset to test this with:

>>> import timeit
>>> data_big = data * 10_000
>>> print(f'{len(data_big):,d}')
110,000
>>> timeit.timeit(lambda : with_unzip(data_big), number=100)
2.4092896859999655
>>> timeit.timeit(lambda : with_loop(data_big), number=100)
2.0086487390001366
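
Single timeit.timeit measurements like these can vary from run to run. A hedged sketch of a more stable comparison, using timeit.repeat and keeping the fastest run; the best_time helper, the shortened data, and the repeat/number values are assumptions chosen for illustration, not part of the original measurements:

```python
import timeit

# Shortened copy of the question's data, repeated to get a modest test size.
data = [(4, [1, 2]), (10, [3, 13]), (9, [14, 6])] * 1_000

def with_unzip(data):
    a, b, c = zip(*((a, b, c) for a, [b, c] in data))
    return a, b, c

def with_loop(data):
    a, b, c = [], [], []
    for i, [j, k] in data:
        a.append(i)
        b.append(j)
        c.append(k)
    return a, b, c

def best_time(func, rows, repeat=5, number=20):
    # Hypothetical helper: time `number` calls, `repeat` times over,
    # and keep the fastest run as the least noisy estimate.
    return min(timeit.repeat(lambda: func(rows), repeat=repeat, number=number))

unzip_best = best_time(with_unzip, data)
loop_best = best_time(with_loop, data)
print(f"with_unzip: {unzip_best:.4f}s  with_loop: {loop_best:.4f}s")
```

The absolute numbers depend on the machine and Python version, so only the relative ordering is worth comparing.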

And see that it doesn't scale well:

>>> data_big = data_big * 100
>>> print(f'{len(data_big):,d}')
11,000,000
>>> timeit.timeit(lambda : with_unzip(data_big), number=10) # made number smaller
27.03781091399992
>>> timeit.timeit(lambda : with_loop(data_big), number=10) # made number smaller
17.5005345510001

This is probably because a, b, c = zip(*whatever) ultimately makes two passes over the data: unpacking the arguments into zip isn't free, since every row has to be collected into an argument tuple before zip can start. You can really see the effect of this if we micro-optimize the looping version by caching the .append method lookups:

>>> def with_loop_microp(data):
...     a = []
...     b = []
...     c = []
...     a_append = a.append
...     b_append = b.append
...     c_append = c.append
...     for i, (j, k) in data:
...         a_append(i)
...         b_append(j)
...         c_append(k)
...     return a,b,c
...
>>> timeit.timeit(lambda : with_loop_microp(data_big), number=10) # made number smaller
10.746374250000144
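
The two-pass behaviour of zip(*...) can be observed directly. A small sketch, assuming a shortened copy of data and a hypothetical flat_rows generator that records each row as it is produced:

```python
# Shortened copy of the question's data, for illustration only.
data = [(4, [1, 2]), (10, [3, 13]), (9, [14, 6])]

consumed = []  # records which rows the generator has produced so far

def flat_rows(rows):
    # Hypothetical helper: flatten each row and log that it was produced.
    for a, (b, c) in rows:
        consumed.append(a)
        yield a, b, c

# First pass: the * unpacking drains the generator to build zip's arguments.
columns = zip(*flat_rows(data))
print(consumed)  # [4, 10, 9], although nothing was requested from columns yet

# Second pass: zip walks the already materialized rows to build each column.
a, b, c = columns
print(a, b, c)  # (4, 10, 9) (1, 3, 14) (2, 13, 6)
```

The loop versions touch each row only once, which fits the gap in the timings above.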
