How to extract list from list of tuples
I have a list of tuples like this:
data = [(4, [1, 2]),
        (10, [3, 13]),
        (9, [14, 6]),
        (10, [7, 5]),
        (19, [2, 7]),
        (15, [15, 5]),
        (21, [9, 12]),
        (250, [11, 11]),
        (25, [5, 5]),
        (100, [2, 10]),
        (120, [8, 11]),
        ]
How can I get three separate lists from it without using a loop or explicit iteration, like:
a = [4,10,9,10,19,...]
b = [1,3,14,7,2,...]
c = [2,13,6,5,7,...]
My attempt with a loop:
a = []
b = []
c = []
for t in data:
    a.append(t[0])
    b.append(t[1][0])
    c.append(t[1][1])
You basically want to flatten and "unzip":
>>> def flatten(row):
...     a, [b, c] = row
...     return a, b, c
...
>>> data = [(4, [1, 2]),
...         (10, [3, 13]),
...         (9, [14, 6]),
...         (10, [7, 5]),
...         (19, [2, 7]),
...         (15, [15, 5]),
...         (21, [9, 12]),
...         (250, [11, 11]),
...         (25, [5, 5]),
...         (100, [2, 10]),
...         (120, [8, 11]),
...         ]
>>> a,b,c = zip(*map(flatten, data))
>>> a
(4, 10, 9, 10, 19, 15, 21, 250, 25, 100, 120)
>>> b
(1, 3, 14, 7, 2, 15, 9, 11, 5, 2, 8)
>>> c
(2, 13, 6, 5, 7, 5, 12, 11, 5, 10, 11)
Or as a list comprehension:
a, b, c = zip(*[(a,b,c) for a, [b, c] in data])
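Note that zip produces tuples, as the output above shows, while the question asks for lists. One possible way to get lists is to convert each result afterwards. This is only a minimal sketch; it uses a shortened copy of data, and the loop variable names x, y, z are chosen here for illustration:

```python
# Shortened copy of the question's data, for illustration only.
data = [(4, [1, 2]), (10, [3, 13]), (9, [14, 6]), (10, [7, 5])]

# zip(*...) yields three tuples; map(list, ...) turns each one into a list.
a, b, c = map(list, zip(*[(x, y, z) for x, [y, z] in data]))

print(a)  # [4, 10, 9, 10]
print(b)  # [1, 3, 14, 7]
print(c)  # [2, 13, 6, 5]
```

Keep in mind that this unpacking raises ValueError when data is empty, because zip then yields nothing to unpack into a, b and c.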
But honestly, the loop you have is fine. I would probably just use that, but clean it up like this:
a = []
b = []
c = []
for i, (j, k) in data:
    a.append(i)
    b.append(j)
    c.append(k)
EDIT
Here are some timings. I'll use the comprehension version (written as a generator expression in with_unzip), because it should be faster than using map (no repeated function-call overhead):
>>> def with_unzip(data):
...     a, b, c = zip(*((a, b, c) for a, [b, c] in data))
...     return a, b, c
...
>>> def with_loop(data):
...     a = []
...     b = []
...     c = []
...     for i, [j, k] in data:
...         a.append(i)
...         b.append(j)
...         c.append(k)
...     return a, b, c
...
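Before timing the two functions, it may be worth checking that they produce the same values. The sketch below repeats the two definitions as a plain script so that it runs on its own, and uses a shortened copy of data. The only difference it has to account for is that with_unzip returns tuples while with_loop returns lists:

```python
def with_unzip(data):
    # Flatten each row, then transpose rows into three columns.
    a, b, c = zip(*((a, b, c) for a, [b, c] in data))
    return a, b, c


def with_loop(data):
    a, b, c = [], [], []
    for i, [j, k] in data:
        a.append(i)
        b.append(j)
        c.append(k)
    return a, b, c


# Shortened copy of the question's data, for illustration only.
data = [(4, [1, 2]), (10, [3, 13]), (9, [14, 6])]

# Normalise both results to lists of lists before comparing.
unzipped = [list(col) for col in with_unzip(data)]
looped = [list(col) for col in with_loop(data)]
print(unzipped == looped)  # True
```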
Now, let's set up a reasonably large dataset to test this with:
>>> import timeit
>>> data_big = data * 10_000
>>> print(f'{len(data_big):,d}')
110,000
>>> timeit.timeit(lambda : with_unzip(data_big), number=100)
2.4092896859999655
>>> timeit.timeit(lambda : with_loop(data_big), number=100)
2.0086487390001366
And you can see that it doesn't scale well:
>>> data_big = data_big * 100
>>> print(f'{len(data_big):,d}')
11,000,000
>>> timeit.timeit(lambda : with_unzip(data_big), number=10) # made number smaller
27.03781091399992
>>> timeit.timeit(lambda : with_loop(data_big), number=10) # made number smaller
17.5005345510001
This is probably because a, b, c = zip(*whatever) ultimately makes two passes over the data: unpacking the arguments into zip isn't free, since every row has to be collected into the argument list before zip starts iterating. You can really see the effect of this if we micro-optimize the looping version by caching the .append method resolution:
>>> def with_loop_microp(data):
...     a = []
...     b = []
...     c = []
...     a_append = a.append
...     b_append = b.append
...     c_append = c.append
...     for i, (j, k) in data:
...         a_append(i)
...         b_append(j)
...         c_append(k)
...     return a, b, c
...
>>> timeit.timeit(lambda : with_loop_microp(data_big), number=10) # made number smaller
10.746374250000144
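The point about unpacking can be illustrated with a small sketch. It shows only that zip(*iterable) consumes the whole iterable before zip returns anything; it does not by itself prove that this accounts for the timing difference. The names tracked_rows and consumed are hypothetical helpers introduced here:

```python
consumed = []  # records every row pulled from the generator below


def tracked_rows(data):
    # Yield rows one by one, noting each row as it is consumed.
    for row in data:
        consumed.append(row)
        yield row


# Shortened copy of the question's data, for illustration only.
data = [(4, [1, 2]), (10, [3, 13]), (9, [14, 6])]

# The * unpacking has to build the full argument tuple before zip is called.
columns = zip(*((a, b, c) for a, [b, c] in tracked_rows(data)))

# No column has been requested yet, but every row was already consumed.
rows_seen_before_iterating = len(consumed)
print(rows_seen_before_iterating)  # 3

# Iterating over zip is the second pass, over the collected row tuples.
first_column = next(columns)
print(first_column)  # (4, 10, 9)
```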