Python:使用Dis分析列表理解 [英] Python: analyze a list comprehension with dis
问题描述
最近,我对SO进行了讨论(关于上下文),涉及以下两个代码段:
res = [d.get(next((k for k in d if k in s), None), s) for s in lst]
并且:
res = [next((v for k,v in d.items() if k in s), s) for s in lst]
两者都循环访问列表lst
中的字符串s
,并在字典d
中查找s
.如果找到s
,则返回关联的值,否则返回s
.我很确定第二段代码要比第一段更快,因为(对于每个s
)字典中没有查找,而只是对(键,值)对进行了迭代.
问题是: 如何检查这到底是怎么回事?
我第一次尝试使用dis
模块,但结果令人失望(python 3.6.3):
>>> dis.dis("[d.get(next((k for k in d if k in s), None), s) for s in lst]")
1 0 LOAD_CONST 0 (<code object <listcomp> at 0x7f8e302039c0, file "<dis>", line 1>)
2 LOAD_CONST 1 ('<listcomp>')
4 MAKE_FUNCTION 0
6 LOAD_NAME 0 (lst)
8 GET_ITER
10 CALL_FUNCTION 1
12 RETURN_VALUE
>>> dis.dis("[next((v for k,v in d.items() if k in s), s) for s in lst]")
1 0 LOAD_CONST 0 (<code object <listcomp> at 0x7f8e302038a0, file "<dis>", line 1>)
2 LOAD_CONST 1 ('<listcomp>')
4 MAKE_FUNCTION 0
6 LOAD_NAME 0 (lst)
8 GET_ITER
10 CALL_FUNCTION 1
12 RETURN_VALUE
如何获取更详细的信息?
编辑
正如@abarnert在第一条评论中所建议的那样,我尝试timeit
两种解决方案.我玩了以下代码:
from faker import Faker
from timeit import timeit
fake = Faker()
d = {fake.word():fake.word() for _ in range(50000)}
lst = fake.words(500000)
def f():return [d.get(next((k for k in d if k in s), None), s) for s in lst]
def g():return [next((v for k,v in d.items() if k in s), s) for s in lst]
print(timeit(f, number=1))
print(timeit(g, number=1))
assert f() == g()
也许我错过了一些东西,但是令我惊讶的是,第一段代码(f
)总是比第二段代码(g
)快.因此,第二个问题是:有人在解释吗?
编辑2 以下是反汇编代码中最有趣的部分(只需稍加格式化即可插入内部循环).
对于f
:
2 0 BUILD_LIST 0
2 LOAD_FAST 0 (.0)
>> 4 FOR_ITER 36 (to 42)
6 STORE_DEREF 0 (s)
8 LOAD_GLOBAL 0 (d)
10 LOAD_ATTR 1 (get)
12 LOAD_GLOBAL 2 (next)
14 LOAD_CLOSURE 0 (s)
16 BUILD_TUPLE 1
18 LOAD_CONST 0 (<code object <genexpr> at 0x7ff191b1d8a0, file "test.py", line 2>)
2 0 LOAD_FAST 0 (.0)
>> 2 FOR_ITER 18 (to 22)
4 STORE_FAST 1 (k)
6 LOAD_FAST 1 (k)
8 LOAD_DEREF 0 (s)
10 COMPARE_OP 6 (in)
12 POP_JUMP_IF_FALSE 2
14 LOAD_FAST 1 (k)
16 YIELD_VALUE
18 POP_TOP
20 JUMP_ABSOLUTE 2
>> 22 LOAD_CONST 0 (None)
24 RETURN_VALUE
20 LOAD_CONST 1 ('f.<locals>.<listcomp>.<genexpr>')
22 MAKE_FUNCTION 8
24 LOAD_GLOBAL 0 (d)
26 GET_ITER
28 CALL_FUNCTION 1
30 LOAD_CONST 2 (None)
32 CALL_FUNCTION 2
34 LOAD_DEREF 0 (s)
36 CALL_FUNCTION 2
38 LIST_APPEND 2
40 JUMP_ABSOLUTE 4
>> 42 RETURN_VALUE
对于g
:
3 0 BUILD_LIST 0
2 LOAD_FAST 0 (.0)
>> 4 FOR_ITER 32 (to 38)
6 STORE_DEREF 0 (s)
8 LOAD_GLOBAL 0 (next)
10 LOAD_CLOSURE 0 (s)
12 BUILD_TUPLE 1
14 LOAD_CONST 0 (<code object <genexpr> at 0x7ff1905171e0, file "test.py", line 3>)
3 0 LOAD_FAST 0 (.0)
>> 2 FOR_ITER 22 (to 26)
4 UNPACK_SEQUENCE 2
6 STORE_FAST 1 (k)
8 STORE_FAST 2 (v)
10 LOAD_FAST 1 (k)
12 LOAD_DEREF 0 (s)
14 COMPARE_OP 6 (in)
16 POP_JUMP_IF_FALSE 2
18 LOAD_FAST 2 (v)
20 YIELD_VALUE
22 POP_TOP
24 JUMP_ABSOLUTE 2
>> 26 LOAD_CONST 0 (None)
28 RETURN_VALUE
16 LOAD_CONST 1 ('g.<locals>.<listcomp>.<genexpr>')
18 MAKE_FUNCTION 8
20 LOAD_GLOBAL 1 (d)
22 LOAD_ATTR 2 (items)
24 CALL_FUNCTION 0
26 GET_ITER
28 CALL_FUNCTION 1
30 LOAD_DEREF 0 (s)
32 CALL_FUNCTION 2
34 LIST_APPEND 2
36 JUMP_ABSOLUTE 4
>> 38 RETURN_VALUE
可以看到(再次由@abarnert提出),g
的内部循环包含一些额外的费用:
- (隐藏)由
d.items()
上的迭代器构造的2轴
- 一个
UNPACK_SEQUENCE 2
会打开这两个2包的包装,然后将k
和v
放到堆栈上 - 两个
STORE_FAST
,它们从堆栈中弹出k
和v
并将它们存储在co_varnames
中.
在最终加载k
之前,将其与f
中的s
进行比较.这个内部循环被迭代|lst|*|d|
,并且似乎这些操作有所不同.
如果按照我的想法进行了优化,则d.items()
迭代器将首先将k
放在堆栈上以测试k in s
,然后,仅当k in s
为true时,才将v
放在YIELD_VALUE
的堆栈.
您已经获得了所有有关评估列表理解代码的详细信息.
但是列表推导等效于创建然后调用一个函数. (这是它们具有自己的作用域的方式,因此它们没有(例如,将循环变量泄漏到外部作用域中.)因此,您真正希望查看的代码是自动生成的名为<listcomp>
的函数.>
如果要拆卸它,那么,请注意LOAD_CONST 0
表示正在加载<code object <listcomp> at 0x7f8e302038a0
吗?那正是你想要的.但是我们无法做到这一点,因为我们所做的只是为了分解而编译了一个字符串,然后将结果扔掉了,所以listcomp函数不再存在了.
但是用真实的代码很容易看到:
>>> def f():
... return [next((v for k,v in d.items() if k in s), s) for s in lst]
>>> dis.dis(f)
2 0 LOAD_CONST 1 (<code object <listcomp> at 0x11da9c660, file "<ipython-input-942-698335d58585>", line 2>)
2 LOAD_CONST 2 ('f.<locals>.<listcomp>')
4 MAKE_FUNCTION 0
6 LOAD_GLOBAL 0 (lst)
8 GET_ITER
10 CALL_FUNCTION 1
12 RETURN_VALUE
又有一个代码对象const了-但现在,它不仅仅是我们编译并立即丢弃的const,它也是我们可以访问的函数的一部分.
我们如何访问它?好吧,这记录在 inspect
模块文档中,不是您要寻找的第一位.函数的__code__
成员具有一个代码对象,代码对象的co_consts
成员具有一个常量序列,我们正在寻找常量#1,因此:
>>> dis.dis(f.__code__.co_consts[1])
2 0 BUILD_LIST 0
2 LOAD_FAST 0 (.0)
>> 4 FOR_ITER 32 (to 38)
6 STORE_DEREF 0 (s)
8 LOAD_GLOBAL 0 (next)
10 LOAD_CLOSURE 0 (s)
12 BUILD_TUPLE 1
14 LOAD_CONST 0 (<code object <genexpr> at 0x11dd20030, file "<ipython-input-942-698335d58585>", line 2>)
16 LOAD_CONST 1 ('f.<locals>.<listcomp>.<genexpr>')
18 MAKE_FUNCTION 8
20 LOAD_GLOBAL 1 (d)
22 LOAD_ATTR 2 (items)
24 CALL_FUNCTION 0
26 GET_ITER
28 CALL_FUNCTION 1
30 LOAD_DEREF 0 (s)
32 CALL_FUNCTION 2
34 LIST_APPEND 2
36 JUMP_ABSOLUTE 4
>> 38 RETURN_VALUE
当然,您有一个生成器表达式嵌套在列表推导中,并且,您可能会猜到,这等效于创建然后调用生成器函数.但是,该生成器函数的代码也很容易找到(如果输入起来更麻烦):f.__code__.co_consts[1].co_consts[0]
.
Recently, I had a discussion on SO (see it for the context) about the two following pieces of code:
res = [d.get(next((k for k in d if k in s), None), s) for s in lst]
And:
res = [next((v for k,v in d.items() if k in s), s) for s in lst]
Both iterate through strings s
in a list lst
and look for s
in a dict d
. If s
is found, then the associated value is returned, else s
is returned. I'm pretty sure the second piece of code is faster than the first, because (for each s
) there is no lookup in the dictionary, just an iteration on the (key, value) pairs.
The question is: How to check that this is really what happens under the hood?
I tried, for the first time, the dis
module, but the result was disappointing (python 3.6.3):
>>> dis.dis("[d.get(next((k for k in d if k in s), None), s) for s in lst]")
1 0 LOAD_CONST 0 (<code object <listcomp> at 0x7f8e302039c0, file "<dis>", line 1>)
2 LOAD_CONST 1 ('<listcomp>')
4 MAKE_FUNCTION 0
6 LOAD_NAME 0 (lst)
8 GET_ITER
10 CALL_FUNCTION 1
12 RETURN_VALUE
>>> dis.dis("[next((v for k,v in d.items() if k in s), s) for s in lst]")
1 0 LOAD_CONST 0 (<code object <listcomp> at 0x7f8e302038a0, file "<dis>", line 1>)
2 LOAD_CONST 1 ('<listcomp>')
4 MAKE_FUNCTION 0
6 LOAD_NAME 0 (lst)
8 GET_ITER
10 CALL_FUNCTION 1
12 RETURN_VALUE
How do I get a more detailed information?
EDIT
As suggested by @abarnert in the first comment, I tried to timeit
both solutions. I played around with the following code:
from faker import Faker
from timeit import timeit
fake = Faker()
d = {fake.word():fake.word() for _ in range(50000)}
lst = fake.words(500000)
def f():return [d.get(next((k for k in d if k in s), None), s) for s in lst]
def g():return [next((v for k,v in d.items() if k in s), s) for s in lst]
print(timeit(f, number=1))
print(timeit(g, number=1))
assert f() == g()
Maybe I missed something but, to my surprise, the first piece of code (f
) was always faster than the second (g
). Hence the secondary question: does anyone have an explanation?
EDIT 2 Here are the most interesting parts of the disassembled code (with a little formatting to insert the inner loop).
For f
:
2 0 BUILD_LIST 0
2 LOAD_FAST 0 (.0)
>> 4 FOR_ITER 36 (to 42)
6 STORE_DEREF 0 (s)
8 LOAD_GLOBAL 0 (d)
10 LOAD_ATTR 1 (get)
12 LOAD_GLOBAL 2 (next)
14 LOAD_CLOSURE 0 (s)
16 BUILD_TUPLE 1
18 LOAD_CONST 0 (<code object <genexpr> at 0x7ff191b1d8a0, file "test.py", line 2>)
2 0 LOAD_FAST 0 (.0)
>> 2 FOR_ITER 18 (to 22)
4 STORE_FAST 1 (k)
6 LOAD_FAST 1 (k)
8 LOAD_DEREF 0 (s)
10 COMPARE_OP 6 (in)
12 POP_JUMP_IF_FALSE 2
14 LOAD_FAST 1 (k)
16 YIELD_VALUE
18 POP_TOP
20 JUMP_ABSOLUTE 2
>> 22 LOAD_CONST 0 (None)
24 RETURN_VALUE
20 LOAD_CONST 1 ('f.<locals>.<listcomp>.<genexpr>')
22 MAKE_FUNCTION 8
24 LOAD_GLOBAL 0 (d)
26 GET_ITER
28 CALL_FUNCTION 1
30 LOAD_CONST 2 (None)
32 CALL_FUNCTION 2
34 LOAD_DEREF 0 (s)
36 CALL_FUNCTION 2
38 LIST_APPEND 2
40 JUMP_ABSOLUTE 4
>> 42 RETURN_VALUE
For g
:
3 0 BUILD_LIST 0
2 LOAD_FAST 0 (.0)
>> 4 FOR_ITER 32 (to 38)
6 STORE_DEREF 0 (s)
8 LOAD_GLOBAL 0 (next)
10 LOAD_CLOSURE 0 (s)
12 BUILD_TUPLE 1
14 LOAD_CONST 0 (<code object <genexpr> at 0x7ff1905171e0, file "test.py", line 3>)
3 0 LOAD_FAST 0 (.0)
>> 2 FOR_ITER 22 (to 26)
4 UNPACK_SEQUENCE 2
6 STORE_FAST 1 (k)
8 STORE_FAST 2 (v)
10 LOAD_FAST 1 (k)
12 LOAD_DEREF 0 (s)
14 COMPARE_OP 6 (in)
16 POP_JUMP_IF_FALSE 2
18 LOAD_FAST 2 (v)
20 YIELD_VALUE
22 POP_TOP
24 JUMP_ABSOLUTE 2
>> 26 LOAD_CONST 0 (None)
28 RETURN_VALUE
16 LOAD_CONST 1 ('g.<locals>.<listcomp>.<genexpr>')
18 MAKE_FUNCTION 8
20 LOAD_GLOBAL 1 (d)
22 LOAD_ATTR 2 (items)
24 CALL_FUNCTION 0
26 GET_ITER
28 CALL_FUNCTION 1
30 LOAD_DEREF 0 (s)
32 CALL_FUNCTION 2
34 LIST_APPEND 2
36 JUMP_ABSOLUTE 4
>> 38 RETURN_VALUE
One can see that (again as suggested by @abarnert) the inner loop of g
contains some extra cost:
- (hidden) the construction of the 2-uples by the iterator on
d.items()
- an
UNPACK_SEQUENCE 2
which unpacks those 2-uples and then putsk
andv
on the stack - two
STORE_FAST
which popk
andv
from the stack to store them inco_varnames
.
Before it finally loads k
to compare it with s
as in f
. This inner loop is iterated |lst|*|d|
and It seems that those operations make the difference.
If this was optimized as I thought it was, the d.items()
iterator would have put first k
on the stack to test k in s
, and then, only if k in s
was true, put v
on the stack for the YIELD_VALUE
.
You've already got all the detailed information there is about the code that evaluates the list comprehension.
But list comprehensions are equivalent to creating and then calling a function. (This is how they have their own scope, so they don't, e.g., leak loop variables into the outer scope.) So that automatically-generated function named <listcomp>
is what you really want to see the code for.
If you want to disassemble it—well, notice that LOAD_CONST 0
says it's loading a <code object <listcomp> at 0x7f8e302038a0
? That's what you want. But we can't get to it, because all we did was compile a string for the sake of disassembling it, then throw the result away, so the listcomp function isn't around anymore.
But it's pretty easy to see with real code:
>>> def f():
... return [next((v for k,v in d.items() if k in s), s) for s in lst]
>>> dis.dis(f)
2 0 LOAD_CONST 1 (<code object <listcomp> at 0x11da9c660, file "<ipython-input-942-698335d58585>", line 2>)
2 LOAD_CONST 2 ('f.<locals>.<listcomp>')
4 MAKE_FUNCTION 0
6 LOAD_GLOBAL 0 (lst)
8 GET_ITER
10 CALL_FUNCTION 1
12 RETURN_VALUE
There's that code object const again—but now it's not just a const we compiled and immediately threw away, it's part of a function we can access.
How do we access it? Well, this is documented in the inspect
module docs, which probably isn't the first place you'd look. Functions have a code object in their __code__
member, code objects have a sequence of constants in their co_consts
member, and we're looking for constant #1, so:
>>> dis.dis(f.__code__.co_consts[1])
2 0 BUILD_LIST 0
2 LOAD_FAST 0 (.0)
>> 4 FOR_ITER 32 (to 38)
6 STORE_DEREF 0 (s)
8 LOAD_GLOBAL 0 (next)
10 LOAD_CLOSURE 0 (s)
12 BUILD_TUPLE 1
14 LOAD_CONST 0 (<code object <genexpr> at 0x11dd20030, file "<ipython-input-942-698335d58585>", line 2>)
16 LOAD_CONST 1 ('f.<locals>.<listcomp>.<genexpr>')
18 MAKE_FUNCTION 8
20 LOAD_GLOBAL 1 (d)
22 LOAD_ATTR 2 (items)
24 CALL_FUNCTION 0
26 GET_ITER
28 CALL_FUNCTION 1
30 LOAD_DEREF 0 (s)
32 CALL_FUNCTION 2
34 LIST_APPEND 2
36 JUMP_ABSOLUTE 4
>> 38 RETURN_VALUE
Of course you have a generator expression nested inside your list comprehension, and, as you can probably guess, that's also equivalent to creating and then calling a generator function. But that generator function's code is just as easy to find (if even more tedious to type out): f.__code__.co_consts[1].co_consts[0]
.
这篇关于Python:使用Dis分析列表理解的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!