Python:使用Dis分析列表理解 [英] Python: analyze a list comprehension with dis

查看:48
本文介绍了Python:使用Dis分析列表理解的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

最近,我对SO进行了讨论(关于上下文),涉及以下两个代码段:

res = [d.get(next((k for k in d if k in s), None), s) for s in lst]

并且:

res = [next((v for k,v in d.items() if k in s), s) for s in lst]

两者都循环访问列表lst中的字符串s,并在字典d中查找s.如果找到s,则返回关联的值,否则返回s.我很确定第二段代码要比第一段更快,因为(对于每个s)字典中没有查找,而只是对(键,值)对进行了迭代.

问题是: 如何检查这到底是怎么回事?

我第一次尝试使用dis模块,但结果令人失望(python 3.6.3):

>>> dis.dis("[d.get(next((k for k in d if k in s), None), s) for s in lst]")
  1           0 LOAD_CONST               0 (<code object <listcomp> at 0x7f8e302039c0, file "<dis>", line 1>)
              2 LOAD_CONST               1 ('<listcomp>')
              4 MAKE_FUNCTION            0
              6 LOAD_NAME                0 (lst)
              8 GET_ITER
             10 CALL_FUNCTION            1
             12 RETURN_VALUE
>>> dis.dis("[next((v for k,v in d.items() if k in s), s) for s in lst]")
  1           0 LOAD_CONST               0 (<code object <listcomp> at 0x7f8e302038a0, file "<dis>", line 1>)
              2 LOAD_CONST               1 ('<listcomp>')
              4 MAKE_FUNCTION            0
              6 LOAD_NAME                0 (lst)
              8 GET_ITER
             10 CALL_FUNCTION            1
             12 RETURN_VALUE

如何获取更详细的信息?

编辑 正如@abarnert在第一条评论中所建议的那样,我尝试timeit两种解决方案.我玩了以下代码:

from faker import Faker
from timeit import timeit

fake = Faker()

d = {fake.word():fake.word() for _ in range(50000)}
lst = fake.words(500000)

def f():return [d.get(next((k for k in d if k in s), None), s) for s in lst]
def g():return [next((v for k,v in d.items() if k in s), s) for s in lst]

print(timeit(f, number=1))
print(timeit(g, number=1))

assert f() == g()

也许我错过了一些东西,但是令我惊讶的是,第一段代码(f)总是比第二段代码(g)快.因此,第二个问题是:有人在解释吗?

编辑2 以下是反汇编代码中最有趣的部分(只需稍加格式化即可插入内部循环). 对于f:

2           0 BUILD_LIST               0
          2 LOAD_FAST                0 (.0)
    >>    4 FOR_ITER                36 (to 42)
          6 STORE_DEREF              0 (s)
          8 LOAD_GLOBAL              0 (d)
         10 LOAD_ATTR                1 (get)
         12 LOAD_GLOBAL              2 (next)
         14 LOAD_CLOSURE             0 (s)
         16 BUILD_TUPLE              1
         18 LOAD_CONST               0 (<code object <genexpr> at 0x7ff191b1d8a0, file "test.py", line 2>)
         2           0 LOAD_FAST                0 (.0)
               >>    2 FOR_ITER                18 (to 22)
                     4 STORE_FAST               1 (k)
                     6 LOAD_FAST                1 (k)
                     8 LOAD_DEREF               0 (s)
                    10 COMPARE_OP               6 (in)
                    12 POP_JUMP_IF_FALSE        2
                    14 LOAD_FAST                1 (k)
                    16 YIELD_VALUE
                    18 POP_TOP
                    20 JUMP_ABSOLUTE            2
               >>   22 LOAD_CONST               0 (None)
                    24 RETURN_VALUE
         20 LOAD_CONST               1 ('f.<locals>.<listcomp>.<genexpr>')
         22 MAKE_FUNCTION            8
         24 LOAD_GLOBAL              0 (d)
         26 GET_ITER
         28 CALL_FUNCTION            1
         30 LOAD_CONST               2 (None)
         32 CALL_FUNCTION            2
         34 LOAD_DEREF               0 (s)
         36 CALL_FUNCTION            2
         38 LIST_APPEND              2
         40 JUMP_ABSOLUTE            4
    >>   42 RETURN_VALUE

对于g:

3           0 BUILD_LIST               0
          2 LOAD_FAST                0 (.0)
    >>    4 FOR_ITER                32 (to 38)
          6 STORE_DEREF              0 (s)
          8 LOAD_GLOBAL              0 (next)
         10 LOAD_CLOSURE             0 (s)
         12 BUILD_TUPLE              1
         14 LOAD_CONST               0 (<code object <genexpr> at 0x7ff1905171e0, file "test.py", line 3>)
         3           0 LOAD_FAST                0 (.0)
               >>    2 FOR_ITER                22 (to 26)
                     4 UNPACK_SEQUENCE          2
                     6 STORE_FAST               1 (k)
                     8 STORE_FAST               2 (v)
                    10 LOAD_FAST                1 (k)
                    12 LOAD_DEREF               0 (s)
                    14 COMPARE_OP               6 (in)
                    16 POP_JUMP_IF_FALSE        2
                    18 LOAD_FAST                2 (v)
                    20 YIELD_VALUE
                    22 POP_TOP
                    24 JUMP_ABSOLUTE            2
               >>   26 LOAD_CONST               0 (None)
                    28 RETURN_VALUE
         16 LOAD_CONST               1 ('g.<locals>.<listcomp>.<genexpr>')
         18 MAKE_FUNCTION            8
         20 LOAD_GLOBAL              1 (d)
         22 LOAD_ATTR                2 (items)
         24 CALL_FUNCTION            0
         26 GET_ITER
         28 CALL_FUNCTION            1
         30 LOAD_DEREF               0 (s)
         32 CALL_FUNCTION            2
         34 LIST_APPEND              2
         36 JUMP_ABSOLUTE            4
    >>   38 RETURN_VALUE

可以看到(再次由@abarnert提出),g的内部循环包含一些额外的费用:

  1. (隐藏)由d.items()
  2. 上的迭代器构造的2轴
  3. 一个UNPACK_SEQUENCE 2会打开这两个2包的包装,然后将kv放到堆栈上
  4. 两个STORE_FAST,它们从堆栈中弹出kv并将它们存储在co_varnames中.

在最终加载k之前,将其与f中的s进行比较.这个内部循环被迭代|lst|*|d|,并且似乎这些操作有所不同.

如果按照我的想法进行了优化,则d.items()迭代器将首先将k放在堆栈上以测试k in s,然后,仅当k in s为true时,才将v放在YIELD_VALUE的堆栈.

解决方案

您已经获得了所有有关评估列表理解代码的详细信息.

但是列表推导等效于创建然后调用一个函数. (这是它们具有自己的作用域的方式,因此它们没有(例如,将循环变量泄漏到外部作用域中.)因此,您真正希望查看的代码是自动生成的名为<listcomp>的函数.

如果要拆卸它,那么,请注意LOAD_CONST 0表示正在加载<code object <listcomp> at 0x7f8e302038a0吗?那正是你想要的.但是我们无法做到这一点,因为我们所做的只是为了分解而编译了一个字符串,然后将结果扔掉了,所以listcomp函数不再存在了.

但是用真实的代码很容易看到:

>>> def f():
...     return [next((v for k,v in d.items() if k in s), s) for s in lst]
>>> dis.dis(f)
  2           0 LOAD_CONST               1 (<code object <listcomp> at 0x11da9c660, file "<ipython-input-942-698335d58585>", line 2>)
              2 LOAD_CONST               2 ('f.<locals>.<listcomp>')
              4 MAKE_FUNCTION            0
              6 LOAD_GLOBAL              0 (lst)
              8 GET_ITER
             10 CALL_FUNCTION            1
             12 RETURN_VALUE

又有一个代码对象const了-但现在,它不仅仅是我们编译并立即丢弃的const,它也是我们可以访问的函数的一部分.

我们如何访问它?好吧,这记录在 inspect 模块文档中,不是您要寻找的第一位.函数的__code__成员具有一个代码对象,代码对象的co_consts成员具有一个常量序列,我们正在寻找常量#1,因此:

>>> dis.dis(f.__code__.co_consts[1])
  2           0 BUILD_LIST               0
              2 LOAD_FAST                0 (.0)
        >>    4 FOR_ITER                32 (to 38)
              6 STORE_DEREF              0 (s)
              8 LOAD_GLOBAL              0 (next)
             10 LOAD_CLOSURE             0 (s)
             12 BUILD_TUPLE              1
             14 LOAD_CONST               0 (<code object <genexpr> at 0x11dd20030, file "<ipython-input-942-698335d58585>", line 2>)
             16 LOAD_CONST               1 ('f.<locals>.<listcomp>.<genexpr>')
             18 MAKE_FUNCTION            8
             20 LOAD_GLOBAL              1 (d)
             22 LOAD_ATTR                2 (items)
             24 CALL_FUNCTION            0
             26 GET_ITER
             28 CALL_FUNCTION            1
             30 LOAD_DEREF               0 (s)
             32 CALL_FUNCTION            2
             34 LIST_APPEND              2
             36 JUMP_ABSOLUTE            4
        >>   38 RETURN_VALUE

当然,您有一个生成器表达式嵌套在列表推导中,并且,您可能会猜到,这等效于创建然后调用生成器函数.但是,该生成器函数的代码也很容易找到(如果输入起来更麻烦):f.__code__.co_consts[1].co_consts[0].

Recently, I had a discussion on SO (see it for the context) about the two following pieces of code:

res = [d.get(next((k for k in d if k in s), None), s) for s in lst]

And:

res = [next((v for k,v in d.items() if k in s), s) for s in lst]

Both iterate through strings s in a list lst and look for s in a dict d. If s is found, then the associated value is returned, else s is returned. I'm pretty sure the second piece of code is faster than the first, because (for each s) there is no lookup in the dictionary, just an iteration on the (key, value) pairs.

The question is: How to check that this is really what happens under the hood?

I tried, for the first time, the dis module, but the result was disappointing (python 3.6.3):

>>> dis.dis("[d.get(next((k for k in d if k in s), None), s) for s in lst]")
  1           0 LOAD_CONST               0 (<code object <listcomp> at 0x7f8e302039c0, file "<dis>", line 1>)
              2 LOAD_CONST               1 ('<listcomp>')
              4 MAKE_FUNCTION            0
              6 LOAD_NAME                0 (lst)
              8 GET_ITER
             10 CALL_FUNCTION            1
             12 RETURN_VALUE
>>> dis.dis("[next((v for k,v in d.items() if k in s), s) for s in lst]")
  1           0 LOAD_CONST               0 (<code object <listcomp> at 0x7f8e302038a0, file "<dis>", line 1>)
              2 LOAD_CONST               1 ('<listcomp>')
              4 MAKE_FUNCTION            0
              6 LOAD_NAME                0 (lst)
              8 GET_ITER
             10 CALL_FUNCTION            1
             12 RETURN_VALUE

How do I get a more detailed information?

EDIT As suggested by @abarnert in the first comment, I tried to timeit both solutions. I played around with the following code:

from faker import Faker
from timeit import timeit

fake = Faker()

d = {fake.word():fake.word() for _ in range(50000)}
lst = fake.words(500000)

def f():return [d.get(next((k for k in d if k in s), None), s) for s in lst]
def g():return [next((v for k,v in d.items() if k in s), s) for s in lst]

print(timeit(f, number=1))
print(timeit(g, number=1))

assert f() == g()

Maybe I missed something but, to my surprise, the first piece of code (f) was always faster than the second (g). Hence the secondary question: does anyone have an explanation?

EDIT 2 Here are the most interesting parts of the disassembled code (with a little formatting to insert the inner loop). For f:

2           0 BUILD_LIST               0
          2 LOAD_FAST                0 (.0)
    >>    4 FOR_ITER                36 (to 42)
          6 STORE_DEREF              0 (s)
          8 LOAD_GLOBAL              0 (d)
         10 LOAD_ATTR                1 (get)
         12 LOAD_GLOBAL              2 (next)
         14 LOAD_CLOSURE             0 (s)
         16 BUILD_TUPLE              1
         18 LOAD_CONST               0 (<code object <genexpr> at 0x7ff191b1d8a0, file "test.py", line 2>)
         2           0 LOAD_FAST                0 (.0)
               >>    2 FOR_ITER                18 (to 22)
                     4 STORE_FAST               1 (k)
                     6 LOAD_FAST                1 (k)
                     8 LOAD_DEREF               0 (s)
                    10 COMPARE_OP               6 (in)
                    12 POP_JUMP_IF_FALSE        2
                    14 LOAD_FAST                1 (k)
                    16 YIELD_VALUE
                    18 POP_TOP
                    20 JUMP_ABSOLUTE            2
               >>   22 LOAD_CONST               0 (None)
                    24 RETURN_VALUE
         20 LOAD_CONST               1 ('f.<locals>.<listcomp>.<genexpr>')
         22 MAKE_FUNCTION            8
         24 LOAD_GLOBAL              0 (d)
         26 GET_ITER
         28 CALL_FUNCTION            1
         30 LOAD_CONST               2 (None)
         32 CALL_FUNCTION            2
         34 LOAD_DEREF               0 (s)
         36 CALL_FUNCTION            2
         38 LIST_APPEND              2
         40 JUMP_ABSOLUTE            4
    >>   42 RETURN_VALUE

For g:

3           0 BUILD_LIST               0
          2 LOAD_FAST                0 (.0)
    >>    4 FOR_ITER                32 (to 38)
          6 STORE_DEREF              0 (s)
          8 LOAD_GLOBAL              0 (next)
         10 LOAD_CLOSURE             0 (s)
         12 BUILD_TUPLE              1
         14 LOAD_CONST               0 (<code object <genexpr> at 0x7ff1905171e0, file "test.py", line 3>)
         3           0 LOAD_FAST                0 (.0)
               >>    2 FOR_ITER                22 (to 26)
                     4 UNPACK_SEQUENCE          2
                     6 STORE_FAST               1 (k)
                     8 STORE_FAST               2 (v)
                    10 LOAD_FAST                1 (k)
                    12 LOAD_DEREF               0 (s)
                    14 COMPARE_OP               6 (in)
                    16 POP_JUMP_IF_FALSE        2
                    18 LOAD_FAST                2 (v)
                    20 YIELD_VALUE
                    22 POP_TOP
                    24 JUMP_ABSOLUTE            2
               >>   26 LOAD_CONST               0 (None)
                    28 RETURN_VALUE
         16 LOAD_CONST               1 ('g.<locals>.<listcomp>.<genexpr>')
         18 MAKE_FUNCTION            8
         20 LOAD_GLOBAL              1 (d)
         22 LOAD_ATTR                2 (items)
         24 CALL_FUNCTION            0
         26 GET_ITER
         28 CALL_FUNCTION            1
         30 LOAD_DEREF               0 (s)
         32 CALL_FUNCTION            2
         34 LIST_APPEND              2
         36 JUMP_ABSOLUTE            4
    >>   38 RETURN_VALUE

One can see that (again as suggested by @abarnert) the inner loop of g contains some extra cost:

  1. (hidden) the construction of the 2-uples by the iterator on d.items()
  2. an UNPACK_SEQUENCE 2 which unpacks those 2-uples and then puts k and v on the stack
  3. two STORE_FAST which pop k and v from the stack to store them in co_varnames.

Before it finally loads k to compare it with s as in f. This inner loop is iterated |lst|*|d| and It seems that those operations make the difference.

If this was optimized as I thought it was, the d.items() iterator would have put first k on the stack to test k in s, and then, only if k in s was true, put v on the stack for the YIELD_VALUE.

解决方案

You've already got all the detailed information there is about the code that evaluates the list comprehension.

But list comprehensions are equivalent to creating and then calling a function. (This is how they have their own scope, so they don't, e.g., leak loop variables into the outer scope.) So that automatically-generated function named <listcomp> is what you really want to see the code for.

If you want to disassemble it—well, notice that LOAD_CONST 0 says it's loading a <code object <listcomp> at 0x7f8e302038a0? That's what you want. But we can't get to it, because all we did was compile a string for the sake of disassembling it, then throw the result away, so the listcomp function isn't around anymore.

But it's pretty easy to see with real code:

>>> def f():
...     return [next((v for k,v in d.items() if k in s), s) for s in lst]
>>> dis.dis(f)
  2           0 LOAD_CONST               1 (<code object <listcomp> at 0x11da9c660, file "<ipython-input-942-698335d58585>", line 2>)
              2 LOAD_CONST               2 ('f.<locals>.<listcomp>')
              4 MAKE_FUNCTION            0
              6 LOAD_GLOBAL              0 (lst)
              8 GET_ITER
             10 CALL_FUNCTION            1
             12 RETURN_VALUE

There's that code object const again—but now it's not just a const we compiled and immediately threw away, it's part of a function we can access.

How do we access it? Well, this is documented in the inspect module docs, which probably isn't the first place you'd look. Functions have a code object in their __code__ member, code objects have a sequence of constants in their co_consts member, and we're looking for constant #1, so:

>>> dis.dis(f.__code__.co_consts[1])
  2           0 BUILD_LIST               0
              2 LOAD_FAST                0 (.0)
        >>    4 FOR_ITER                32 (to 38)
              6 STORE_DEREF              0 (s)
              8 LOAD_GLOBAL              0 (next)
             10 LOAD_CLOSURE             0 (s)
             12 BUILD_TUPLE              1
             14 LOAD_CONST               0 (<code object <genexpr> at 0x11dd20030, file "<ipython-input-942-698335d58585>", line 2>)
             16 LOAD_CONST               1 ('f.<locals>.<listcomp>.<genexpr>')
             18 MAKE_FUNCTION            8
             20 LOAD_GLOBAL              1 (d)
             22 LOAD_ATTR                2 (items)
             24 CALL_FUNCTION            0
             26 GET_ITER
             28 CALL_FUNCTION            1
             30 LOAD_DEREF               0 (s)
             32 CALL_FUNCTION            2
             34 LIST_APPEND              2
             36 JUMP_ABSOLUTE            4
        >>   38 RETURN_VALUE

Of course you have a generator expression nested inside your list comprehension, and, as you can probably guess, that's also equivalent to creating and then calling a generator function. But that generator function's code is just as easy to find (if even more tedious to type out): f.__code__.co_consts[1].co_consts[0].

这篇关于Python:使用Dis分析列表理解的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆