Python Compilation/Interpretation Process

Problem description

I'm trying to understand the Python compiler/interpreter process more clearly. Unfortunately, I have not taken a class on interpreters, nor have I read much about them.

Basically, what I understand right now is that Python code from .py files is first compiled into Python bytecode (which I assume are the .pyc files I see occasionally?). Next, the bytecode is compiled into machine code, a language the processor actually understands. That's pretty much it. I've read this thread: Why python compile the source to bytecode before interpreting?

Could somebody give me a good explanation of the whole process keeping in mind that my knowledge of compilers/interpreters is almost non-existent? Or, if that's not possible, maybe give me some resources that give quick overviews of compilers/interpreters?

Thanks

Solution

The bytecode is not actually compiled to machine code, unless you are using an implementation with a JIT compiler, such as PyPy.

Other than that, you have the description correct. The bytecode is loaded into the Python runtime and interpreted by a virtual machine, which is a piece of code that reads each instruction in the bytecode and executes whatever operation is indicated. You can see this bytecode with the dis module, as follows:

>>> def fib(n): return n if n < 2 else fib(n - 2) + fib(n - 1)
... 
>>> fib(10)
55
>>> import dis
>>> dis.dis(fib)
  1           0 LOAD_FAST                0 (n)
              3 LOAD_CONST               1 (2)
              6 COMPARE_OP               0 (<)
              9 JUMP_IF_FALSE            5 (to 17)
             12 POP_TOP             
             13 LOAD_FAST                0 (n)
             16 RETURN_VALUE        
        >>   17 POP_TOP             
             18 LOAD_GLOBAL              0 (fib)
             21 LOAD_FAST                0 (n)
             24 LOAD_CONST               1 (2)
             27 BINARY_SUBTRACT     
             28 CALL_FUNCTION            1
             31 LOAD_GLOBAL              0 (fib)
             34 LOAD_FAST                0 (n)
             37 LOAD_CONST               2 (1)
             40 BINARY_SUBTRACT     
             41 CALL_FUNCTION            1
             44 BINARY_ADD          
             45 RETURN_VALUE        
>>> 
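The session above comes from Python 2, and opcode names and offsets differ between CPython versions. As a rough sketch for Python 3 (assuming 3.4 or later, where `dis.get_instructions` is available), the same disassembly can be walked programmatically. The output is deliberately not shown here because it depends on the interpreter version.

```python
import dis


def fib(n):
    return n if n < 2 else fib(n - 2) + fib(n - 1)


# Each Instruction records the byte offset, the opcode name, and a
# human-readable form of the argument. The exact opcode names printed
# depend on the interpreter version.
instructions = list(dis.get_instructions(fib))
for ins in instructions:
    print(ins.offset, ins.opname, ins.argrepr)
```

Calling `dis.dis(fib)` still works in Python 3 as well; `get_instructions` is just handier when you want to inspect the instructions from code rather than read a printed listing.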

Detailed explanation

It is quite important to understand that the above code is never executed by your CPU; nor is it ever converted into something that is (at least, not on the official C implementation of Python). The CPU executes the virtual machine's own code, which performs the work indicated by the bytecode instructions. When the interpreter wants to execute the fib function, it reads the instructions one at a time and does what they tell it to do. It looks at the first instruction, LOAD_FAST 0, and thus grabs parameter 0 (the n passed to fib) from wherever parameters are held and pushes it onto the interpreter's stack (Python's interpreter is a stack machine). On reading the next instruction, LOAD_CONST 1, it grabs constant number 1 from a collection of constants owned by the function, which happens to be the number 2 in this case, and pushes that onto the stack. You can actually see these constants:

>>> fib.func_code.co_consts
(None, 2, 1)

The next instruction, COMPARE_OP 0, tells the interpreter to pop the two topmost stack elements and perform a less-than comparison between them, pushing the Boolean result back onto the stack. The fourth instruction determines, based on the Boolean value, whether to jump forward five bytes (to offset 17) or continue on with the next instruction. All that verbiage explains the if n < 2 part of the conditional expression in fib. It will be a highly instructive exercise for you to tease out the meaning and behaviour of the rest of the fib bytecode. The only one I'm not sure about is POP_TOP; I'm guessing JUMP_IF_FALSE is defined to leave its Boolean argument on the stack rather than popping it, so it has to be popped explicitly.
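The fetch-and-execute loop described here can be sketched as a toy stack machine. This is not CPython's implementation: the `run` function, the tuple-based instruction format, and the use of absolute list indexes as jump targets (the real JUMP_IF_FALSE takes a relative byte offset) are all assumptions made for illustration. To stay short, the else branch simply returns None instead of making the recursive calls.

```python
import operator

# Hypothetical program, loosely modelled on the Python 2 disassembly:
# roughly `return n if n < 2 else None`.
PROGRAM = [
    ("LOAD_FAST", 0),        # 0: push local variable 0 (n)
    ("LOAD_CONST", 1),       # 1: push constant number 1 (the value 2)
    ("COMPARE_OP", "<"),     # 2: pop two values, push the result of n < 2
    ("JUMP_IF_FALSE", 7),    # 3: if top of stack is false, go to index 7
    ("POP_TOP", None),       # 4: discard the Boolean left on the stack
    ("LOAD_FAST", 0),        # 5: push n
    ("RETURN_VALUE", None),  # 6: return the top of the stack
    ("POP_TOP", None),       # 7: discard the Boolean left on the stack
    ("LOAD_CONST", 0),       # 8: push constant number 0 (None)
    ("RETURN_VALUE", None),  # 9: return the top of the stack
]
CONSTS = (None, 2, 1)
COMPARISONS = {"<": operator.lt}


def run(program, consts, local_vars):
    """Execute the toy program and return its result."""
    stack = []
    pc = 0  # index of the next instruction to execute
    while True:
        opname, arg = program[pc]
        pc += 1
        if opname == "LOAD_FAST":
            stack.append(local_vars[arg])
        elif opname == "LOAD_CONST":
            stack.append(consts[arg])
        elif opname == "COMPARE_OP":
            right = stack.pop()
            left = stack.pop()
            stack.append(COMPARISONS[arg](left, right))
        elif opname == "JUMP_IF_FALSE":
            # The tested value stays on the stack, hence the POP_TOP
            # instructions on both branches.
            if not stack[-1]:
                pc = arg
        elif opname == "POP_TOP":
            stack.pop()
        elif opname == "RETURN_VALUE":
            return stack.pop()
        else:
            raise ValueError("unknown instruction: " + opname)


print(run(PROGRAM, CONSTS, [1]))
print(run(PROGRAM, CONSTS, [5]))
```

With n = 1 the comparison is true, so execution falls through and returns n; with n = 5 the jump is taken and the toy program returns None.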

Even more instructive is to inspect the raw bytecode for fib thus:

>>> code = fib.func_code.co_code
>>> code
'|\x00\x00d\x01\x00j\x00\x00o\x05\x00\x01|\x00\x00S\x01t\x00\x00|\x00\x00d\x01\x00\x18\x83\x01\x00t\x00\x00|\x00\x00d\x02\x00\x18\x83\x01\x00\x17S'
>>> import opcode
>>> op = code[0]
>>> op
'|'
>>> op = ord(op)
>>> op
124
>>> opcode.opname[op]
'LOAD_FAST'
>>> 

Thus you can see that the first byte of the bytecode is the opcode for the LOAD_FAST instruction. The next pair of bytes, '\x00\x00' (the number 0 in 16 bits), is the argument to LOAD_FAST, and tells the bytecode interpreter to load parameter 0 onto the stack.
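In Python 3 the code object lives at `fib.__code__` rather than `fib.func_code`, and `co_code` is a `bytes` object, so indexing it already yields an integer and `ord` is not needed. Here is a minimal sketch, assuming CPython 3.6 or later, where each instruction occupies two bytes (an opcode followed by a one-byte argument). The first opcode you see will likely differ from the Python 2 output above, so no expected output is shown.

```python
import opcode


def fib(n):
    return n if n < 2 else fib(n - 2) + fib(n - 1)


code = fib.__code__.co_code   # raw bytecode, a bytes object in Python 3
first_op = code[0]            # indexing bytes gives an int directly
first_arg = code[1]           # one-byte argument of the first instruction
first_name = opcode.opname[first_op]

print(first_op, first_name, first_arg)
print(fib.__code__.co_consts)  # contents vary between versions
```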
