What is the difference between @code_native, @code_typed and @code_llvm in Julia?


Question


While going through Julia, I wanted to have functionality similar to Python's dis module. Looking around the net, I found that the Julia community has worked on this issue and come up with these (https://github.com/JuliaLang/julia/issues/218):

finfer -> code_typed
methods(function, types) -> code_lowered
disassemble(function, types, true) -> code_native
disassemble(function, types, false) -> code_llvm


I have tried these personally in the Julia REPL, but I find them quite hard to understand.


In Python, I can disassemble a function like this.

>>> import dis
>>> dis.dis(lambda x: 2*x)
  1           0 LOAD_CONST               1 (2)
              3 LOAD_FAST                0 (x)
              6 BINARY_MULTIPLY     
              7 RETURN_VALUE        
>>>


Can anyone who has worked with these help me understand them more? Thanks.

Answer


The standard CPython implementation of Python parses source code and does some pre-processing and simplification of it – aka "lowering" – transforming it to a machine-friendly, easy-to-interpret format called "bytecode". This is what is displayed when you "disassemble" a Python function. This code is not executable by the hardware – it is "executable" by the CPython interpreter. CPython's bytecode format is fairly simple, partly because that's what interpreters tend to do well with – if the bytecode is too complex, it slows down the interpreter – and partly because the Python community tends to put a high premium on simplicity, sometimes at the cost of high performance.


Julia's implementation is not interpreted, it is just-in-time (JIT) compiled. This means that when you call a function, it is transformed to machine code which is executed directly by the native hardware. This process is quite a bit more complex than the parsing and lowering to bytecode that Python does, but in exchange for that complexity, Julia gets its hallmark speed. (The PyPy JIT for Python is also much more complex than CPython but also typically much faster – increased complexity is a fairly typical cost for speed.) The four levels of "disassembly" for Julia code give you access to the representation of a Julia method implementation for particular argument types at different stages of the transformation from source code to machine code. I'll use the following function which computes the next Fibonacci number after its argument as an example:

function nextfib(n)
    a, b = one(n), one(n)
    while b < n
        a, b = b, a + b
    end
    return b
end

julia> nextfib(5)
5

julia> nextfib(6)
8

julia> nextfib(123)
144


Lowered code. The @code_lowered macro displays code in a format that is the closest to Python byte code, but rather than being intended for execution by an interpreter, it's intended for further transformation by a compiler. This format is largely internal and not intended for human consumption. The code is transformed into "single static assignment" form in which "each variable is assigned exactly once, and every variable is defined before it is used". Loops and conditionals are transformed into gotos and labels using a single unless/goto construct (this is not exposed in user-level Julia). Here's our example code in lowered form (in Julia 0.6.0-pre.beta.134, which is just what I happen to have available):

julia> @code_lowered nextfib(123)
CodeInfo(:(begin
        nothing
        SSAValue(0) = (Main.one)(n)
        SSAValue(1) = (Main.one)(n)
        a = SSAValue(0)
        b = SSAValue(1) # line 3:
        7:
        unless b < n goto 16 # line 4:
        SSAValue(2) = b
        SSAValue(3) = a + b
        a = SSAValue(2)
        b = SSAValue(3)
        14:
        goto 7
        16:  # line 6:
        return b
    end))


You can see the SSAValue nodes and unless/goto constructs and label numbers. This is not that hard to read, but again, it's also not really meant to be easy for human consumption. Lowered code doesn't depend on the types of the arguments, except in as far as they determine which method body to call – as long as the same method is called, the same lowered code applies.
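The CPython analogy holds here as well: Python bytecode, like Julia's lowered code, carries no type information, so one bytecode sequence serves every argument type. A small sketch using the dis module from the question (exact opcode names vary between CPython versions, so only their general shape is checked):

```python
import dis

def double(x):
    return 2 * x

# Collect the opcode names for double; there is exactly one compiled
# bytecode sequence no matter what type x turns out to be at runtime.
ops = [ins.opname for ins in dis.Bytecode(double)]

# The multiply shows up as BINARY_MULTIPLY (older CPython) or
# BINARY_OP (3.11+); either way it is type-agnostic:
assert any(op.startswith("BINARY") for op in ops)
assert double(3) == 6 and double(3.0) == 6.0 and double("ab") == "abab"
```

The same principle drives Julia's method caching: lowering happens once per method body, while the type-specialized stages below happen once per combination of argument types.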


Typed code. The @code_typed macro presents a method implementation for a particular set of argument types after type inference and inlining. This incarnation of the code is similar to the lowered form, but with expressions annotated with type information and some generic function calls replaced with their implementations. For example, here is the typed code for our example function:

julia> @code_typed nextfib(123)
CodeInfo(:(begin
        a = 1
        b = 1 # line 3:
        4:
        unless (Base.slt_int)(b, n)::Bool goto 13 # line 4:
        SSAValue(2) = b
        SSAValue(3) = (Base.add_int)(a, b)::Int64
        a = SSAValue(2)
        b = SSAValue(3)
        11:
        goto 4
        13:  # line 6:
        return b
    end))=>Int64


Calls to one(n) have been replaced with the literal Int64 value 1 (on my system the default integer type is Int64). The expression b < n has been replaced with its implementation in terms of the slt_int intrinsic ("signed integer less than") and the result of this has been annotated with return type Bool. The expression a + b has also been replaced with its implementation in terms of the add_int intrinsic and its result type annotated as Int64. And the return type of the entire function body has been annotated as Int64.
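A rough Python model of what these two intrinsics compute (the function names here are just illustrative; in Julia they are compiler built-ins, not library calls): slt_int is an ordinary signed comparison, and add_int is fixed-width two's-complement addition, so it wraps silently on overflow exactly like Julia's Int64 arithmetic:

```python
BITS = 64
MASK = (1 << BITS) - 1

def add_int(a, b):
    """Model of add_int: 64-bit two's-complement addition with wraparound."""
    r = (a + b) & MASK
    # Reinterpret the top bit as the sign bit
    return r - (1 << BITS) if r >= (1 << (BITS - 1)) else r

def slt_int(a, b):
    """Model of slt_int: signed integer less-than."""
    return a < b

# Normal case behaves like +; the overflow case wraps, matching
# Julia's typemax(Int64) + 1 == typemin(Int64):
assert add_int(1, 2) == 3
assert add_int(2**63 - 1, 1) == -2**63
assert slt_int(-1, 0)
```

This wraparound is why the typed code can compile down to a single machine add instruction with no overflow check.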


Unlike lowered code, which depends only on argument types to determine which method body is called, the details of typed code depend on argument types:

julia> @code_typed nextfib(Int128(123))
CodeInfo(:(begin
        SSAValue(0) = (Base.sext_int)(Int128, 1)::Int128
        SSAValue(1) = (Base.sext_int)(Int128, 1)::Int128
        a = SSAValue(0)
        b = SSAValue(1) # line 3:
        6:
        unless (Base.slt_int)(b, n)::Bool goto 15 # line 4:
        SSAValue(2) = b
        SSAValue(3) = (Base.add_int)(a, b)::Int128
        a = SSAValue(2)
        b = SSAValue(3)
        13:
        goto 6
        15:  # line 6:
        return b
    end))=>Int128


This is the typed version of the nextfib function for an Int128 argument. The literal 1 must be sign-extended to Int128 and the result types of operations are of type Int128 instead of Int64. The typed code can be quite different if the implementation of a type is considerably different. For example, nextfib for BigInt arguments is significantly more involved than for simple "bits types" like Int64 and Int128:

julia> @code_typed nextfib(big(123))
CodeInfo(:(begin
        $(Expr(:inbounds, false))
        # meta: location number.jl one 164
        # meta: location number.jl one 163
        # meta: location gmp.jl convert 111
        z@_5 = $(Expr(:invoke, MethodInstance for BigInt(), :(Base.GMP.BigInt))) # line 112:
        $(Expr(:foreigncall, (:__gmpz_set_si, :libgmp), Void, svec(Ptr{BigInt}, Int64), :(&z@_5), :(z@_5), 1, 0))
        # meta: pop location
        # meta: pop location
        # meta: pop location
        $(Expr(:inbounds, :pop))
        $(Expr(:inbounds, false))
        # meta: location number.jl one 164
        # meta: location number.jl one 163
        # meta: location gmp.jl convert 111
        z@_6 = $(Expr(:invoke, MethodInstance for BigInt(), :(Base.GMP.BigInt))) # line 112:
        $(Expr(:foreigncall, (:__gmpz_set_si, :libgmp), Void, svec(Ptr{BigInt}, Int64), :(&z@_6), :(z@_6), 1, 0))
        # meta: pop location
        # meta: pop location
        # meta: pop location
        $(Expr(:inbounds, :pop))
        a = z@_5
        b = z@_6 # line 3:
        26:
        $(Expr(:inbounds, false))
        # meta: location gmp.jl < 516
        SSAValue(10) = $(Expr(:foreigncall, (:__gmpz_cmp, :libgmp), Int32, svec(Ptr{BigInt}, Ptr{BigInt}), :(&b), :(b), :(&n), :(n)))
        # meta: pop location
        $(Expr(:inbounds, :pop))
        unless (Base.slt_int)((Base.sext_int)(Int64, SSAValue(10))::Int64, 0)::Bool goto 46 # line 4:
        SSAValue(2) = b
        $(Expr(:inbounds, false))
        # meta: location gmp.jl + 258
        z@_7 = $(Expr(:invoke, MethodInstance for BigInt(), :(Base.GMP.BigInt))) # line 259:
        $(Expr(:foreigncall, ("__gmpz_add", :libgmp), Void, svec(Ptr{BigInt}, Ptr{BigInt}, Ptr{BigInt}), :(&z@_7), :(z@_7), :(&a), :(a), :(&b), :(b)))
        # meta: pop location
        $(Expr(:inbounds, :pop))
        a = SSAValue(2)
        b = z@_7
        44:
        goto 26
        46:  # line 6:
        return b
    end))=>BigInt


This reflects the fact that operations on BigInts are pretty complicated and involve memory allocation and calls to the external GMP library (libgmp).
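As a loose analogy (not from the original answer): Python's own int is a heap-allocated, arbitrary-precision type, much like Julia's BigInt wrapping libgmp, so every arithmetic operation allocates a new object and the object grows with the magnitude of the value:

```python
import sys

small = 1
big = 10**40  # far too large for a fixed-width Int64

# An Int64 holds at most 63 bits of magnitude; this value needs more,
# so a bignum representation (extra heap storage) is required.
assert big.bit_length() > 64
assert sys.getsizeof(big) > sys.getsizeof(small)
```

The cost of this flexibility, in both languages, is allocation and indirection on every operation, which is why the BigInt typed code above is dominated by calls rather than inline arithmetic.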


LLVM IR. Julia uses the LLVM compiler framework to generate machine code. LLVM defines an assembly-like language which it uses as a shared intermediate representation (IR) between different compiler optimization passes and other tools in the framework. There are three isomorphic forms of LLVM IR:

  1. A compact, machine-readable binary representation.
  2. A verbose, human-readable textual representation.
  3. An in-memory representation that is generated and consumed by the LLVM libraries.


Julia uses LLVM's C++ API to construct LLVM IR in memory (form 3) and then call some LLVM optimization passes on that form. When you do @code_llvm you see the LLVM IR after generation and some high-level optimizations. Here's LLVM code for our ongoing example:

julia> @code_llvm nextfib(123)

define i64 @julia_nextfib_60009(i64) #0 !dbg !5 {
top:
  br label %L4

L4:                                               ; preds = %L4, %top
  %storemerge1 = phi i64 [ 1, %top ], [ %storemerge, %L4 ]
  %storemerge = phi i64 [ 1, %top ], [ %2, %L4 ]
  %1 = icmp slt i64 %storemerge, %0
  %2 = add i64 %storemerge, %storemerge1
  br i1 %1, label %L4, label %L13

L13:                                              ; preds = %L4
  ret i64 %storemerge
}


This is the textual form of the in-memory LLVM IR for the nextfib(123) method implementation. LLVM IR is not easy to read – it's not intended to be written or read by people most of the time – but it is thoroughly specified and documented. Once you get the hang of it, it's not hard to understand. This code jumps to the label L4 and initializes the "registers" %storemerge1 and %storemerge with the i64 (LLVM's name for Int64) value 1 (their values are derived differently when jumped to from different locations – that's what the phi instruction does). It then does an icmp slt comparing %storemerge with register %0 – which holds the argument untouched for the entire method execution – and saves the comparison result into the register %1. It does an add i64 on %storemerge and %storemerge1 and saves the result into register %2. If %1 is true, it branches back to L4 and otherwise it branches to L13. When the code loops back to L4, the register %storemerge1 gets the previous value of %storemerge and %storemerge gets the previous value of %2.
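To check that this IR really computes the next Fibonacci number, here is a Python sketch that mirrors the register dataflow of the listing, with the phi nodes modeled as the reassignments at the bottom of the loop (variable names copied from the IR; Python ints don't wrap at 64 bits, so this only models the dataflow, not overflow):

```python
def nextfib_ir(n):
    # %top: both phi nodes take the value 1 on the first entry to %L4
    storemerge1, storemerge = 1, 1
    while True:
        cond = storemerge < n           # %1 = icmp slt i64 %storemerge, %0
        nxt = storemerge + storemerge1  # %2 = add i64 %storemerge, %storemerge1
        if not cond:
            return storemerge           # %L13: ret i64 %storemerge
        # back edge to %L4: the phi nodes select their [ ..., %L4 ] operands
        storemerge1, storemerge = storemerge, nxt

assert nextfib_ir(123) == 144
```

Tracing a few iterations by hand shows %storemerge stepping through 1, 2, 3, 5, 8, ... until it reaches or exceeds the argument, matching the REPL results at the top of the answer.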


Native code. Since Julia executes native code, the last form a method implementation takes is what the machine actually executes. This is just binary code in memory, which is rather hard to read, so long ago people invented various forms of "assembly language" which represent instructions and registers with names and have some amount of simple syntax to help express what instructions do. In general, assembly language remains in close one-to-one correspondence with machine code; in particular, one can always "disassemble" machine code into assembly code. Here's our example:

julia> @code_native nextfib(123)
    .section    __TEXT,__text,regular,pure_instructions
Filename: REPL[1]
    pushq   %rbp
    movq    %rsp, %rbp
    movl    $1, %ecx
    movl    $1, %edx
    nop
L16:
    movq    %rdx, %rax
Source line: 4
    movq    %rcx, %rdx
    addq    %rax, %rdx
    movq    %rax, %rcx
Source line: 3
    cmpq    %rdi, %rax
    jl  L16
Source line: 6
    popq    %rbp
    retq
    nopw    %cs:(%rax,%rax)


This is on an Intel Core i7, which is in the x86_64 CPU family. It only uses standard integer instructions, so it doesn't matter beyond that what the architecture is, but you can get different results for some code depending on the specific architecture of your machine, since JIT code can be different on different systems. The pushq and movq instructions at the beginning are a standard function preamble, saving registers to the stack; similarly, popq restores the registers and retq returns from the function; nopw is a 2-byte instruction that does nothing, included just to pad the length of the function. So the meat of the code is just this:

    movl    $1, %ecx
    movl    $1, %edx
    nop
L16:
    movq    %rdx, %rax
Source line: 4
    movq    %rcx, %rdx
    addq    %rax, %rdx
    movq    %rax, %rcx
Source line: 3
    cmpq    %rdi, %rax
    jl  L16


The movl instructions at the top initialize registers with 1 values. The movq instructions move values between registers and the addq instruction adds registers. The cmpq instruction compares two registers and jl either jumps back to L16 or continues to return from the function. This handful of integer machine instructions in a tight loop is exactly what executes when your Julia function call runs, presented in slightly more pleasant human-readable form. It's easy to see why it runs fast.
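As with the LLVM IR, the instruction-level behavior can be checked with a small Python sketch of the register dataflow (variables stand in for the machine registers; note the compiler has rotated the loop so the add happens before the compare, and the comparison is against %rdi, which holds the argument):

```python
def nextfib_asm(n):
    rcx, rdx = 1, 1        # movl $1, %ecx ; movl $1, %edx
    while True:            # L16:
        rax = rdx          #   movq %rdx, %rax
        rdx = rcx          #   movq %rcx, %rdx
        rdx += rax         #   addq %rax, %rdx
        rcx = rax          #   movq %rax, %rcx
        if not (rax < n):  #   cmpq %rdi, %rax ; jl L16
            return rax     # retq (return value left in %rax)

assert nextfib_asm(123) == 144
```

Tracing this against the assembly above, %rax steps through the Fibonacci numbers while %rcx and %rdx carry the previous value and the next sum, reproducing the same results as the IR-level sketch.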


If you're interested in JIT compilation in general as compared to interpreted implementations, Eli Bendersky has a great pair of blog posts where he goes from a simple interpreter implementation of a language to a (simple) optimizing JIT for the same language:

  1. http://eli.thegreenplace.net/2017/adventures-in-jit-compilation-part-1-an-interpreter/
  2. http://eli.thegreenplace.net/2017/adventures-in-jit-compilation-part-2-an-x64-jit.html

