How could I generate and execute machine code at runtime?

Question

The closest I have gotten to assembly is building my own Java class library, which loads class files and allows you to create, compile, and decompile classes. While working on this project, I wondered how the Java Virtual Machine actually generates native machine code at runtime during JIT optimizations.

It got me thinking: how could one generate machine code and execute it at runtime with assembly, and as a bonus, without a JIT compiler library, or "manually"?

Recommended answer

Your question changed substantially (in July 2017). The initial variant referred to the EX (execute) instruction of IBM mainframes.


how could one generate machine code and execute it at runtime with assembly...?

In practice, you would use some JIT compilation library, and there are many of them. Or you would use some dynamic loader. At the lowest level, they all write a byte sequence representing valid machine code (a sequence of machine instructions) into a memory segment of your virtual address space, which has to be made executable (read about the NX bit). Some of your code then jumps indirectly to that address or, more often, calls it indirectly, that is, calls it through a function pointer. Most JVM implementations use JIT compilation techniques.


...and as a bonus, without a JIT compiler library, or "manually"?

Suppose you have some valid machine code for the processor architecture that your program is currently running on. You could then get a memory segment (e.g. with mmap(2) on Linux), copy the code into it, and make it executable (e.g. with mprotect(2)). Most other operating systems provide similar system calls.
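
As a rough illustration of the mmap(2) / mprotect(2) route, here is a minimal sketch. It assumes Linux on x86-64 or AArch64; the hand-encoded "return 42" instruction bytes, the function name run_generated_code, and the use of the GCC/Clang builtin __builtin___clear_cache are illustrative choices, not something the approach prescribes.

```c
#define _DEFAULT_SOURCE
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hand-encoded machine code for "return 42".
   Assumption: only x86-64 and AArch64 are covered in this sketch. */
#if defined(__x86_64__)
static const uint8_t code[] = {
    0xb8, 0x2a, 0x00, 0x00, 0x00, /* mov eax, 42 */
    0xc3                          /* ret */
};
#elif defined(__aarch64__)
static const uint32_t code[] = {
    0x52800540, /* mov w0, #42 */
    0xd65f03c0  /* ret */
};
#else
#error "this sketch has no machine code for the current architecture"
#endif

/* Hypothetical helper: copies the code into a fresh page, makes the page
   executable, calls it, and returns its result (or -1 on failure). */
int run_generated_code(void)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);

    /* 1. Get a writable, not yet executable, anonymous mapping. */
    void *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) {
        perror("mmap");
        return -1;
    }

    /* 2. Write the machine code into it. */
    memcpy(mem, code, sizeof code);

    /* 3. Switch the page from writable to executable. */
    if (mprotect(mem, len, PROT_READ | PROT_EXEC) != 0) {
        perror("mprotect");
        munmap(mem, len);
        return -1;
    }

    /* Needed where instruction and data caches are not coherent
       (e.g. AArch64); harmless on x86-64. */
    __builtin___clear_cache((char *)mem, (char *)mem + sizeof code);

    /* 4. Call the code indirectly, through a function pointer. */
    int (*fn)(void) = (int (*)(void))mem;
    int result = fn();

    munmap(mem, len);
    return result;
}
```

Calling run_generated_code() from a main function should give 42 on the two architectures handled above. Mapping the page writable first and executable afterwards, rather than both at once, keeps the sketch usable on systems that refuse pages that are writable and executable at the same time.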

If you use a JIT compilation library like asmjit, libjit, libgccjit, LLVM, or one of many others, you first construct in memory a representation (similar to an abstract syntax tree) of the code to be generated, then ask the JIT library to emit machine code for it. You could even write your own JIT compilation code, but that is a lot of work (you need to understand all the details of your instruction set, e.g. x86 for PCs). By the way, generating fast-running machine code is really difficult, because you need to optimize like compilers do (and to care about details like instruction scheduling, register allocation, etc.). That is why using an existing optimizing JIT compilation library (like libgccjit or LLVM) is preferable; by contrast, simpler JIT libraries like asmjit, libjit, or GNU lightning don't optimize much and generate poor machine code.

If you use a dynamic loader (e.g. dlopen(3) on POSIX), you would use some external compiler to produce a shared library (that is, a plugin), then ask the dynamic linker to load it into your process (and handle the appropriate relocations), and finally get some function addresses from it by name (using dlsym(3)).
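
The dlopen(3) / dlsym(3) sequence might look like the sketch below. The helper name call_from_plugin and its signature are hypothetical; it assumes the looked-up symbol has the C type double f(double). On older glibc versions the program has to be linked with -ldl.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: loads the shared library `lib`, looks up `symbol`
   (assumed to have type double (*)(double)), calls it with `arg`, and
   stores the result in *out. Returns 0 on success, -1 on failure. */
int call_from_plugin(const char *lib, const char *symbol,
                     double arg, double *out)
{
    /* Load the library and resolve all of its relocations immediately. */
    void *handle = dlopen(lib, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return -1;
    }

    dlerror(); /* clear any stale error before calling dlsym */

    /* Get the function's address by name. The cast through void ** is the
       idiom shown in the dlopen(3) manual page. */
    double (*fn)(double);
    *(void **)(&fn) = dlsym(handle, symbol);

    const char *err = dlerror();
    if (err != NULL) {
        fprintf(stderr, "dlsym: %s\n", err);
        dlclose(handle);
        return -1;
    }

    *out = fn(arg); /* indirect call into the loaded code */
    dlclose(handle);
    return 0;
}
```

For instance, on a glibc-based Linux system (an assumption about the library file name), call_from_plugin("libm.so.6", "cos", 0.0, &r) would look up the math library's cos and store its result in r.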

Some language implementations (notably SBCL for Common Lisp) are able to emit good machine code on the fly at every REPL interaction. In essence, their runtime embeds a full compiler (containing a JIT compilation part).

A trick I often use on Linux is to emit some C (or C++) code at runtime into a temporary file (that is, to compile some domain-specific language to C or C++), fork a compilation of it as a plugin, and dynamically load that plugin. On current computers (laptops, desktops, servers) this is fast enough to remain usable in an interactive loop.
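
A minimal sketch of that trick follows. Everything specific in it is an assumption made for illustration: the helper name compile_and_load, the /tmp location, the cc command and its flags, and the fact that temporary files are not cleaned up. It also assumes /tmp permits loading shared objects, and on older glibc versions needs linking with -ldl.

```c
#define _DEFAULT_SOURCE
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: writes `source` to a temporary C file, compiles it
   into a shared object with `cc`, loads it, and returns the address of
   `symbol` (NULL on failure). The library handle is deliberately never
   closed, so the returned pointer stays valid. */
void *compile_and_load(const char *source, const char *symbol)
{
    char dir[] = "/tmp/genXXXXXX";
    if (mkdtemp(dir) == NULL) {
        perror("mkdtemp");
        return NULL;
    }

    char src[64], lib[64];
    snprintf(src, sizeof src, "%s/gen.c", dir);
    snprintf(lib, sizeof lib, "%s/gen.so", dir);

    /* 1. Emit the generated C code into the temporary file. */
    FILE *f = fopen(src, "w");
    if (f == NULL) {
        perror("fopen");
        return NULL;
    }
    fputs(source, f);
    if (fclose(f) != 0) {
        perror("fclose");
        return NULL;
    }

    /* 2. Fork a compilation of it as a plugin (shared object). */
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return NULL;
    }
    if (pid == 0) {
        execlp("cc", "cc", "-O2", "-shared", "-fPIC",
               "-o", lib, src, (char *)NULL);
        _exit(127); /* reached only if exec failed */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        fprintf(stderr, "compilation of %s failed\n", src);
        return NULL;
    }

    /* 3. Dynamically load the plugin and look the function up by name. */
    void *handle = dlopen(lib, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return NULL;
    }
    return dlsym(handle, symbol);
}
```

A caller would pass generated source such as "int triple(int x) { return 3 * x; }" together with the name "triple", convert the returned address to an int (*)(int), and call it like any other function pointer.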

Read also about eval (in particular the famous SICP book), metaprogramming, multistage programming, self-modifying code, continuations, compilers (the Dragon Book), Scott's Programming Language Pragmatics, and J.Pitrat's blog.
