Generate assembly from C code in memory using libclang

Question

I need to implement a library that compiles C code to eBPF bytecode using LLVM/Clang as the backend. The source code will be read from memory, and I need the resulting assembly code in memory as well.

So far, I have been able to compile to LLVM IR using the following code:

#include <string>
#include <vector>

#include <clang/Frontend/CompilerInstance.h>
#include <clang/Basic/DiagnosticOptions.h>
#include <clang/Frontend/TextDiagnosticPrinter.h>
#include <clang/CodeGen/CodeGenAction.h>
#include <clang/Basic/TargetInfo.h>
#include <llvm/Support/TargetSelect.h>

using namespace std;
using namespace clang;
using namespace llvm;

int main() {

    constexpr auto testCodeFileName = "test.cpp";
    constexpr auto testCode = "int test() { return 2+2; }";

    // Prepare compilation arguments
    vector<const char *> args;
    args.push_back(testCodeFileName);

    // Prepare DiagnosticEngine 
    DiagnosticOptions DiagOpts;
    TextDiagnosticPrinter *textDiagPrinter =
            new clang::TextDiagnosticPrinter(errs(),
                                         &DiagOpts);
    IntrusiveRefCntPtr<clang::DiagnosticIDs> pDiagIDs;
    DiagnosticsEngine *pDiagnosticsEngine =
            new DiagnosticsEngine(pDiagIDs,
                                         &DiagOpts,
                                         textDiagPrinter);

    // Initialize CompilerInvocation
    CompilerInvocation *CI = new CompilerInvocation();
    CompilerInvocation::CreateFromArgs(*CI, &args[0], &args[0] + args.size(), *pDiagnosticsEngine);

    // Map code filename to a memoryBuffer
    StringRef testCodeData(testCode);
    unique_ptr<MemoryBuffer> buffer = MemoryBuffer::getMemBufferCopy(testCodeData);
    CI->getPreprocessorOpts().addRemappedFile(testCodeFileName, buffer.get());


    // Create and initialize CompilerInstance
    CompilerInstance Clang;
    Clang.setInvocation(CI);
    Clang.createDiagnostics();

    // Set target (I guess I can initialize only the BPF target, but I don't know how)
    InitializeAllTargets();
    const std::shared_ptr<clang::TargetOptions> targetOptions = std::make_shared<clang::TargetOptions>();
    targetOptions->Triple = string("bpf");
    TargetInfo *pTargetInfo = TargetInfo::CreateTargetInfo(*pDiagnosticsEngine,targetOptions);
    Clang.setTarget(pTargetInfo);

    // Create and execute action
    // CodeGenAction *compilerAction = new EmitLLVMOnlyAction();
    CodeGenAction *compilerAction = new EmitAssemblyAction();
    Clang.ExecuteAction(*compilerAction);

    buffer.release();
}

To build it, I use the following CMakeLists.txt:

cmake_minimum_required(VERSION 3.3.2)
project(clang_backend CXX)

set(CMAKE_CXX_COMPILER "clang++")

execute_process(COMMAND llvm-config --cxxflags OUTPUT_VARIABLE LLVM_CONFIG OUTPUT_STRIP_TRAILING_WHITESPACE)
execute_process(COMMAND llvm-config --libs OUTPUT_VARIABLE LLVM_LIBS OUTPUT_STRIP_TRAILING_WHITESPACE)

set(CMAKE_CXX_FLAGS ${LLVM_CONFIG})

set(CLANG_LIBS clang clangFrontend clangDriver clangSerialization clangParse
    clangCodeGen  clangSema clangAnalysis clangEdit clangAST clangLex
    clangBasic )

add_executable(clang_backend main.cpp)
target_link_libraries(clang_backend ${CLANG_LIBS})
target_link_libraries(clang_backend ${LLVM_LIBS})

If I understand correctly, I should be able to generate assembly code by changing the compiler action to EmitAssemblyAction(), but I am probably failing to initialize something, because I get a segmentation fault in llvm::TargetPassConfig::addPassesToHandleExceptions (this=this@entry=0x6d8d30) at /tmp/llvm-3.7.1.src/lib/CodeGen/Passes.cpp:419

The code at this line is:

switch (TM->getMCAsmInfo()->getExceptionHandlingType()) {

Does anyone have an example, or know what I'm missing?

Recommended answer

If you compile LLVM with assertions enabled, the error is much clearer and actually tells you what you need to do:

x: .../src/llvm/lib/CodeGen/LLVMTargetMachine.cpp:63: 
void llvm::LLVMTargetMachine::initAsmInfo(): 
Assertion `TmpAsmInfo && "MCAsmInfo not initialized. " 
"Make sure you include the correct TargetSelect.h" 
"and that InitializeAllTargetMCs() is being invoked!"' failed.

(I added some line breaks, since it is printed as a single long line.)

After adding the required InitializeAllTargetMCs() at the beginning of main, I got a different error. Looking at the object file generation in my own compiler, I guessed that another InitializeAll* call was missing. After a little testing, it turns out that you also need InitializeAllAsmPrinters(), which makes sense given that you want to produce assembly code.
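
To illustrate why each InitializeAll* call matters, here is a small standalone model of a target registry. It is only a sketch and is not LLVM's code: ToyTarget, ToyRegistry, missingStep and emit are hypothetical names invented for this example. It assumes that each initializer fills in one factory slot per target, and that a missing slot is what eventually surfaces as the null MCAsmInfo in the crash above.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Toy stand-in for a per-target component table. This is NOT LLVM's real
// TargetRegistry; every name here is hypothetical and for illustration only.
struct ToyTarget {
    std::function<std::string()> createAsmInfo;    // filled by the "MC" initializer
    std::function<std::string()> createAsmPrinter; // filled by the "AsmPrinter" initializer
};

class ToyRegistry {
public:
    // Each initializer registers one kind of component for the named target.
    void initializeTarget(const std::string &name) { targets_[name]; }
    void initializeTargetMC(const std::string &name) {
        targets_[name].createAsmInfo = [] { return std::string("asm-info"); };
    }
    void initializeAsmPrinter(const std::string &name) {
        targets_[name].createAsmPrinter = [] { return std::string("asm-printer"); };
    }

    // Returns the name of the first initialization step that is still missing
    // for this target, or an empty string when everything is registered.
    std::string missingStep(const std::string &name) const {
        auto it = targets_.find(name);
        if (it == targets_.end()) return "InitializeAllTargets";
        if (!it->second.createAsmInfo) return "InitializeAllTargetMCs";
        if (!it->second.createAsmPrinter) return "InitializeAllAsmPrinters";
        return "";
    }

    // Mirrors the idea of an assert-enabled build: check up front that every
    // component is registered before using any of the factories.
    std::string emit(const std::string &name) const {
        const std::string missing = missingStep(name);
        assert(missing.empty() && "a target component was not initialized");
        const ToyTarget &t = targets_.at(name);
        return t.createAsmInfo() + "+" + t.createAsmPrinter();
    }

private:
    std::map<std::string, ToyTarget> targets_;
};
```

In this model, calling missingStep("bpf") after only initializeTarget("bpf") reports "InitializeAllTargetMCs", and after the MC step it reports "InitializeAllAsmPrinters", which is the same order in which the two problems showed up here.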

I'm not entirely sure how to "see" the results from your code, but with those two calls added at the beginning of main it runs to completion rather than asserting, exiting with an error, or crashing, which is typically a good step in the right direction.

So this is what main looks like in "my" code:

int main() {

    constexpr auto testCodeFileName = "test.cpp";
    constexpr auto testCode = "int test() { return 2+2; }";

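    // Register the MC-layer components (including MCAsmInfo) and the
    // assembly printers; without these, EmitAssemblyAction crashes.
    InitializeAllTargetMCs();
    InitializeAllAsmPrinters();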

    // Prepare compilation arguments
    vector<const char *> args;
    args.push_back(testCodeFileName);

    // Prepare DiagnosticEngine 
    DiagnosticOptions DiagOpts;
    TextDiagnosticPrinter *textDiagPrinter =
            new clang::TextDiagnosticPrinter(errs(),
                                         &DiagOpts);
    IntrusiveRefCntPtr<clang::DiagnosticIDs> pDiagIDs;
    DiagnosticsEngine *pDiagnosticsEngine =
            new DiagnosticsEngine(pDiagIDs,
                                         &DiagOpts,
                                         textDiagPrinter);

    // Initialize CompilerInvocation
    CompilerInvocation *CI = new CompilerInvocation();
    CompilerInvocation::CreateFromArgs(*CI, &args[0], &args[0] + args.size(), *pDiagnosticsEngine);

    // Map code filename to a memoryBuffer
    StringRef testCodeData(testCode);
    unique_ptr<MemoryBuffer> buffer = MemoryBuffer::getMemBufferCopy(testCodeData);
    CI->getPreprocessorOpts().addRemappedFile(testCodeFileName, buffer.get());


    // Create and initialize CompilerInstance
    CompilerInstance Clang;
    Clang.setInvocation(CI);
    Clang.createDiagnostics();

    // Set target (I guess I can initialize only the BPF target, but I don't know how)
    InitializeAllTargets();
    const std::shared_ptr<clang::TargetOptions> targetOptions = std::make_shared<clang::TargetOptions>();
    targetOptions->Triple = string("bpf");
    TargetInfo *pTargetInfo = TargetInfo::CreateTargetInfo(*pDiagnosticsEngine,targetOptions);
    Clang.setTarget(pTargetInfo);

    // Create and execute action
    // CodeGenAction *compilerAction = new EmitLLVMOnlyAction();
    CodeGenAction *compilerAction = new EmitAssemblyAction();
    Clang.ExecuteAction(*compilerAction);

    buffer.release();
}

I strongly suggest that if you want to develop with Clang and LLVM, you build a debug version of both. This helps in tracking down the "why", and it also catches problems earlier, at the point where they are more obvious. Pass -DCMAKE_BUILD_TYPE=Debug to cmake to get that flavour.

My complete script for configuring the LLVM and Clang build:

export CC=clang
export CXX=clang++ 
cmake -DCMAKE_BUILD_TYPE=Debug -DCMAKE_INSTALL_PREFIX=/usr/local/llvm-debug -DLLVM_TARGETS_TO_BUILD=X86 ../llvm

[I was using a late pre-release of 3.8 to test this, but I very much doubt that it differs much from 3.7.1 in this respect.]
