How to do computations with addresses at compile/linking time?

Question

I wrote some code for initializing the IDT, which stores 32-bit addresses in two non-adjacent 16-bit halves. The IDT can be stored anywhere, and you tell the CPU where by running the LIDT instruction.

This is the code for initializing the table:

void idt_init(void) {
    /* Unfortunately, we can't write this as loops. The first option,
     * initializing the IDT with the addresses, here looping over it, and
     * reinitializing the descriptors didn't work because assigning a
     * uintptr_t (from (uintptr_t) handler_func) to a descr (a.k.a.
     * uint64_t), according to the compiler, "isn't computable at load
     * time."
     * The second option, storing the addresses as a local array, simply is
     * inefficient (took 0.020ms more when profiling with the "time" command
     * line program!).
     * The third option, storing the addresses as a static local array,
     * consumes too much space (the array will probably never be used again
     * during the whole kernel runtime).
     * But IF my argument against the third option will be invalidated in
     * the future, THEN it's the best option I think. */

    /* Initialize descriptors of exception handlers. */
    idt[EX_DE_VEC] = idt_trap(ex_de);
    idt[EX_DB_VEC] = idt_trap(ex_db);
    idt[EX_NMI_VEC] = idt_trap(ex_nmi);
    idt[EX_BP_VEC] = idt_trap(ex_bp);
    idt[EX_OF_VEC] = idt_trap(ex_of);
    idt[EX_BR_VEC] = idt_trap(ex_br);
    idt[EX_UD_VEC] = idt_trap(ex_ud);
    idt[EX_NM_VEC] = idt_trap(ex_nm);
    idt[EX_DF_VEC] = idt_trap(ex_df);
    idt[9] = idt_trap(ex_res);  /* unused Coprocessor Segment Overrun */
    idt[EX_TS_VEC] = idt_trap(ex_ts);
    idt[EX_NP_VEC] = idt_trap(ex_np);
    idt[EX_SS_VEC] = idt_trap(ex_ss);
    idt[EX_GP_VEC] = idt_trap(ex_gp);
    idt[EX_PF_VEC] = idt_trap(ex_pf);
    idt[15] = idt_trap(ex_res);
    idt[EX_MF_VEC] = idt_trap(ex_mf);
    idt[EX_AC_VEC] = idt_trap(ex_ac);
    idt[EX_MC_VEC] = idt_trap(ex_mc);
    idt[EX_XM_VEC] = idt_trap(ex_xm);
    idt[EX_VE_VEC] = idt_trap(ex_ve);

    /* Initialize descriptors of reserved exceptions.
     * Thankfully we compile with -std=c11, so declarations within
     * for-loops are possible! */
    for (size_t i = 21; i < 32; ++i)
        idt[i] = idt_trap(ex_res);

    /* Initialize descriptors of hardware interrupt handlers (ISRs). */
    idt[INT_8253_VEC] = idt_int(int_8253);
    idt[INT_8042_VEC] = idt_int(int_8042);
    idt[INT_CASC_VEC] = idt_int(int_casc);
    idt[INT_SERIAL2_VEC] = idt_int(int_serial2);
    idt[INT_SERIAL1_VEC] = idt_int(int_serial1);
    idt[INT_PARALL2_VEC] = idt_int(int_parall2);
    idt[INT_FLOPPY_VEC] = idt_int(int_floppy);
    idt[INT_PARALL1_VEC] = idt_int(int_parall1);
    idt[INT_RTC_VEC] = idt_int(int_rtc);
    idt[INT_ACPI_VEC] = idt_int(int_acpi);
    idt[INT_OPEN2_VEC] = idt_int(int_open2);
    idt[INT_OPEN1_VEC] = idt_int(int_open1);
    idt[INT_MOUSE_VEC] = idt_int(int_mouse);
    idt[INT_FPU_VEC] = idt_int(int_fpu);
    idt[INT_PRIM_ATA_VEC] = idt_int(int_prim_ata);
    idt[INT_SEC_ATA_VEC] = idt_int(int_sec_ata);

    for (size_t i = 0x30; i < IDT_SIZE; ++i)
        idt[i] = idt_trap(ex_res);
}

idt_trap and idt_int are macros, and are defined as follows:

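/* Offset bits 16..31 must land in descriptor bits 48..63, so the masked
 * high half is shifted by a further 0x20 (a shift of 0x30 would overflow). */
#define idt_entry(off, type, priv) \
    (((descr) (uintptr_t) (off) & 0xffff) | ((descr) (KERN_CODE & 0xff) << \
    0x10) | ((descr) ((type) & 0x0f) << 0x28) | ((descr) ((priv) & \
    0x03) << 0x2d) | (descr) 0x800000000000 | \
    ((descr) ((uintptr_t) (off) & 0xffff0000) << 0x20))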

#define idt_int(off) idt_entry(off, 0x0e, 0x00)
#define idt_trap(off) idt_entry(off, 0x0f, 0x00)

idt is an array of uint64_t, so the values these macros produce are implicitly converted to that type. uintptr_t is the type guaranteed to be capable of holding pointer values as integers, and it is usually 32 bits wide on 32-bit systems. (A 64-bit IDT has 16-byte entries; this code is for 32-bit.)
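As a reference for the bit layout the macros are aiming at, here is a minimal sketch of the same packing written as a function that runs at run time. The name `make_gate` and its parameters are hypothetical and not part of the original code; it assumes a 32-bit protected-mode gate with the present bit always set and takes the full 16-bit selector as an argument instead of using `KERN_CODE`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the original post): packs one 32-bit
 * protected-mode gate descriptor into a 64-bit value at run time. */
uint64_t make_gate(uint32_t offset, uint16_t selector,
                   unsigned type, unsigned dpl)
{
    assert(type <= 0x0f && dpl <= 3);

    return (uint64_t)(offset & 0xffffu)      /* offset 15..0  -> bits 0..15  */
         | (uint64_t)selector << 16          /* selector      -> bits 16..31 */
         | (uint64_t)(type & 0x0fu) << 40    /* gate type     -> bits 40..43 */
         | (uint64_t)(dpl & 0x03u) << 45     /* DPL           -> bits 45..46 */
         | (uint64_t)1 << 47                 /* present       -> bit 47      */
         | (uint64_t)(offset >> 16) << 48;   /* offset 31..16 -> bits 48..63 */
}
```

For instance, `make_gate(0x12345678, 0x08, 0x0f, 0)` yields `0x12348F0000085678`: the two halves of the address end up at opposite ends of the descriptor, which is exactly what a static initializer cannot express.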

I get the warning that the initializer element is not constant due to the address modification in play.
It is absolutely sure that the address is known at linking time.
Is there anything I can do to make this work? Making the idt array automatic would work but this would require the whole kernel to run in the context of one function and this would be some bad hassle, I think.

I could make this work by some additional work at runtime (as Linux 0.01 also does) but it just annoys me that something technically feasible at linking time is actually infeasible.

Recommended answer

The main problem is that function addresses are link-time constants, not strictly compile time constants. The compiler can't just get 32b binary integers and stick that into the data segment in two separate pieces. Instead, it has to use the object file format to indicate to the linker where it should fill in the final value (+ offset) of which symbol when linking is done. The common cases are as an immediate operand to an instruction, a displacement in an effective address, or a value in the data section.

It would have been possible for ELF to have been designed to store a symbol reference to be substituted at link time with a complex function of an address, but in fact the only function allowed is addition/subtraction, for use in things like mov eax, [ext_symbol + 16].
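To make the add/subtract restriction concrete, here is a small sketch. All names in it (`table`, `table_plus_16`, `table_low16`, `check_addend`) are made up for illustration. It shows a static initializer of the form "symbol + constant", which a relocation with an addend can represent, next to a masked address, which GCC and Clang typically reject as a static initializer and which therefore has to be computed at run time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

char table[64];

/* Accepted: "symbol + constant" is an address constant, and the object
 * file can describe it as a relocation against `table` with addend 16. */
char *const table_plus_16 = table + 16;

/* Typically rejected as a static initializer ("initializer element is not
 * constant"), so it is kept as a comment to let this file build:
 *
 *     uint16_t low16 = (uintptr_t)table & 0xffff;
 */

/* The same masking is unproblematic once the program is running. */
uint16_t table_low16(void)
{
    return (uint16_t)((uintptr_t)table & 0xffff);
}

/* Sanity check: the linker applied the addend from the initializer. */
void check_addend(void)
{
    assert(table_plus_16 - table == 16);
}
```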

It is of course possible for your OS kernel binary to have a static IDT with fully resolved addresses at build time, so all you need to do at runtime is execute a single lidt instruction. However, the standard build toolchain is an obstacle. You probably can't achieve this without post-processing your executable.

For example, you could write it this way, to produce a table that already occupies its full final size in the binary, so the data can be shuffled in place:

#include <stdint.h>

#define PACKED __attribute__((packed))

typedef union idt_entry {

    // we will postprocess the linker output to have this format
    // (or convert at runtime)
    struct PACKED runtime {   // from OSdev wiki
       uint16_t offset_1; // offset bits 0..15
       uint16_t selector; // a code segment selector in GDT or LDT
       uint8_t zero;      // unused, set to 0
       uint8_t type_attr; // type and attributes, see below
       uint16_t offset_2; // offset bits 16..31
    } rt;

    // linker output will be in this format
    struct PACKED compiletime {
       void *ptr; // offset bits 0..31
       uint8_t zero;
       uint8_t type_attr;
       uint16_t selector; // to be swapped with the high16 of ptr
    } ct;
} idt_entry;

// #define idt_ct_entry(off, type, priv) { .ptr = off, .type_attr = type, .selector = priv }
#define idt_ct_trap(off) { .ct = { .ptr = off, .type_attr = 0x0f, .selector = 0x00 } }
// generate an entry in compile-time format

extern void ex_de();  // these are the raw interrupt handlers, written in ASM
extern void ex_db();  // they have to save/restore *all* registers, and end with  iret, rather than the usual C ABI.

// it might be easier to use asm macros to create this static data, 
// just so it can be in the same file and you don't need cross-file prototypes / declarations
// (but all the same limitations about link-time constants apply)
static union idt_entry idt[] = {
    idt_ct_trap(ex_de),
    idt_ct_trap(ex_db),
    // ...
};

// having this static probably takes less space than instructions to write it on the fly
// but not much more.  It would be easy to make a lidt function that took a struct pointer.
static const struct PACKED  idt_ptr {
  uint16_t len;  // the limit field: table size in bytes, minus 1
  void *ptr;
} idt_ptr = { sizeof idt - 1, idt };


/****** functions *********/

// inline
void load_static_idt(void) {
  asm volatile ("lidt  %0"
               : // no outputs
               : "m" (idt_ptr));
  // memory operand, instead of writing the addressing mode ourself, allows a RIP-relative addressing mode in 64bit mode
  // also allows it to work with -masm=intel or not.
}

// Do this once at run-time
// **OR** run this to pre-process the binary, after link time, as part of your build
void idt_convert_to_runtime(void) {
#ifdef DEBUG
  static char already_done = 0;  // make sure this only runs once
  if (already_done)
    return;  // a second pass would swap the fields back
  already_done = 1;
#endif
  const int count = sizeof idt / sizeof idt[0];
  for (int i=0 ; i<count ; i++) {
    uint16_t tmp1 = idt[i].rt.selector;
    uint16_t tmp2 = idt[i].rt.offset_2;
    idt[i].rt.offset_2 = tmp1;
    idt[i].rt.selector = tmp2;
    // or do this swap in fewer insns with SSE or MMX pshufw, but using vector instructions before setting up the IDT may be insane.
  }
}

This does compile. Look at the layout in the data section (note that .value is a synonym for .short, and is 16b.)

The other option, instead of having the IDT static in the data section, is to have it in the bss section, with the data stored as immediate constants in a function that will initialize it (or in an array read by that function).

Either way, that function (and its data) can be in a .init section whose memory you re-use after it's done. (Linux does this to reclaim memory from code and data that's only needed once, at startup.) This would give you the optimal tradeoff of small binary size (since 32b addresses are smaller than 64b IDT entries), and no runtime memory wasted on code to set up the IDT. A small loop that runs once at startup is negligible CPU time.
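A rough sketch of the "array read by that function" variant follows. Everything named here is an assumption for illustration: the stub handlers stand in for the real asm entry points, `KERN_CODE_SEL` is a placeholder selector value, and the type/attribute byte `0x8f` assumes a present, DPL-0 trap gate. The handler array holds plain function pointers, which are ordinary link-time constants, while the table itself is zero-initialized and so lands in the bss section.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*handler_fn)(void);

/* Stand-ins for the raw asm interrupt entry points (hypothetical). */
void ex_de_stub(void) {}
void ex_db_stub(void) {}

#define KERN_CODE_SEL 0x08u  /* assumed kernel code selector */

/* Compact source data: one pointer per vector, fixed up by the linker. */
const handler_fn idt_handlers[] = { ex_de_stub, ex_db_stub };
enum { IDT_COUNT = sizeof idt_handlers / sizeof idt_handlers[0] };
static_assert(IDT_COUNT <= 256, "x86 has at most 256 interrupt vectors");

/* No initializer, so this occupies no space in the binary image. */
uint64_t idt_table[IDT_COUNT];

/* Runs once at startup and splits each address into its two halves. */
void idt_build(void)
{
    for (size_t i = 0; i < IDT_COUNT; i++) {
        /* On a 32-bit kernel this cast is lossless; on a 64-bit host
         * it just truncates, which is enough to exercise the packing. */
        uint32_t off = (uint32_t)(uintptr_t)idt_handlers[i];

        idt_table[i] = (uint64_t)(off & 0xffffu)        /* bits 0..15  */
                     | (uint64_t)KERN_CODE_SEL << 16     /* bits 16..31 */
                     | (uint64_t)0x8f << 40              /* bits 40..47 */
                     | (uint64_t)(off >> 16) << 48;      /* bits 48..63 */
    }
}
```

In a kernel build, `idt_build` and `idt_handlers` could be placed in an init section via the compiler's section attribute so their memory can be reclaimed afterwards; the section name depends on the linker script, so it is left out of the sketch.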

Without memory-reuse haxx, probably a tight loop to rewrite 64b entries in place is the way to go. Doing it at build time would be even better, but then you'd need a custom tool to run the transformation on the kernel binary.
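The core of such a build-time tool could look roughly like the sketch below. It works on raw bytes instead of the union, since a post-link tool would see the table as file contents. The function name `idt_swap_halves` is hypothetical, and locating the table inside the kernel image (for example through a symbol lookup) is assumed to happen elsewhere. Each 8-byte entry is expected in the link-time layout from the answer: a 32-bit pointer, the zero byte, the type byte, then the selector.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { IDT_ENTRY_BYTES = 8 };

/* Hypothetical build-step helper: `image` points at the first byte of the
 * table inside a loaded copy of the kernel binary. For every entry, bytes
 * 2..3 (high half of the pointer) trade places with bytes 6..7 (selector),
 * which turns the link-time layout into the layout the CPU expects. */
void idt_swap_halves(uint8_t *image, size_t entries)
{
    assert(image != NULL || entries == 0);

    for (size_t i = 0; i < entries; i++) {
        uint8_t *e = image + i * IDT_ENTRY_BYTES;
        for (int b = 0; b < 2; b++) {
            uint8_t tmp = e[2 + b];
            e[2 + b] = e[6 + b];
            e[6 + b] = tmp;
        }
    }
}
```

Because the operation is its own inverse, the tool has to make sure it runs exactly once per build, for instance by always starting from the fresh linker output.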

Having the data stored in immediates sounds good in theory, but the code for each entry would probably total more than 64b, because it couldn't loop. The code to split an address into two would have to be fully unrolled. Even if you had a loop to store all the same-for-multiple-entries stuff, each pointer would need a mov r32, imm32 to get the address in a register, then mov word [idt+i + 0], ax / shr eax, 16 / mov word [idt+i + 6], ax. That's a lot of machine-code bytes.
