WebAssembly stack / stack pointer initialization and memory layout

Question

I am currently toying around with WebAssembly compiled through LLVM but I haven't yet managed to understand the stack / stack pointer and how it relates to the overall memory layout.

I learned that I have to use s2wasm with --allocate-stack N to make my program run and I figured that this is basically adding (data (i32.const 4) "8\00\00\00") (with N=8) to my generated wast, with the binary part obviously being a pointer to a memory offset and the i32 constant being its offset in linear memory.

What I do not quite understand, though, is why the pointer's value is 56 (again with N=8) and how this value relates to the exact region of the stack in memory, which, in my case, currently looks like:

    0-3:   zero
    4-7:   56
    7-35:  other data sections
    36-55: zero
    56-59: zero

I know that I am probably more a candidate for "just use emscripten", but I'd also like to understand this.

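  • Is the stack pointer always stored at offset 4 in linear memory?
  • How is its initial value calculated? (Aligned to the next offset after the data where offset % 16 == 0, plus N?)
  • What is stored before, and what is stored after, the offset it points to?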

Recommended answer

I touched on this in another question. Values that would live on the stack in C++ can actually end up in 3 places:

  1. On the execution stack (each opcode pushes and pops values, so add pops 2 and then pushes 1).
  2. As a local.
  3. In the Memory.

Notice that you can't take the address of a value held in 1. or 2. Only when an address is needed would I expect a code generator to go with 3. How this is done isn't dictated by WebAssembly, it's up to whatever ABI you choose. What Emscripten and other tools do is store the stack pointer at address 4, and then very early in the program they choose a spot where the stack should go. It doesn't have to always be 4, but it's simpler to always stick to that ABI, especially if dynamic linking is involved.
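To make the "stack pointer at address 4" convention concrete, here is a minimal C++ model of it. This is only a sketch: `LinearMemory`, `push_frame` and `pop_frame` are hypothetical names invented for illustration, not part of any toolchain, and the downward-growing stack is an assumption about the ABI.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical model of a wasm linear memory (not a real WebAssembly API).
struct LinearMemory {
    std::vector<std::uint8_t> bytes;
    explicit LinearMemory(std::size_t size) : bytes(size, 0) {}

    // 32-bit load/store in host byte order; an out-of-bounds access
    // would trap in real wasm, here it trips the assert.
    std::uint32_t load32(std::uint32_t addr) const {
        assert(addr + 4 <= bytes.size());
        std::uint32_t value;
        std::memcpy(&value, &bytes[addr], 4);
        return value;
    }
    void store32(std::uint32_t addr, std::uint32_t value) {
        assert(addr + 4 <= bytes.size());
        std::memcpy(&bytes[addr], &value, 4);
    }
};

// ABI convention from the answer: the stack pointer lives at address 4.
constexpr std::uint32_t kStackPointerAddr = 4;

// Function prologue: reserve `size` bytes by moving the stack pointer down
// (assumption: the stack grows toward lower addresses).
std::uint32_t push_frame(LinearMemory& mem, std::uint32_t size) {
    std::uint32_t sp = mem.load32(kStackPointerAddr) - size;
    mem.store32(kStackPointerAddr, sp);
    return sp;  // base address of the new frame
}

// Function epilogue: give the frame back.
void pop_frame(LinearMemory& mem, std::uint32_t size) {
    mem.store32(kStackPointerAddr, mem.load32(kStackPointerAddr) + size);
}
```

With the stack pointer initialised to 56, as in the question, `push_frame(mem, 8)` would hand out the frame at 48 and leave 48 stored at address 4 until the matching `pop_frame`.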

On initial value: the region at that location has to be big enough to hold the whole stack, and the implementation of malloc has to know about it because it can't allocate heap space over it. That's why some tooling allows you to specify a maximum stack size.
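For what it's worth, the alignment guess from the question is at least consistent with the numbers reported there. The sketch below only checks that arithmetic; it is not taken from s2wasm, and `align_up` and `guessed_initial_sp` are hypothetical helper names. It assumes the data ends at offset 35, so the first free offset is 36.

```cpp
#include <cassert>
#include <cstdint>

// Round `offset` up to the next multiple of `alignment`.
std::uint32_t align_up(std::uint32_t offset, std::uint32_t alignment) {
    assert(alignment != 0);
    return (offset + alignment - 1) / alignment * alignment;
}

// The question's hypothesis: align the end of the data to 16 bytes,
// then add the requested stack size N.
std::uint32_t guessed_initial_sp(std::uint32_t data_end, std::uint32_t stack_size) {
    return align_up(data_end, 16) + stack_size;
}
```

With `data_end = 36` and `N = 8` this gives 48 + 8 = 56, which matches the value stored at offset 4. Under that reading, 36-47 would be padding and 48-55 the stack itself, with the pointer starting at the top.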

Anything can be stored before / after (though after you'd likely have prior stack values). WebAssembly doesn't currently have guard pages, so exhausting the in-memory stack will clobber heap values (unless the code generator also emits stack checks). That's all "memory safe" in that it still can't escape the WebAssembly.Memory, so the browser can't get owned but the developer's own code can totally be owned. A memory-safe language built on top of WebAssembly would have to enforce memory safety within the WebAssembly.Memory.
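As an illustration of the clobbering described above, here is a small C++ sketch of an in-memory stack with and without a stack check. Everything in it (`ShadowStack`, `alloc_unchecked`, `alloc_checked`, the concrete addresses) is hypothetical and only models the idea; a real code generator would emit the check inline.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct ShadowStack {
    std::uint32_t sp;     // current stack pointer (assumed to grow downward)
    std::uint32_t limit;  // lowest address that still belongs to the stack
};

// No guard page and no check: the frame is carved out wherever sp lands,
// even if that is below `limit`, and zero-initialising it overwrites
// whatever data lived there.
std::uint32_t alloc_unchecked(std::vector<std::uint8_t>& mem, ShadowStack& s,
                              std::uint32_t size) {
    assert(size <= s.sp);  // going below address 0 would trap in real wasm
    s.sp -= size;
    for (std::uint32_t i = 0; i < size; ++i) mem[s.sp + i] = 0;
    return s.sp;
}

// With an emitted stack check: refuse the allocation instead of
// running into the neighbouring data.
bool alloc_checked(std::vector<std::uint8_t>& mem, ShadowStack& s,
                   std::uint32_t size, std::uint32_t& frame) {
    if (s.sp - s.limit < size) return false;  // not enough stack left
    frame = alloc_unchecked(mem, s, size);
    return true;
}
```

In this model a 16-byte frame on an 8-byte stack (sp = 56, limit = 48) silently overwrites bytes 40-47 in the unchecked version, while the checked version reports failure and leaves them alone. Either way the writes stay inside the memory buffer, which is the "memory safe" part.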

Note that I haven't explained 1. and 2. Their existence means that most C++ programs will use less in-memory stack in WebAssembly than a native C++ program uses stack.
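A small C++ example of the distinction, as a sketch only: which values actually end up in linear memory depends on the compiler and optimisation level, so the comments describe what one would typically expect rather than a guarantee. The function names are made up for illustration.

```cpp
#include <cassert>

// No address is taken here, so a wasm code generator can keep `a`, `b`
// and the sum on the execution stack or in locals (places 1. and 2.).
int add(int a, int b) {
    return a + b;
}

// Writes through a pointer, so the caller must hand over a real address.
void store_answer(int* out) {
    assert(out != nullptr);
    *out = 42;
}

// `value` has its address passed to another function. Unless the call is
// inlined or otherwise optimised away, it needs a slot on the in-memory
// stack (place 3.).
int needs_memory_slot() {
    int value = 0;
    store_answer(&value);
    return value;
}
```

On a native target both functions would normally get ordinary stack frames; in WebAssembly only the second one is expected to touch the in-memory stack.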
