How to detect stack alignment at runtime?


Question

I'm looking to detect whether the stack is 8-byte or 16-byte aligned with some portable code. I know it will be 16-byte aligned on 64-bit OSes and on recent versions of Linux/BSD/OSX, but I'd rather have a reliable empirical way of determining it than rely on an OS-based heuristic or lookup. No per-new-OS changes are allowed in the module I'm currently working on.

So far:

bool isStack16ByteAligned()
{
  bool bResult = ( sizeof( Cmp_uint_ptr ) == 8 ) ? true : false;

  //...

  return bResult;
}





Any solution that gets used will be credited in my next QOR series article :-)

Recommended answer

Here is a simple idea: declare some local variables of primitive types (I would suggest a single byte followed by a 16- or 32-bit integer, please see below), so that the memory for them is reserved in the current stack frame. Get pointers to those objects, cast the pointers to an unsigned integer type and compare the numbers.

This may not be so simple, though. First, those variables may be optimized out, and you should prevent that. As you may need this code to work even when optimization is applied, you will need to keep the variables from being optimized out by using them somehow. The problem can be more difficult if the optimizer tries to violate ("optimize") 16-bit alignment by packing some memory area (for example, if several separate byte-size stack variables are detected) as if it were 8-bit aligned. I don't really know if any compilers do such things, so this is a potential danger to the portability of your method. You can easily find out by experimenting with each particular compiler, but you cannot predict the behavior of an unknown compiler. You should also understand that optimization can reorder the locations of objects on the stack. I really don't know how likely such behavior is. I only remember that when C++ compilers did not work in the modern flat model but used real-mode segments/offsets, they did much more complex memory tricks in certain data presentation modes.

To prevent the effect of the byte packing I speculated about above, you could combine objects of different sizes. A byte followed by an integer, as I mentioned in the first paragraph, should be a candidate to do the trick, but you should analyze it all more thoroughly than I did and study the disassembled code with and without optimization and other options, so you may want to think of something more reliable. If the optimization reorders object locations on the stack, you need to take this possibility into account. Just think thoroughly about it.
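The idea described in this answer could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `LocalAddressGap` and the variable names are hypothetical, and `volatile` is used here as one possible way to keep the objects from being optimized away. The result reports how a particular compiler laid out two locals, which is not necessarily the alignment of the stack itself.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not from the original answer: returns the distance in
// bytes between a one-byte local and a 32-bit local in the same stack frame.
std::uintptr_t LocalAddressGap()
{
    // volatile keeps the compiler from optimizing the objects away; it does
    // not stop the compiler from reordering or packing them.
    volatile char probeByte = 1;
    volatile std::uint32_t probeInt = 2;

    // Cast the pointers to an unsigned integer type so they can be compared.
    const std::uintptr_t byteAddr = reinterpret_cast<std::uintptr_t>(&probeByte);
    const std::uintptr_t intAddr = reinterpret_cast<std::uintptr_t>(&probeInt);

    // Distinct live objects never share an address.
    assert(byteAddr != intAddr);

    // Read both objects so they count as used.
    (void)(probeByte + probeInt);

    // Absolute difference, since the order on the stack is not guaranteed.
    return byteAddr > intAddr ? byteAddr - intAddr : intAddr - byteAddr;
}
```

Comparing the value returned by builds with and without optimization is one way to see the packing and reordering effects the answer warns about.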

As a final note, I would say that any technique relying on stack alignment is potentially dangerous in principle and generally should not be used. The issue is not how to determine the alignment, but what you are going to do with the result, and how reliable this technique could be. So, I'm just curious about your idea.

—SA


Matthew, from your answers in this forum I recall that you are one of the experienced guys around here. So, please treat my answer just as a suggestion and it might just complement what you found out yourself so far.

Let's first clarify what exactly is meant by stack alignment. I'd define it as:

The minimum size the CPU uses for a push operation.

By that definition, I would stay away from comparing the addresses of stack-based variables. A compiler generally allocates the local variables of a function by just subtracting the required total size from the stack pointer. Inside that block of memory it arranges variables just as it likes, using packing, reordering, etc. So, the address difference of two single-byte variables might very well be 1, or -1, or anything else.

What I would try to use is the sequence in which the compiler pushes arguments on the stack. Usually that will be either left to right or right to left, but in any case in strict sequence. That is something you can build upon, because otherwise the variable-argument (varargs) mechanism would not work or at least would be very hard to implement.

So I'd write a little function like this:

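// Cast to integers first: subtracting pointers to unrelated objects is undefined.
UINT ArgAddressDiff (char arg1, char arg2, ...)
{
    const uintptr_t a = (uintptr_t) &arg1, b = (uintptr_t) &arg2;   // needs <stdint.h>
    return (UINT) (a > b ? a - b : b - a);
}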
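A self-contained variant of the function above might look like the sketch below. The name `ArgSlotDistance` is hypothetical, and the return type is changed to `std::uintptr_t` only to avoid depending on the `UINT` typedef. It is an illustration, not a verified way to measure stack alignment: on calling conventions that pass arguments in registers, the compiler copies them into ordinary locals once their addresses are taken, so the result then reflects local layout and not push size.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical variant of the answer's function. The trailing "..." is kept
// from the original; whether it keeps the arguments out of registers depends
// on the calling convention.
std::uintptr_t ArgSlotDistance(char arg1, char arg2, ...)
{
    // Compare the addresses as integers instead of subtracting the pointers.
    const std::uintptr_t a = reinterpret_cast<std::uintptr_t>(&arg1);
    const std::uintptr_t b = reinterpret_cast<std::uintptr_t>(&arg2);

    // Two distinct parameters never share an address.
    assert(a != b);

    // Absolute difference, since argument order on the stack may go either way.
    return a > b ? a - b : b - a;
}
```

A call such as `ArgSlotDistance('a', 'b')` returns the distance in bytes between the two argument slots as the compiler materialized them. No variadic arguments need to be passed, because the function never reads them.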



Optimization might put the arguments into registers, though. That's the reason I included the "..." variable-argument declarator. That might prevent such an optimization, but I am not sure.

You could also try to pass the two arguments as the variable arguments of the function, in which case the compiler has no chance to optimize them away. That would require some tweaking with the va_list and va_start macros. I haven't tried that yet, though.

I hope that contributes somewhat to your solution.


It turns out that although the stack alignment is controlled by the compiler, it is ultimately determined by the target OS. That is not entirely surprising when you think about it, as this is a process under that OS that has to call and be called by the OS.
It would seem then that the only 'correct' solution is an interface on the OS support library to retrieve the correct value for stack alignment, rather than trying to calculate it. Not the solution that I wanted, but at least one that will work without breaking the design.
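Such an interface could be as small as the sketch below. The function name `StackAlignmentBytes` is hypothetical and does not belong to any real library. The 16-byte value for x86-64 and AArch64 follows their published ABIs, while the fallback branch is only a placeholder assumption that a real per-OS support library would replace with its own value.

```cpp
#include <cstddef>

// Hypothetical interface for an OS support layer; not part of any real library.
// Returns the stack alignment, in bytes, that the target ABI requires at a call.
constexpr std::size_t StackAlignmentBytes()
{
#if defined(__x86_64__) || defined(_M_X64) || defined(__aarch64__) || defined(_M_ARM64)
    // The x86-64 (System V and Windows) and AArch64 ABIs require 16 bytes.
    return 16;
#else
    // Assumption: placeholder lower bound for targets not listed above.
    // A real support library would supply the documented value per platform.
    return sizeof(void*);
#endif
}

// The result must be a power of two to be usable as an alignment.
static_assert((StackAlignmentBytes() & (StackAlignmentBytes() - 1)) == 0,
              "stack alignment must be a power of two");
```

Portable code would then call `StackAlignmentBytes()` and never probe addresses itself, which keeps the per-OS knowledge inside the support library, as the conclusion above suggests.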

