Windows: avoid pushing full x86 context on stack


Question

I have implemented PARLANSE, a language under MS Windows that uses cactus stacks to implement parallel programs. The stack chunks are allocated on a per-function basis and are just the right size to handle local variables, expression temp pushes/pops, and calls to libraries (including stack space for the library routines to work in). Such stack frames can be as small as 32 bytes in practice and often are.

This all works great unless the code does something stupid and causes a hardware trap... at which point Windows appears to insist on pushing the entire x86 machine context "on the stack". This is some 500+ bytes if you include the FP/MMX/etc. registers, which it does. Naturally, a 500 byte push on a 32 byte stack smashes things it should not. (The hardware pushes a few words on a trap, but not the entire context).

Can I get Windows to store the exception context block someplace else (e.g., to a location specific to a thread)? Then the software could take the exception hit on the thread and process it without overflowing my small stack frames.

I don't think this is possible, but I thought I'd ask a much larger audience. Is there an OS standard call/interface that can cause this to happen?

It would be trivial to do in the OS, if I could con MS into letting my process optionally define a context storage location, "contextp", which is initialized to enable the current legacy behavior by default. Then replace the interrupt/trap vector code:

  hardwareint:  push  context        ; legacy: save the full context on the stack
                mov   contextp, esp  ; contextp := address of the saved context

... with ...

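  hardwareint:  mov   <somereg>, contextp   ; load the optional storage location
                test  <somereg>, <somereg>  ; null means legacy behavior
                jnz   $2
                push  context               ; legacy: save the context on the stack
                mov   contextp, esp
                jmp   $1
         $2:    store context @ <somereg>   ; save to the process-supplied location
         $1:    equ   *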

with the obvious changes required to save somereg, etc.
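
The two paths of the proposed dispatch can be modeled in plain C. This is only a user-mode simulation under stated assumptions: `Context`, `CONTEXT_BYTES`, and `save_context` are hypothetical names invented for illustration, the 512-byte size is a stand-in for the real context size, and nothing here corresponds to an actual Windows interface.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the saved machine context (size is assumed). */
#define CONTEXT_BYTES 512
typedef struct { unsigned char bytes[CONTEXT_BYTES]; } Context;

/*
 * Model of the proposed trap entry.
 *   contextp == NULL : legacy path, the context is pushed on the stack
 *                      (the simulated stack pointer *sp moves down).
 *   contextp != NULL : the context is stored at contextp; *sp is untouched.
 * Returns the address where the context ended up.
 */
static void *save_context(const Context *live, Context *contextp,
                          unsigned char **sp)
{
    if (contextp != NULL) {
        memcpy(contextp, live, sizeof *live);
        return contextp;
    }
    *sp -= sizeof *live;
    memcpy(*sp, live, sizeof *live);
    return *sp;
}
```

In this model, a thread that supplies its own storage keeps its small stack frame intact, while a thread that supplies nothing sees exactly the legacy behavior.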

[What I do now is: check the generated code for each function. If it has a chance of generating a trap (e.g., divide by zero), or we are debugging (possible bad pointer deref, etc.), add enough space to the stack frame for the FP context. Stack frames now end up being roughly 500-1000 bytes in size, so programs can't recurse as far, which is sometimes a real problem for the applications we are writing. So we have a workable solution, but it complicates debugging.]
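
The workaround just described amounts to a per-function sizing rule, which might look roughly like the sketch below. All names (`FunctionInfo`, `frame_size`, `TRAP_CONTEXT_SLOP`) are hypothetical and are not taken from the PARLANSE compiler; the 1000-byte slop is an assumed value chosen from the upper end of the 500-1000 byte range mentioned above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed reserve for the exception context that Windows pushes on a trap. */
#define TRAP_CONTEXT_SLOP 1000u

/* Hypothetical per-function summary produced by the code generator. */
typedef struct {
    size_t locals_bytes;        /* local variables */
    size_t temps_bytes;         /* expression temporaries (pushes/pops) */
    size_t library_call_bytes;  /* stack space library routines work in */
    bool   may_trap;            /* e.g. contains a divide or FP operation */
} FunctionInfo;

/* Frame size: exact needs, plus slop only when a trap is possible. */
static size_t frame_size(const FunctionInfo *f, bool debugging)
{
    size_t size = f->locals_bytes + f->temps_bytes + f->library_call_bytes;
    if (f->may_trap || debugging) {
        size += TRAP_CONTEXT_SLOP;
    }
    return size;
}
```

Under this rule a function that cannot trap keeps its small frame in a production build, and only trap-capable functions or debug builds pay for the extra space.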

EDIT Aug 25: I've managed to get this story to a Microsoft internal engineer who apparently has the authority to find out who in MS might actually care. There might be faint hope for a solution.

EDIT Sept 14: An MS Kernel Group Architect has heard the story and is sympathetic. He said MS will consider a solution (like the one proposed), but it is unlikely to be in a service pack. Might have to wait for the next version of Windows. (Sigh... I might grow old...)

Sept 13, 2010 (1 year later). No action on Microsoft's part. My latest nightmare: does taking a trap while running a 32-bit process on Windows x64 push the entire x64 context on the stack before the interrupt handler fakes pushing a 32-bit context? That'd be even larger (twice as many integer registers, each twice as wide, and twice as many SSE registers(?)).

February 25, 2012: (1.5 years have gone by...) No reaction on Microsoft's part. I guess they just don't care about my kind of parallelism. I think this is a disservice to the community; the "big stack model" used by MS under normal circumstances limits the number of parallel computations one can have alive at any one instant by eating vast amounts of VM. The PARLANSE model lets one have an application with a million live "grains" in various states of running/waiting; this really occurs in some of our applications, where a 100-million-node graph is processed "in parallel". The PARLANSE scheme can do this with about 1 GB of RAM, which is pretty manageable. If you tried that with MS 1 MB "big stacks" you'd need 10^12 bytes of VM just for the stack space, and I'm pretty sure Windows won't let you manage a million threads.
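
The arithmetic behind those figures can be checked with a few lines of C. The function names are hypothetical, and the sketch assumes decimal units (1 MB = 10^6 bytes, 1 GB = 10^9 bytes), which is what the 10^12 figure implies.

```c
#include <stdint.h>

/* Total stack reservation if every live grain gets its own fixed-size stack. */
static uint64_t total_stack_bytes(uint64_t live_grains, uint64_t bytes_per_stack)
{
    return live_grains * bytes_per_stack;
}

/* Average memory available per grain for a given total budget. */
static uint64_t bytes_per_grain(uint64_t budget_bytes, uint64_t live_grains)
{
    return budget_bytes / live_grains;
}
```

A million grains with 10^6-byte stacks come to 10^12 bytes of reserved address space, whereas a 10^9-byte budget spread over the same million grains leaves about 1000 bytes per grain on average, which is why frames of a few dozen bytes matter.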

April 29, 2014: (4 years have gone by). I guess MS just doesn't read SO. I've done enough engineering on PARLANSE that we only pay the price of large stack frames during debugging or when there are FP operations going on, so we've managed to find very practical ways to live with this. MS has continued to disappoint; the amount of stuff pushed on the stack by various versions of Windows seems to vary considerably and egregiously, above and beyond the need for just the hardware context. There's some hint that some of this variability is caused by non-MS products (e.g. antivirus) sticking their nose in the exception handling chain; why can't they do that from outside my address space? Anyway, we handle all this by simply adding a large slop factor for FP/debug traps, and waiting for the inevitable MS system in the field that exceeds that amount.

Recommended answer

Basically you would need to re-implement many interrupt handlers, i.e. hook yourself into the Interrupt Descriptor Table (IDT). The problem is that you would also need to re-implement a kernel-mode -> user-mode callback (for SEH this callback resides in ntdll.dll and is named KiUserExceptionDispatcher; it triggers all the SEH logic). The point is that the rest of the system relies on SEH working the way it does right now, and your solution would break things because you would be doing it system-wide. Maybe you could check which process you are in at the time of the interrupt. However, the overall concept is error-prone and, IMHO, very bad for system stability.
These are actually rootkit-like techniques.

EDIT:

Some more details: the reason you would need to re-implement interrupt handlers is that exceptions (e.g. divide by zero) are essentially software interrupts, and those always go through the IDT. When the exception has been thrown, the kernel collects the context and signals the exception back to user mode (through the aforementioned KiUserExceptionDispatcher in ntdll). You'd need to interfere at this point, and therefore you would also need to provide a mechanism to get back to user mode. (There is a function in ntdll which is used as the entry point from kernel mode. I don't remember the name, but it's something with KiUserACP.....)
