How does adding a run-time breakpoint in Visual Studio work?

Question

When I add a breakpoint to some C# code during run-time, it gets hit. How does this actually happen?

I want to say that when running in debug mode, Visual Studio has references for code blocks, and when a breakpoint is added during run-time, it would be activated once that reference is called in the compiled code.

Is that a correct assumption? If so, can you please provide more details about how that works?

Recommended answer

This is actually a rather large and complicated topic, and it is also architecture-specific, so in this answer I'll only aim to summarize the common approaches on the Intel (and compatible) x86 architecture.

The good news is that it is language-independent, so the debugger is going to work the same way whether it's debugging VB.NET, C#, or C++ code. The reason is that all code is ultimately going to be translated to object code that can be natively executed by the processor, whether statically (i.e., ahead-of-time like C++, or with a JIT compiler like .NET) or dynamically (e.g., via a run-time interpreter). It is this native code that the debugger ultimately works on.

Furthermore, this isn't limited to Visual Studio. Its debugger certainly works in the way that I'll describe, but so does any other Windows debugger, like the Debugging Tools for Windows debuggers (WinDbg, KD, CDB, NTSD, etc.), GNU's GDB, IDA's debugger, the open-source x64dbg, and so on.

Let's start with a simple definition—what is a breakpoint? It's just a mechanism that allows execution to be paused so that you can conduct further analysis, whether that's examining the call stack, printing the values of variables, modifying the contents of memory or registers, or even modifying the code itself.

On the x86 architecture, there are several fundamental ways that breakpoints can be implemented. They can be divided into the two general categories of software breakpoints and hardware breakpoints.

Although a software breakpoint uses features of the processor itself, it is primarily implemented within software, hence the name. Specifically, interrupt #3 (the assembly language instruction INT 3) provides a breakpoint interrupt. This can be placed anywhere in the executable code, and when the CPU hits this instruction during execution, it will trap. The debugger can then catch this trap and do whatever it wants to do. If the program is not running under a debugger, then the operating system will handle the trap; the OS's default handler will simply terminate the program.

There are two possible encodings for the INT 3 instruction. Perhaps the most logical encoding is 0xCD 0x03, where 0xCD means INT and 0x03 specifies the "argument", or the number of the interrupt that is to be triggered. However, because breakpoints are so important, the designers at Intel also added a special-case representation for INT 3—the single-byte opcode 0xCC.
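As a small illustration of the two encodings, the following Python sketch recognizes either form at a given offset in a byte buffer. The function name `int3_length` is made up for this example, and the check is purely byte-level: it assumes the offset is already known to be the start of an instruction.

```python
INT3_SHORT = b"\xCC"      # dedicated one-byte breakpoint opcode
INT3_LONG = b"\xCD\x03"   # generic INT opcode followed by interrupt number 3


def int3_length(code: bytes, offset: int) -> int:
    """Return the length of an INT 3 encoded at `offset`, or 0 if there is none.

    Hypothetical helper: `offset` must already be an instruction boundary,
    because 0xCC can also appear inside the operands of other instructions.
    """
    if code[offset:offset + 1] == INT3_SHORT:
        return 1
    if code[offset:offset + 2] == INT3_LONG:
        return 2
    return 0
```

The one-byte form is the one that matters for debuggers, since it can overwrite the first byte of any instruction without spilling into the next one.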

The nice thing about this being a one-byte instruction is that it can be inserted pretty much anywhere in a program without much difficulty. Conceptually, this is simple, but the way it actually works is somewhat tricky. Basically, there are two options:

  • If it's a fixed breakpoint, then the compiler can insert this INT instruction into the code when it is compiled. Then, every time execution reaches that point, the processor will execute that instruction and break.

In C/C++, a fixed breakpoint might be inserted via a call to the DebugBreak API function, with the __debugbreak intrinsic, or using inline assembly to insert an INT 3 instruction. In .NET code, you would use System.Diagnostics.Debugger.Break to emit a fixed breakpoint.

At runtime, a fixed breakpoint can be easily removed by replacing the one-byte INT instruction (0xCC) with a one-byte NOP instruction (0x90). NOP is the mnemonic for no-op: it just causes the processor to waste a cycle without doing anything.

But if it's a dynamic breakpoint, then things get more complicated. The debugger must modify the binary in-memory and insert the INT instruction. But where is it going to insert it? Even in a debugging build, a compiler cannot reasonably insert a NOP between every single instruction, and it doesn't know in advance where you might want to insert a breakpoint, so there won't be space to insert even a one-byte INT instruction at an arbitrary location in the code.

So what it does instead is insert the INT instruction (0xCC) at the requested location, writing over whatever instruction is currently there. If this is a one-byte instruction (such as an INC), then it is simply replaced by an INT. If this is a multi-byte instruction (most of them are), then only the first byte of that instruction is replaced by 0xCC. The original instruction then becomes invalid because it's been partially overwritten. But that's okay, because once the processor hits the INT instruction, it will trap and stop executing at precisely that point. The partial, corrupted, original instruction will not be hit.

Once the debugger catches the trap triggered by the INT instruction and "breaks" in, it undoes the in-memory modification, replacing the inserted 0xCC byte with the correct byte representation for the original instruction. Because the trap is reported after the 0xCC byte has executed, the debugger also has to move the instruction pointer back by one byte so that it points at the start of the restored instruction. That way, when you resume execution from that point, the code is correct and you don't hit the same breakpoint over and over. If the breakpoint is meant to stay set, the 0xCC is written back once the original instruction has been stepped past.

Note that all of this modification happens to the current image of the binary executable stored in memory; it is patched directly in memory, without ever modifying the file on disk. (This is done using the ReadProcessMemory and WriteProcessMemory API functions, specifically designed for debuggers.)

Here is a short example function in machine code, showing both the raw bytes and the assembly-language mnemonics:

31 C0             xor  eax, eax     ; clear EAX register to 0
BA 02 00 00 00    mov  edx, 2       ; set EDX register to 2
01 D0             add  eax, edx     ; add EDX to EAX
C3                ret               ; return, with result in EAX

If we were to set a breakpoint on the line of source code that added the values (the ADD instruction in the disassembly), the first byte of the ADD instruction (0x01) would be replaced with 0xCC, leaving the remaining bytes as meaningless garbage:

31 C0             xor  eax, eax     ; clear EAX register to 0
BA 02 00 00 00    mov  edx, 2       ; set EDX register to 2
CC                int  3            ; BREAKPOINT!
D0                ???               ; meaningless garbage, never executed
C3                ret               ; also meaningless garbage from CPU's perspective
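The patch-and-restore bookkeeping behind the two listings above can be modeled in a few lines of Python. This is only a toy sketch: the `SoftwareBreakpoints` class is invented for illustration, and it edits a local `bytearray` where a real debugger would go through ReadProcessMemory and WriteProcessMemory on the target process.

```python
INT3 = 0xCC


class SoftwareBreakpoints:
    """Toy model: patches a bytearray standing in for the debuggee's code."""

    def __init__(self, code: bytearray):
        self.code = code
        self.saved = {}  # offset -> original first byte of the instruction

    def set(self, offset: int) -> None:
        # Remember the byte we are about to destroy, then overwrite it.
        if offset not in self.saved:
            self.saved[offset] = self.code[offset]
            self.code[offset] = INT3

    def clear(self, offset: int) -> None:
        # Put the original byte back so the instruction is valid again.
        self.code[offset] = self.saved.pop(offset)


# xor eax, eax / mov edx, 2 / add eax, edx / ret
code = bytearray.fromhex("31C0 BA02000000 01D0 C3")

bps = SoftwareBreakpoints(code)
bps.set(7)                  # offset 7 is the first byte of the ADD
patched = bytes(code)       # the ADD's 0x01 is now 0xCC
bps.clear(7)
restored = bytes(code)      # identical to the original bytes
```

The only state the model needs per breakpoint is the single byte it overwrote, which is why this kind of breakpoint is cheap to set in large numbers.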

Hopefully you were able to follow all of that, because that is actually the simplest case. Software breakpoints are what you use most of the time. Many of the most commonly used features of a debugger are implemented using software breakpoints, including stepping over a call, executing all code up to a particular point, and running to the end of a function. Behind the scenes, all of these use a temporary software breakpoint that is automatically removed the first time that it is hit.
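A temporary breakpoint of this kind can be sketched as an ordinary breakpoint entry that carries a "remove on first hit" flag. The names below (`BreakpointTable`, `step_over`) are hypothetical, and the model tracks only addresses, not the byte patching itself.

```python
class BreakpointTable:
    """Toy model of permanent and one-shot (temporary) breakpoints."""

    def __init__(self):
        self.entries = {}  # address -> True if the breakpoint is temporary

    def add(self, address, temporary=False):
        # A permanent breakpoint already at this address must stay permanent.
        if self.entries.get(address) is False:
            return
        self.entries[address] = temporary

    def on_hit(self, address):
        """Called when execution traps at `address`; returns True to stop."""
        if address not in self.entries:
            return False
        if self.entries[address]:
            del self.entries[address]  # one-shot: remove after the first hit
        return True


def step_over(table, call_address, call_length):
    """Plant a one-shot breakpoint on the instruction following a call."""
    return_address = call_address + call_length
    table.add(return_address, temporary=True)
    return return_address


table = BreakpointTable()
target = step_over(table, 0x401000, 5)  # assumes a 5-byte CALL rel32
first = table.on_hit(target)            # stops once
second = table.on_hit(target)           # already removed, so no stop
```

"Run to cursor" and "run to end of function" fit the same shape: only the address where the one-shot entry is planted differs.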

However, there is a more complicated and more powerful way to set a breakpoint with the direct assistance of the processor. These are known as hardware breakpoints. The x86 architecture provides 6 special debug registers. (They are referred to as DR0 through DR7, suggesting a total of 8, but DR4 and DR5 are aliases for DR6 and DR7, so there are actually only 6.) The first 4 debug registers (DR0 through DR3) each store either a memory address or an I/O location, and their values can be set using a special form of the MOV instruction. DR6 (equivalent to DR4) is a status register that contains flags, and DR7 (equivalent to DR5) is a control register. When the control register is set accordingly, an attempt by the processor to access one of these four locations will cause a hardware breakpoint (specifically, an INT 1 interrupt will be raised), which can then be caught by a debugger. Again, the details are complicated and can be found in various places online or in Intel's technical manuals, but they are not necessary to gain a high-level understanding.
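To give a feel for what "setting the control register accordingly" involves, here is a rough Python sketch that builds a DR7 value for one breakpoint slot. The field positions follow the DR7 layout as I understand it from Intel's manuals (local-enable bits in the low byte, a 2-bit access type and a 2-bit length per slot starting at bit 16); the function and constant names are my own, so check the encoding against the manual before relying on it.

```python
# Access-type (R/W) field values for one slot.
RW_EXECUTE, RW_WRITE, RW_IO, RW_READWRITE = 0b00, 0b01, 0b10, 0b11

# Length field values, keyed by the watched size in bytes.
LEN_BITS = {1: 0b00, 2: 0b01, 8: 0b10, 4: 0b11}


def dr7_enable(dr7: int, slot: int, rw: int, length: int) -> int:
    """Return `dr7` with breakpoint `slot` (0..3, i.e. DR0..DR3) enabled."""
    if slot not in range(4):
        raise ValueError("only DR0 through DR3 hold breakpoint addresses")
    if rw == RW_EXECUTE and length != 1:
        raise ValueError("execute breakpoints use a length of 1")
    shift = 16 + 4 * slot
    dr7 &= ~(0xF << shift)                          # clear old type and length
    dr7 |= (rw | (LEN_BITS[length] << 2)) << shift  # new type and length
    dr7 |= 1 << (2 * slot)                          # local-enable bit for slot
    return dr7


write_watch = dr7_enable(0, 0, RW_WRITE, 4)   # 4-byte write watch in DR0
exec_break = dr7_enable(0, 1, RW_EXECUTE, 1)  # execute breakpoint in DR1
```

The address itself goes into the matching DR0 through DR3 register separately; DR7 only says which slots are armed and what kind of access should trigger them.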

The nice thing about these special debug registers is that they provide a way to implement data breakpoints without needing to modify the code! However, there are two serious limitations. First, there are only four possible locations, so without a lot of cleverness, you are limited to four breakpoints. Second, the debug registers are privileged resources, and instructions that access and manipulate them can be executed only at ring 0 (essentially, kernel mode). Attempts to read or write these registers at any other privilege level (such as in ring 3, which is effectively user mode) will cause a general protection fault. Therefore, the Visual Studio debugger has to jump through some hoops to use these. I believe that it first suspends the thread and then calls the SetThreadContext API function (which causes a switch to kernel mode internally) to manipulate the contents of the registers. Finally, it resumes the thread. These debug registers are very powerful for setting read/write breakpoints for memory locations that contain data, as well as for setting execute breakpoints for memory locations that contain code.

However, if you need more than 4, or hit against some other limitation, then these hardware-provided debug registers won't work. The Visual Studio debugger has to have some other, more general way of implementing data breakpoints. This is, in fact, why having a large number of breakpoints can really slow down the execution of your program when running under the debugger.

There are various tricks here, and I know a lot less about exactly which ones are used by the different closed-source debuggers. You could almost certainly find out by reverse-engineering or even closer observation, and perhaps there is someone that knows more about this than me. But I'll briefly summarize a couple of the tricks I know about:

  • One trick for memory-access breakpoints is to use guard pages. This involves changing the protection level of the virtual-memory page that contains the data of interest to PAGE_GUARD, meaning that subsequent attempts to access that page (either read or write) will raise a guard page violation exception. The debugger can then catch this exception, verify that it occurred upon access to the memory address of interest, and process it as a breakpoint. Then, when you resume execution, the debugger arranges for the page access to succeed, resets the PAGE_GUARD flag again, and continues. This is how OllyDBG implements its support for memory-access breakpoints. I don't know if Visual Studio's debugger uses this trick or not.

  • Another trick is to use single-stepping support. Basically, the debugger sets the Trap Flag (TF) in the x86 EFLAGS register. This causes the CPU to trap after executing each instruction (which it does by raising an INT 1 exception, just as we saw above when the debug registers are used). The debugger then catches this trap and decides whether it should continue executing or not.
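The guard-page trick described above is page-granular, so the debugger has to filter out accesses that land on the same page but not on the watched address. The following Python model shows that filtering and re-arming logic. It is a simulation only: `GuardPageWatcher` is a made-up class, the 4096-byte page size is an assumption, and no real page protections are changed.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class GuardPageWatcher:
    """Toy model of memory-access breakpoints built on one-shot guard pages."""

    def __init__(self):
        self.watched = set()  # exact addresses the user asked about
        self.guarded = set()  # page numbers currently marked as guarded
        self.faults = 0       # every guard-page exception, wanted or not
        self.hits = []        # exceptions that matched a watched address

    def watch(self, address):
        self.watched.add(address)
        self.guarded.add(address // PAGE_SIZE)

    def access(self, address):
        page = address // PAGE_SIZE
        if page not in self.guarded:
            return                     # ordinary access, no exception
        self.guarded.discard(page)     # guard status is cleared when it fires
        self.faults += 1
        if address in self.watched:
            self.hits.append(address)  # report this one as a breakpoint
        self.guarded.add(page)         # re-arm before the program continues


w = GuardPageWatcher()
w.watch(0x1004)
w.access(0x1008)  # same page, different address: a fault but not a hit
w.access(0x1004)  # the watched address: a fault and a hit
w.access(0x3000)  # unguarded page: no fault at all
```

The gap between `faults` and `hits` is the overhead of this approach: every access to a guarded page costs an exception round-trip, whether or not it is the access the user cares about.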

Finally, there are conditional breakpoints. This is where you can set a breakpoint on a line of code, but ask the debugger to only break there if a certain specified condition evaluates to true. These are extremely powerful, but, in my experience, only rarely used by developers. As far as I know, these are implemented under the hood as normal, unconditional breakpoints. When the breakpoint is hit, the debugger automatically evaluates the condition. If it is true, it "breaks in" for the user. If it is false, it continues execution just as if the breakpoint had never been hit. There is no hardware support for conditional breakpoints (beyond the data breakpoints support discussed above), and I am not aware of any lower-level support for conditional breakpoints (e.g., something provided by the operating system). This is, of course, why having complicated conditions attached to your breakpoints can significantly slow down the execution speed of your program!
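The cost described above comes from the fact that the underlying breakpoint fires on every pass, even when the condition is false. A minimal Python simulation makes that visible; `run_loop` is a hypothetical stand-in for a debuggee loop with a conditional breakpoint in its body, not how any particular debugger is implemented.

```python
def run_loop(iterations, condition):
    """Simulate a loop whose body carries a conditional breakpoint."""
    evaluations = 0
    stops = []
    for i in range(iterations):
        # The unconditional breakpoint underneath traps on every pass,
        # and the debugger evaluates the condition each time.
        evaluations += 1
        if condition(i):
            stops.append(i)  # the debugger would break in for the user here
        # Otherwise the debugger silently resumes the program.
    return evaluations, stops


evaluations, stops = run_loop(1000, lambda i: i == 742)
```

Here the user sees a single stop, but the debugger has been entered 1000 times to produce it.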

If you're interested in more details (as if this answer isn't already long enough!), you might check out Tarik Soulami's Inside Windows Debugging. It looks like it contains relevant information, although I haven't read it yet so I can't unabashedly recommend it. (It's on my Amazon wish list!)
