How does printf() var-arg referencing interact with stack memory layout?

Question

Given the code fragment:

#include <stdio.h>

int main()
{
    printf("Val: %d", 5);
    return 0;
}

Is there any guarantee that the compiler would store "Val: %d" and '5' contiguously? For example:

+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| ... |  %d | ' ' | ':' | 'l' | 'a' | 'V' | '5' | ... |
+-----+-----+-----+-----+-----+-----+-----+-----+-----+
      ^                                   ^     ^
      |           Format String           | int |

Exactly how are these parameters allocated in memory?

Furthermore, does the printf function access the int relative to the format string or by absolute address? So, for example, in the data

+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| ... |  %d | ' ' | ':' | 'l' | 'a' | 'V' | '5' | ... |
+-----+-----+-----+-----+-----+-----+-----+-----+-----+
      ^                                   ^     ^
      |           Format String           | int |

when the function encounters %d, would there already be a stored memory address for the first parameter of the function, which would then be referenced, or would the value's location be calculated relative to the first element of the format string?

Sorry if I'm being confusing. My primary goal is to understand string formatting exploits, where the user is allowed to supply the format string, as described in this document:

http://www.cis.syr.edu/~wedu/Teaching/cis643/LectureNotes_New/Format_String.pdf

My concerns arise from the attack described on pages 3 and 4. I figured that the %x's are there to skip the 16 bits that the string takes up, which would indicate that the function allocates contiguously and references relatively, but other sources indicate that there is no guarantee that the compiler must allocate contiguously, and I was concerned that the paper was a simplification.

Recommended answer

Is there any guarantee that the compiler would store "Val: %d" and '5' contiguously?

It's virtually guaranteed they won't be. The 5 is small enough that it can be embedded right in the instruction stream rather than loaded through a memory address (pointer), with something like movl $5, %eax, possibly followed by a push onto the stack, whereas the string object will be laid out in the read-only data area of the executable image and referenced via a pointer. Here we're talking about the compile-time layout of the executable image.

Unless you mean the runtime layout of the stack, in which case yes: the word-sized pointer to that string and the word-sized constant 5 will be next to each other. But the order is probably the reverse of what you expect; study "C function calling conventions".

[Later edit: I'm now running some code samples with -S (output assembly), and I'm reminded that with light register usage in the caller (i.e. CPU registers can be overwritten without harm) and few arguments to the called function, the arguments can be passed entirely in registers to save instructions and memory. So the layout of the stack is actually tricky to predict, even if the attacker has access to the source code. This is especially true with gcc -O2, which collapsed my main -> my_function -> printf call sequence into main -> printf.]

Most exploits employ stack overruns, since malicious code runs into a brick wall when it tries to modify memory in the aforementioned read-only data area: the OS aborts the process.

The behavior of printf is peculiar in that the format string is like a miniature computer program that tells printf to fetch one argument for every '%' format specifier it finds. If those arguments were never in fact passed, or were of different sizes, printf will blindly traverse portions of the stack it shouldn't and may reveal data further up the stack (in the frames of calling functions), where private data may lie. If the first argument to printf is a constant, a compiler can at least warn you when the subsequent arguments don't match the '%' specifiers, but when it's a variable, all bets are off.

printf is awful from a security perspective and is computationally intensive, but it is very powerful and expressive. Welcome to C. :-)

Second later edit: now to your first question in the comments. As you can see, your terminology and perhaps your thoughts were a bit garbled. Study the following to get a sense of what's going on. Don't worry about pointers to strings yet. This was compiled with gcc 4.8.2 on 64-bit Linux 3.13 with no flags. Note how the excess format specifiers essentially walk backward through the stack, revealing arguments that were passed in a previous function call. (Strictly speaking, on x86-64 the first several values most likely come from argument registers left over from the call to first() rather than from stack memory, because that calling convention passes the first six integer arguments in registers.)

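/* Do not compile this at home. */
#include <stdio.h>

/* Deliberately broken: eight conversions but no matching arguments,
   which is undefined behavior. printf reads whatever happens to be
   where its arguments would normally be. */
int second() {
  printf("%08X %08X %08X %08X %08X %08X %08X %08X\n");
  return 0;
}

/* Receives eight recognizable values and passes none of them on. */
int first(int a, int b, int c, int d, int e, int f, int g, int h) {
  second();
  return 0;
}

int main(int argc, char **argv) {
  first(0xDEEDC0DE, 0x1EADBEEF, 0x11BEDEAD, 0xCAFAF000, 0xDAFEBABE, 0xAACEBACE, 0xE1ED1EAA, 0x10F00FAA);
  return 0;
}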

Output on stdout from two back-to-back runs:

1EADBEEF 11BEDEAD CAFAF000 DAFEBABE AACEBACE 75F83520 00400568 88B151C8

1EADBEEF 11BEDEAD CAFAF000 DAFEBABE AACEBACE 8B4CBDC0 00400568 7BB841C8
