Technically, how do variadic functions work? How does printf work?

Question

I know I can use va_arg to write my own variadic functions, but how do variadic functions work under the hood, i.e. on the assembly instruction level?

E.g., how is it possible that printf takes a variable number of arguments?

* No rule without exception: there is no such language as "C/C++", but this question can be answered for both of them.

* Note: This answer was originally given to "How can printf function can take variable parameters in number while output them?" (http://stackoverflow.com/questions/23103875/how-can-printf-function-can-take-variable-parameters-in-number-while-output-them/23104211?noredirect=1#comment35316316_23104211), but it seems it did not apply to the questioner.

Recommended answer

The C and C++ standards do not place any requirement on how it has to work. A conforming compiler may well decide to emit chained lists, std::stack<boost::any>, or even magical pony dust (as per Xeo) under the hood.

However, it is usually implemented as follows, even though transformations such as inlining or passing arguments in CPU registers may leave nothing of the code discussed here in the final program.

Please also note that the visuals below specifically describe a downward-growing stack, and that this answer is a simplification meant only to demonstrate the scheme (please see https://en.wikipedia.org/wiki/Stack_frame).

This is possible because the underlying machine architecture has a so-called "stack" for every thread. The stack is used to pass arguments to functions. For example, when you have:

foobar("%d%d%d", 3,2,1);

Then this compiles to assembly code like this (schematic and by way of example; actual code might look different). Note that the arguments are pushed from right to left:

push 1
push 2
push 3
push "%d%d%d"
call foobar

Those push-operations fill up the stack:

              []   // empty stack
-------------------------------
push 1:       [1]  
-------------------------------
push 2:       [1]
              [2]
-------------------------------
push 3:       [1]
              [2]
              [3]  // there is now 1, 2, 3 in the stack
-------------------------------
push "%d%d%d":[1]
              [2]
              [3]
              ["%d%d%d"]
-------------------------------
call foobar   ...  // foobar uses the same stack!

The bottom element in this picture, i.e. the one pushed last, is called the "Top of Stack", often abbreviated "TOS".
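The push sequence can be modelled in a few lines of C. This is only a toy sketch: ToyStack, toy_init, toy_push, toy_peek and toy_push_foobar_args are hypothetical names invented for illustration, the slot array stands in for real stack memory, and no actual compiler or ABI is claimed to work exactly like this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { STACK_SLOTS = 16 };

/* Hypothetical toy model: one machine word per slot, growing downwards. */
typedef struct {
    intptr_t slots[STACK_SLOTS];
    size_t sp; /* index of the current TOS; STACK_SLOTS means "empty" */
} ToyStack;

void toy_init(ToyStack *s)
{
    s->sp = STACK_SLOTS;
}

/* push: move the stack pointer down one slot, then store the value there */
void toy_push(ToyStack *s, intptr_t value)
{
    assert(s->sp > 0); /* no room left in the toy stack otherwise */
    s->slots[--s->sp] = value;
}

/* toy_peek(s, 0) is the TOS, toy_peek(s, 1) is one slot above it, etc. */
intptr_t toy_peek(const ToyStack *s, size_t offset)
{
    assert(s->sp + offset < STACK_SLOTS);
    return s->slots[s->sp + offset];
}

/* Mirrors the push sequence for foobar("%d%d%d", 3, 2, 1):
   arguments go in from right to left, the format string last. */
void toy_push_foobar_args(ToyStack *s, const char *format)
{
    toy_push(s, 1);
    toy_push(s, 2);
    toy_push(s, 3);
    toy_push(s, (intptr_t)format);
}
```

After toy_push_foobar_args, toy_peek(s, 0) yields the format string and offsets 1, 2 and 3 yield 3, 2 and 1, which is the order in which the callee reads them.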

The foobar function would now access the stack, beginning at the TOS, i.e. the format string, which, as you remember, was pushed last (this simplified picture ignores the return address that a call instruction typically places on the stack as well). Imagine stack is your stack pointer, stack[0] is the value at the TOS, stack[1] is one above the TOS, and so forth:

format_string <- stack[0]

... and then parses the format string. While parsing, it recognizes the %d tokens and, for each one, loads one more value from the stack:

format_string <- stack[0]
offset <- 1
while (parsing):
    token <- tokenize_one_more(format_string)
    if (needs_integer(token)):
        value <- stack[offset]
        offset <- offset + 1
    ...

This is of course a very incomplete pseudo-code that demonstrates how the function has to rely on the arguments passed to find out how much it has to load and remove from the stack.
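In portable C the same loop can be written with va_arg instead of raw stack offsets. The following is a minimal sketch, not how any real printf is implemented: mini_format is a hypothetical helper that understands only %d and %%, copies every other character unchanged, and writes into a caller-supplied buffer.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: handles only %d and %%, copies everything else.
   Returns the number of characters written, not counting the '\0'. */
size_t mini_format(char *out, size_t cap, const char *fmt, ...)
{
    assert(out != NULL && cap > 0 && fmt != NULL);

    va_list args;
    va_start(args, fmt);

    size_t used = 0;
    for (const char *p = fmt; *p != '\0' && used + 1 < cap; ++p) {
        if (p[0] == '%' && p[1] == 'd') {
            /* The format string is the only hint that an int follows. */
            int value = va_arg(args, int);
            int n = snprintf(out + used, cap - used, "%d", value);
            if (n < 0 || (size_t)n >= cap - used) {
                break; /* not enough room: stop and drop this number */
            }
            used += (size_t)n;
            ++p; /* skip the 'd' */
        } else if (p[0] == '%' && p[1] == '%') {
            out[used++] = '%';
            ++p; /* skip the second '%' */
        } else {
            out[used++] = *p;
        }
    }

    va_end(args);
    out[used] = '\0';
    return used;
}
```

With a large enough buffer, mini_format(buf, sizeof buf, "%d%d%d", 3, 2, 1) leaves "321" in buf. If the format string asks for more integers than were passed, va_arg reads arguments that do not exist and the behaviour is undefined, which is exactly the reliance described above.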

This reliance on user-provided arguments is also one of the biggest security issues present (see https://cwe.mitre.org/top25/). Users may easily use a variadic function wrongly, either because they did not read the documentation, or forgot to adjust the format string or argument list, or because they are plain evil, or whatever. See also Format String Attack.
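A common way to avoid the format-string variant of this problem is never to let untrusted text act as the format string. The sketch below assumes a hypothetical helper named echo_untrusted; the only library call it relies on is the standard snprintf.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: copies untrusted text into out without ever
   interpreting it as a format string. Returns what snprintf returns. */
int echo_untrusted(char *out, size_t cap, const char *untrusted)
{
    assert(out != NULL && cap > 0 && untrusted != NULL);

    /* Unsafe would be: snprintf(out, cap, untrusted);
       any %d, %s or %n inside the input would then make snprintf read
       (or even write through) arguments that were never passed. */
    return snprintf(out, cap, "%s", untrusted);
}
```

Because the format string is the fixed literal "%s", an input such as "%x%x%n" is copied character by character instead of being parsed.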

In C and C++, variadic functions are used together with the va_list interface. While pushing onto the stack is intrinsic to those languages (in K&R C you could even forward-declare a function without stating its arguments, but still call it with any number and kind of arguments; see https://stackoverflow.com/questions/13950642/why-does-a-function-with-no-parameters-compared-to-the-actual-function-definiti), reading from such an unknown argument list goes through the va_... macros and the va_list type, which basically abstract the low-level stack-frame access.
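As a small illustration of that interface, here is a sketch of a user-written variadic function. The name sum_ints and the convention of passing the argument count first are assumptions made for this example; va_list, va_start, va_arg and va_end are the standard macros from <stdarg.h>.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical example: adds up `count` int arguments.
   The callee cannot discover the count on its own, so the caller
   has to pass it explicitly. */
long sum_ints(int count, ...)
{
    assert(count >= 0);

    va_list args;
    va_start(args, count); /* start right after the last named parameter */

    long total = 0;
    for (int i = 0; i < count; ++i) {
        total += va_arg(args, int); /* fetch the next argument as an int */
    }

    va_end(args);
    return total;
}
```

For instance, sum_ints(3, 1, 2, 3) returns 6. Passing a count larger than the number of actual arguments is undefined behaviour, for the same reason a mismatched printf format string is.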
