What does the compiler do with a[i] when a is an array? And what if a is a pointer?
Problem description
The C-FAQ told me that the compiler does different things to handle a[i] depending on whether a is an array or a pointer. Here's an example from the C-FAQ:
char a[] = "hello";
char *p = "world";
Given the declarations above, when the compiler sees the expression a[3], it emits code to start at the location ``a'', move three past it, and fetch the character there. When it sees the expression p[3], it emits code to start at the location ``p'', fetch the pointer value there, add three to the pointer, and finally fetch the character pointed to.
But I was also told that when dealing with a[i], the compiler tends to convert a (which is an array) to a pointer-to-array. So I want to look at the assembly code to find out which is right.
EDIT:
Here's the source of this statement: the C-FAQ. Note this sentence:
"an expression of the form a[i] causes the array to decay into a pointer, following the rule above, and then to be subscripted just as would be a pointer variable in the expression p[i] (although the eventual memory accesses will be different, ...)"
I'm pretty confused by this: since a has decayed to a pointer, what does he mean by "memory accesses will be different"?
Here's my code:
// array.cpp
#include <cstdio>
using namespace std;

int main()
{
    char a[6] = "hello";
    char *p = "world";
    printf("%c\n", a[3]);
    printf("%c\n", p[3]);
}
And here's part of the assembly code I got using g++ -S array.cpp
    .file "array.cpp"
    .section .rodata
.LC0:
    .string "world"
.LC1:
    .string "%c\n"
    .text
.globl main
    .type main, @function
main:
.LFB2:
    leal 4(%esp), %ecx
.LCFI0:
    andl $-16, %esp
    pushl -4(%ecx)
.LCFI1:
    pushl %ebp
.LCFI2:
    movl %esp, %ebp
.LCFI3:
    pushl %ecx
.LCFI4:
    subl $36, %esp
.LCFI5:
    movl $1819043176, -14(%ebp)
    movw $111, -10(%ebp)
    movl $.LC0, -8(%ebp)
    movzbl -11(%ebp), %eax
    movsbl %al,%eax
    movl %eax, 4(%esp)
    movl $.LC1, (%esp)
    call printf
    movl -8(%ebp), %eax
    addl $3, %eax
    movzbl (%eax), %eax
    movsbl %al,%eax
    movl %eax, 4(%esp)
    movl $.LC1, (%esp)
    call printf
    movl $0, %eax
    addl $36, %esp
    popl %ecx
    popl %ebp
    leal -4(%ecx), %esp
    ret
I cannot figure out the mechanism of a[3] and p[3] from the code above. For example:
- Where was "hello" initialized?
- What does $1819043176 mean? Maybe it's the memory address of "hello" (the address of a)?
- I'm sure that "-11(%ebp)" means a[3], but why?
- In "movl -8(%ebp), %eax", the content of pointer p is stored in EAX, right? So does $.LC0 mean the content of pointer p?
- What does "movsbl %al,%eax" mean?
And note these 3 lines of code:
movl $1819043176, -14(%ebp)
movw $111, -10(%ebp)
movl $.LC0, -8(%ebp)
The last one uses "movl", so why didn't it overwrite the content of -10(%ebp)? (I know the answer now :) Addresses increase upward, so "movl $.LC0, -8(%ebp)" only overwrites the four bytes at {-8, -7, -6, -5}(%ebp).)
I'm sorry, but I'm totally confused by the mechanism, as well as by the assembly code...
Thank you very much for your help.
Solution
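a is an array of chars; it is not itself a pointer, it only decays to one inside an expression. p is a pointer to a char which, in this case, happens to point at a string literal.

movl $1819043176, -14(%ebp)
movw $111, -10(%ebp)

These initialize the local "hello" on the stack (that's why it is referenced through ebp). Since "hello" plus its terminating zero byte is more than 4 bytes, it takes two instructions. 1819043176 is not an address: it is 0x6C6C6568, the characters 'h', 'e', 'l', 'l' packed in little-endian order. 111 is 'o', and the 16-bit movw stores it together with the terminating zero byte.

movzbl -11(%ebp), %eax
movsbl %al,%eax

These reference a[3]. The array starts at -14(%ebp), so a[3] sits at -14 + 3 = -11(%ebp); no pointer is loaded, the offset is simply folded into the address. The two-step process is an artifact of unoptimized code: movzbl loads the byte into eax zero-extended, and movsbl then sign-extends it, because a char passed to printf is promoted to int.

movl -8(%ebp), %eax does indeed load the p pointer. The following addl $3, %eax and movzbl (%eax), %eax then add the index and fetch the character, which is the extra memory access compared with a[3].

.LC0 is the label of the string literal "world" in .rodata. Its address is fixed once the program is linked and loaded into memory, and movl $.LC0, -8(%ebp) stores that address in p.

movsbl %al,%eax means "move with sign extension, byte to long": al is the low byte of the register eax, and the instruction widens it to the full 32 bits.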