What does the compiler do with a[i] when a is an array? And what if a is a pointer?


Question

I read in the C FAQ that the compiler does different things to handle a[i] depending on whether a is an array or a pointer. Here's an example from the C FAQ:

char a[] = "hello";
char *p = "world";

Given the declarations above, when the compiler sees the expression a[3], it emits code to start at the location ``a'', move three past it, and fetch the character there. When it sees the expression p[3], it emits code to start at the location ``p'', fetch the pointer value there, add three to the pointer, and finally fetch the character pointed to.

But I was also told that when dealing with a[i], the compiler tends to convert a (which is an array) to a pointer. So I want to look at the assembly code to find out which is right.

EDIT:

Here's the source of this statement, from the C FAQ. Note this sentence:

an expression of the form a[i] causes the array to decay into a pointer, following the rule above, and then to be subscripted just as would be a pointer variable in the expression p[i] (although the eventual memory accesses will be different, ...)

I'm pretty confused by this: since a has decayed to a pointer, what does it mean that the "memory accesses will be different"?

Here's my code:

// array.cpp
#include <cstdio>
using namespace std;

int main()
{
    char a[6] = "hello";
    char *p = "world";
    printf("%c\n", a[3]);
    printf("%c\n", p[3]);
}

And here's part of the assembly code I got from g++ -S array.cpp:

    .file   "array.cpp" 
    .section    .rodata
.LC0:
    .string "world"
.LC1:
    .string "%c\n"
    .text
.globl main
    .type   main, @function
main:
.LFB2:
    leal    4(%esp), %ecx
.LCFI0:
    andl    $-16, %esp
    pushl   -4(%ecx)
.LCFI1:
    pushl   %ebp
.LCFI2:
    movl    %esp, %ebp
.LCFI3:
    pushl   %ecx
.LCFI4:
    subl    $36, %esp
.LCFI5:
    movl    $1819043176, -14(%ebp)
    movw    $111, -10(%ebp)
    movl    $.LC0, -8(%ebp)
    movzbl  -11(%ebp), %eax
    movsbl  %al,%eax
    movl    %eax, 4(%esp)
    movl    $.LC1, (%esp)
    call    printf
    movl    -8(%ebp), %eax
    addl    $3, %eax
    movzbl  (%eax), %eax
    movsbl  %al,%eax
    movl    %eax, 4(%esp)
    movl    $.LC1, (%esp)
    call    printf
    movl    $0, %eax
    addl    $36, %esp
    popl    %ecx
    popl    %ebp
    leal    -4(%ecx), %esp
    ret 

I cannot figure out the mechanism of a[3] and p[3] from the code above. For example:

  • Where is "hello" initialized?
  • What does $1819043176 mean? Maybe it's the memory address of "hello" (the address of a)?
  • I'm sure that "-11(%ebp)" means a[3], but why?
  • In "movl -8(%ebp), %eax", the content of pointer p is stored in EAX, right? So $.LC0 means the content of pointer p?
  • What does "movsbl %al,%eax" mean?
  • And note these 3 lines of code:
    movl $1819043176, -14(%ebp)
    movw $111, -10(%ebp)
    movl $.LC0, -8(%ebp)

    The last one uses "movl", so why did it not overwrite the content at -10(%ebp)? (I know the answer now :) Addresses increase upward, so "movl $.LC0, -8(%ebp)" only overwrites {-8, -7, -6, -5}(%ebp).)

I'm sorry, but I'm totally confused by the mechanism, as well as by the assembly code...

Thank you very much for your help.

Answer

a is an array of chars stored on the stack; it is not a pointer. p is a pointer to a char which, in this case, points at a string literal.

movl    $1819043176, -14(%ebp)
movw    $111, -10(%ebp)

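Initializes the local "hello" on the stack (that's why it is referenced through ebp). Since "hello" plus its terminating '\0' takes 6 bytes, more than the 4 bytes one movl can store, it takes two instructions. 1819043176 is 0x6C6C6568, which is the bytes 'h', 'e', 'l', 'l' in little-endian order, and 111 is 'o'; the 16-bit movw writes 'o' followed by a zero byte. So $1819043176 is data, not an address.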
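To check that the two immediates really are the characters of "hello", the stores can be replayed by hand. This is a minimal sketch, not part of the original post: the function name is hypothetical, and it assumes the little-endian byte order of x86, where the lowest byte of a value is stored at the lowest address.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: replays the two stores byte by byte, lowest byte
// first, which is how a little-endian x86 machine lays them out in memory.
std::string decode_hello_immediates() {
    const std::uint32_t dword = 1819043176u;  // movl immediate, 0x6C6C6568
    const std::uint16_t word = 111u;          // movw immediate, 0x006F

    char a[6];
    for (int i = 0; i < 4; ++i) {             // movl fills a[0..3]
        a[i] = static_cast<char>((dword >> (8 * i)) & 0xFF);
    }
    for (int i = 0; i < 2; ++i) {             // movw fills a[4..5]
        a[4 + i] = static_cast<char>((word >> (8 * i)) & 0xFF);
    }

    assert(a[5] == '\0');                     // the high byte of the word is the terminator
    return std::string(a);
}
```

Under those assumptions the function returns the string "hello", which matches the initializer of a in the source.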

movzbl  -11(%ebp), %eax
movsbl  %al,%eax

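References a[3]: the array starts at -14(%ebp), so a[3] lives at -14 + 3 = -11(%ebp). The compiler folds the offset into the address at compile time; no pointer is loaded from memory. The two-step process is not a limitation of addressing through ebp. movzbl loads the byte and zero-extends it into eax, and movsbl then sign-extends al to 32 bits because a char argument is promoted to int when passed to printf. The redundancy is typical of unoptimized output.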

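movl -8(%ebp), %eax does indeed load the value of the pointer p from its stack slot. addl $3, %eax then adds 3 to that value, and movzbl (%eax), %eax fetches the character it points to, which is p[3]. That extra load of the pointer value is the difference in memory access that the C FAQ refers to.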
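The difference between the two accesses can also be seen from C++ itself. The following is a minimal sketch, not taken from the original post; the function names are hypothetical. It shows that a[3] and *(a + 3) are the same expression, that the array object is its own storage (its address equals the address of its first element), and that the pointer case reads the characters through a separately stored address.

```cpp
#include <cassert>

// a[3] on an array: the element address is "start of a" + 3.
// No pointer value has to be loaded from memory first.
char third_of_array() {
    char a[6] = "hello";
    assert(a[3] == *(a + 3));  // subscripting is defined via the decayed pointer
    return a[3];
}

// p[3] on a pointer: load the address stored in p, add 3, then dereference.
char third_of_pointer() {
    const char *p = "world";   // const: string literals must not be modified
    return p[3];
}

// The array object holds the characters themselves: its address and the
// address of its first element coincide, and sizeof reports all 6 bytes.
bool array_is_its_own_storage() {
    char a[6] = "hello";
    return static_cast<void *>(&a) == static_cast<void *>(&a[0]) &&
           sizeof a == 6;
}
```

Calling third_of_array() and third_of_pointer() from a main function both yield 'l', even though the generated machine code reaches that character in different ways.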

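.LC0 is an assembler label for the string "world" in the .rodata section. $.LC0 is the address of that label, so movl $.LC0, -8(%ebp) stores the address of "world" into p. The actual address is fixed once the program is linked and loaded into memory.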

movsbl %al,%eax means "move with sign extension, byte to long": it takes the 8-bit value in al and sign-extends it to fill the 32-bit eax. al is the lowest byte of the register eax.
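The effect of the movzbl / movsbl pair can be modelled in C++ as follows. This is only an illustrative sketch with hypothetical function names. It assumes that plain char is signed, as it is with g++ on x86, and that converting an out-of-range value to a signed 8-bit type wraps around, which is what mainstream compilers do.

```cpp
#include <cassert>
#include <cstdint>

// Models movzbl: the upper 24 bits of the result are zero.
std::uint32_t zero_extend_byte(std::uint8_t b) {
    return b;
}

// Models movsbl: the upper 24 bits are copies of the byte's sign bit.
std::int32_t sign_extend_byte(std::uint8_t b) {
    return static_cast<std::int8_t>(b);
}

// Models the unoptimized sequence from the listing:
//   movzbl (mem), %eax   followed by   movsbl %al, %eax
std::int32_t load_char_as_int(const std::uint8_t *addr) {
    std::uint32_t eax = zero_extend_byte(*addr);
    std::int32_t result = sign_extend_byte(static_cast<std::uint8_t>(eax & 0xFF));
    assert(result >= -128 && result <= 127);  // always fits in a signed char
    return result;
}
```

For an ASCII character such as 'l' (0x6C) both extensions give 108, which is why the first step looks redundant. They only differ for bytes with the top bit set: 0x80 zero-extends to 128 but sign-extends to -128.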
