In C, is accessing an array by index faster, or is accessing by pointer faster?

Question

In C, is accessing an array by index faster, or is accessing it through a pointer faster? By faster I mean: which one takes fewer clock cycles? The array is not a constant array.

Recommended answer

templatetypedef has summed it up. To add some support to his response, take these example functions:


unsigned int fun1 ( unsigned int *x )
{
    unsigned int ra,rb;

    rb=0;
    for(ra=0;ra<1000;ra++) rb+=*x++;
    return(rb);
}

unsigned int fun2 ( unsigned int *x )
{
    unsigned int ra,rb;
    rb=0;
    for(ra=0;ra<1000;ra++) rb+=x[ra];
    return(rb);
}

Now gcc produced this:


00000000 fun1:
   0:   e52d4004    push    {r4}        ; (str r4, [sp, #-4]!)
   4:   e1a03000    mov r3, r0
   8:   e2804efa    add r4, r0, #4000   ; 0xfa0
   c:   e3a00000    mov r0, #0
  10:   e1a02003    mov r2, r3
  14:   e492c004    ldr ip, [r2], #4
  18:   e5931004    ldr r1, [r3, #4]
  1c:   e2823004    add r3, r2, #4
  20:   e080000c    add r0, r0, ip
  24:   e1530004    cmp r3, r4
  28:   e0800001    add r0, r0, r1
  2c:   1afffff7    bne 10 
  30:   e49d4004    pop {r4}        ; (ldr r4, [sp], #4)
  34:   e12fff1e    bx  lr

00000038 fun2:
  38:   e3a03000    mov r3, #0
  3c:   e1a02003    mov r2, r3
  40:   e790c003    ldr ip, [r0, r3]
  44:   e2833004    add r3, r3, #4
  48:   e7901003    ldr r1, [r0, r3]
  4c:   e2833004    add r3, r3, #4
  50:   e082200c    add r2, r2, ip
  54:   e3530efa    cmp r3, #4000   ; 0xfa0
  58:   e0822001    add r2, r2, r1
  5c:   1afffff7    bne 40 
  60:   e1a00002    mov r0, r2
  64:   e12fff1e    bx  lr

The code is different, but I am surprised at the missed opportunities for optimization.

Clang/LLVM produced this:



00000000 fun1:
   0:   e3a01000    mov r1, #0
   4:   e3a02ffa    mov r2, #1000   ; 0x3e8
   8:   e1a03001    mov r3, r1
   c:   e2522001    subs    r2, r2, #1
  10:   e490c004    ldr ip, [r0], #4
  14:   e08c3003    add r3, ip, r3
  18:   e2c11000    sbc r1, r1, #0
  1c:   e182c001    orr ip, r2, r1
  20:   e35c0000    cmp ip, #0
  24:   1afffff8    bne c 
  28:   e1a00003    mov r0, r3
  2c:   e12fff1e    bx  lr

00000030 fun2:
  30:   e3a01000    mov r1, #0
  34:   e3a02ffa    mov r2, #1000   ; 0x3e8
  38:   e1a03001    mov r3, r1
  3c:   e2522001    subs    r2, r2, #1
  40:   e490c004    ldr ip, [r0], #4
  44:   e08c3003    add r3, ip, r3
  48:   e2c11000    sbc r1, r1, #0
  4c:   e182c001    orr ip, r2, r1
  50:   e35c0000    cmp ip, #0
  54:   1afffff8    bne 3c
  58:   e1a00003    mov r0, r3
  5c:   e12fff1e    bx  lr

You might notice that clang/LLVM produced exactly the same code for both functions, pointer or offset. So changing compilers made more of a difference than changing between pointer and array indexing. I think LLVM could have done a little better; I will need to study this some more to understand what in my code caused this.
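To try the comparison on another compiler or target, a small harness along the following lines may help. It is only a sketch: the names `sum_ptr`, `sum_idx` and `check_sums_agree` are made up for illustration, and the length parameter is an addition that the original `fun1`/`fun2` do not have. It checks that the two forms compute the same result; it says nothing about speed.

```c
#include <assert.h>
#include <stddef.h>

/* Pointer form, same shape as fun1 but with a length parameter
   (the parameter is an assumption added for this sketch). */
unsigned int sum_ptr(const unsigned int *x, size_t n)
{
    unsigned int total = 0;
    const unsigned int *end = x + n;   /* one past the last element */

    while (x != end)
        total += *x++;
    return total;
}

/* Index form, same shape as fun2. */
unsigned int sum_idx(const unsigned int *x, size_t n)
{
    unsigned int total = 0;

    for (size_t i = 0; i < n; i++)
        total += x[i];
    return total;
}

/* Fills 1000 elements with 0..999 and checks both forms agree. */
void check_sums_agree(void)
{
    unsigned int data[1000];

    for (size_t i = 0; i < 1000; i++)
        data[i] = (unsigned int)i;

    assert(sum_ptr(data, 1000) == sum_idx(data, 1000));
    assert(sum_idx(data, 1000) == 499500u);   /* 0 + 1 + ... + 999 */
}
```

Compiling a file like this with optimization enabled and asking the compiler for assembly output (for example `-O2 -S` with gcc or clang) lets you compare the two loops directly for your own target.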

Edit:

I was hoping to get the compiler to, at a minimum, use the ldr rd,[rs],#4 instruction, which favors pointers, and hoped the compiler would see that it could destroy the array address, thus treating it like a pointer rather than an offset into an array (and use the above instruction, which is basically what clang/LLVM did). Or, if it did the array thing, that it would use the ldr rd,[rm,rn] instruction. Basically, I was hoping one of the compilers would generate one of these solutions:



funa:
    mov r1,#0
    mov r2,#1000
funa_loop:
    ldr r3,[r0],#4
    add r1,r1,r3
    subs r2,r2,#1
    bne funa_loop
    mov r0,r1
    bx lr

funb:
    mov r1,#0
    mov r2,#0
funb_loop:
    ldr r3,[r0,r2]
    add r1,r1,r3
    add r2,r2,#4
    cmp r2,#4000
    bne funb_loop
    mov r0,r1
    bx lr

func:
    mov r1,#0
    mov r2,#4000
func_loop:
    subs r2,r2,#4       @ step the offset down first so offset 0 is also loaded
    ldr r3,[r0,r2]
    add r1,r1,r3
    bne func_loop       @ flags still come from the subs above
    mov r0,r1
    bx lr

Didn't quite get there, but got pretty close. This was a fun exercise. Note that the above is all ARM assembler.

In general (not my specific C code example, and not necessarily ARM), on a number of popular architectures you will have a load from a register-based address (ldr r0,[r1]) and a load with a register index/offset (ldr r0,[r1,r2]), where the address is the sum of the two registers. Ideally, one register is the base address of the array and the second is the index/offset. The former load lends itself to pointers, the latter to arrays. If your C program is NOT going to change or move the pointer or index, then in both cases the address is static: it is computed once and a normal load is used, so both array and pointer should produce the same instructions. The more interesting case is when the pointer/index changes:


Pointer

ldr r0,[r1]
...
add r1,r1,some number

Array index

ldr r0,[r1,r2]
...
add r2,r2,some number

(Replace the load with a store and the add with a sub as needed.)
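At the C level, the two patterns above correspond roughly to the following store loops. This is a sketch with hypothetical names (`fill_ptr`, `fill_idx`, `check_forms_match`); which instructions a compiler actually emits for each form depends on the compiler and target, as the listings earlier show.

```c
#include <assert.h>
#include <stddef.h>

/* Pointer form: the address itself advances after each store. */
void fill_ptr(unsigned int *p, size_t n, unsigned int value)
{
    while (n--)
        *p++ = value;
}

/* Index form: the base stays fixed and only the index advances. */
void fill_idx(unsigned int *base, size_t n, unsigned int value)
{
    for (size_t i = 0; i < n; i++)
        base[i] = value;
}

/* base[i] is defined by the language as *(base + i), so both forms
   name the same addresses and leave the same contents behind. */
void check_forms_match(void)
{
    unsigned int a[16] = {0};
    unsigned int b[16] = {0};

    fill_ptr(a, 16, 0xA5u);
    fill_idx(b, 16, 0xA5u);

    for (size_t i = 0; i < 16; i++) {
        assert(&a[i] == a + i);
        assert(a[i] == b[i]);
    }
}
```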

Some architectures do not have a three-register, register-indexed load instruction, so there you have to do something like:


array index:
mov r2,r1
...
ldr r0,[r2]
...
add r2,r2,some number

Or, depending on the compiler, it can get really bad, especially if you compile for debugging or without optimizations, and assuming you don't have a three-register add:


array index:
mov r2,#0
...
mov r3,r1
add r3,r2
ldr r4,[r3]
...
add r2,some number

So it is quite possible that the two approaches are equal. As seen on the ARM, it can combine the two pointer instructions (within limits for the immediate) into one, making that a little faster. The array index solution burns more registers, and depending on the number of registers available on the architecture, that pushes you toward having to swap registers out to the stack sooner and more often than you would with pointers, slowing you down even more. If you don't mind destroying the base address, the bottom line is that the pointer solution might give you an advantage from a performance perspective. It has a lot to do with your code and the compiler.

For me, readability comes into play first: I feel arrays are easier to read and follow. Second, do I need to preserve that pointer to free a malloc, or to go through that memory again? If so, I will probably use an array with an index; if it is a one-time pass and I don't care about destroying the base address, I will use a pointer. As you saw above with the compiler-generated code, if performance is critical, then hand-code the solution in assembler anyway (basing it on the approaches suggested by letting the compilers try it first).
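The point about preserving the pointer can be illustrated with a short sketch. The function names `sum_and_free` and `check_sum_and_free` are hypothetical. The idea is that the base pointer returned by malloc stays untouched so it can be passed to free, while a copy of it does the walking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Allocates n elements, sets each to 1, sums them with a walking
   pointer, then frees the block. Returns 0 if allocation fails. */
unsigned int sum_and_free(size_t n)
{
    unsigned int *base = malloc(n * sizeof *base);
    if (base == NULL)
        return 0;

    /* Index form: base is never modified here. */
    for (size_t i = 0; i < n; i++)
        base[i] = 1u;

    /* Pointer form: walk a copy so base survives for free(). */
    unsigned int total = 0;
    const unsigned int *end = base + n;
    for (const unsigned int *p = base; p != end; p++)
        total += *p;

    free(base);   /* must be the pointer malloc returned */
    return total;
}

/* Simple self-check for the sketch above. */
void check_sum_and_free(void)
{
    unsigned int total = sum_and_free(1000);

    /* 1000 on success, 0 only if malloc failed. */
    assert(total == 1000u || total == 0u);
}
```

Had the loop incremented `base` itself, the value passed to `free` would no longer be the one `malloc` returned, which is undefined behavior. That is the practical cost of "destroying the base address".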
