效率:数组与指针 [英] Efficiency: arrays vs pointers

查看:23
本文介绍了效率:数组与指针的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

据说通过指针访问内存比通过数组访问内存更有效.我正在学习 C,以上内容在 K&R 中有说明.他们具体说

Memory access through pointers is said to be more efficient than memory access through an array. I am learning C and the above is stated in K&R. Specifically they say

任何可以通过数组下标实现的操作,也可以用指针来完成.指针版本一般会更快

Any operation that can be achieved by array subscripting can also be done with pointers. The pointer version will in general be faster

我使用 Visual C++ 反汇编了以下代码.(我的是 686 处理器.我已禁用所有优化.)

I dis-assembled the following code using visual C++.(Mine is a 686 processor. I have disabled all optimizations.)

int a[10], *p = a, temp;

void foo()
{
    temp = a[0];
    temp = *p;
}

令我惊讶的是,我发现通过指针访问内存需要 3 条指令,而通过数组访问内存需要 3 条指令.下面是对应的代码.

To my surprise I see that memory access through a pointer takes 3 instructions to the two taken by memory access through an array. Below is the corresponding code.

; 5    : temp = a[0];

    mov eax, DWORD PTR _a
    mov DWORD PTR _temp, eax

; 6    : temp = *p;

    mov eax, DWORD PTR _p
    mov ecx, DWORD PTR [eax]
    mov DWORD PTR _temp, ecx

请帮我理解.我在这里错过了什么??

Please help me understand. What am I missing here??

正如许多答案和评论所指出的那样,我使用编译时常量作为数组索引,因此可以说更容易通过数组访问.下面是以变量为索引的汇编代码.我现在有相同数量的指令用于通过指针和数组进行访问.我的更广泛的问题仍然有效.通过指针访问内存并不能提高效率.

As pointed out by many answers and comments I had used a compile time constant as the array index thus making it arguably easier for the access through an array. Below is the assembly code with a variable as the index. I now have equal number of instructions for access through pointer and arrays. My broader questions still holds good. The memory access through a pointer is not lending itself as being more efficient.

; 7    :        temp = a[i];

    mov eax, DWORD PTR _i
    mov ecx, DWORD PTR _a[eax*4]
    mov DWORD PTR _temp, ecx

; 8    : 
; 9    :    
; 10   :        temp = *p;

    mov eax, DWORD PTR _p
    mov ecx, DWORD PTR [eax]
    mov DWORD PTR _temp, ecx

推荐答案

据说通过指针访问内存比通过数组访问内存更有效.

Memory access through pointers is said to be more efficient than memory access through an array.

在过去,当编译器是相对愚蠢的野兽时,这可能是正确的.只需要在高优化模式下查看gcc输出的部分代码就知道不再是真的了.其中一些代码很难理解,但是一旦你理解了,它的魅力就显而易见了.

That may have been true in the past when compilers were relatively stupid beasts. You only need to look at some of the code output by gcc in high optimisation modes to know that it is no longer true. Some of that code is very hard to understand but, once you do, its brilliance is evident.

一个体面的编译器将为指针访问和数组访问生成相同的代码,您可能不必担心这种级别的性能.编写编译器的人比我们普通人更了解他们的目标架构.在优化您的代码(算法选择等)时更多地关注宏观层面,并信任您的工具制造商来完成他们的工作.

A decent compiler will generate the same code for pointer accesses and array accesses and you should probably not be worrying about that level of performance. The people that write compilers know far more about their target architectures than we mere mortals. Concentrate more on the macro level when optimising your code (algorithm selection and so on) and trust in your tool-makers to do their job.

事实上,我很惊讶编译器没有优化整个

In fact, I'm surprised the compiler didn't optimise the entire

temp = a[0];

行不存在,因为 temp 在下一行被不同的值覆盖,并且 a 没有被标记为 volatile.

line out of existence since temp is over-written in the very next line with a different value and a is in no way marked volatile.

我记得很久以前有一个关于最新 VAX Fortran 编译器的基准测试的城市神话(这里显示了我的年龄),其性能比竞争对手高出几个数量级.

I remember an urban myth from long ago about a benchmark for the latest VAX Fortran compiler (showing my age here) that outperformed its competitors by several orders of magnitude.

事实证明,编译器发现基准计算的结果没有在任何地方使用,因此它优化了整个计算循环,使其消失.从而大幅提高运行速度.

Turns out the compiler figured out that the result from the benchmark calculation wasn't used anywhere so it optimised the entire calculation loop into oblivion. Hence the substantial improvement in run speed.

更新:优化代码在您的特定情况下更有效的原因是您找到位置的方式.a 将在链接/加载时决定的固定位置,同时对它的引用将被修复.所以 a[0] 或者实际上 a[any constant] 将在一个固定的位置.

Update: The reason that optimised code is more efficient in your particular case is because of the way you find the location. a will be at a fixed location decided at link/load time and the reference to it will be fixed up at the same time. So a[0] or indeed a[any constant] will be at a fixed location.

p 本身也将出于同样的原因在一个固定的位置.但是 *p(p 的内容)是可变的,因此需要额外的查找才能找到正确的内存位置.

And p itself will also be at a fixed location for the same reason. But *p (the contents of p) is variable and therefore will have an extra lookup involved to find the correct memory location.

您可能会发现将另一个变量 x 设置为 0(不是 const)并使用 a[x] 也会引入额外的计算.

You'll probably find that having yet another variable x set to 0 (not const) and using a[x] would also introduce extra calculations.

在您的评论之一中,您声明:

In one of your comments, you state:

按照您的建议进行操作也导致了 3 条通过数组访问内存的指令(获取索引、获取数组元素的值、存储在临时文件中).但我仍然无法看到效率.:-(

Doing as you suggested resulted in 3 instructions for memory access through arrays too (fetch index, fetch value of array element, store in temp). But I am still unable to see the efficiency. :-(

我对此的回应是,您很可能不会看到使用指针的效率.现代编译器的任务不仅仅是弄清楚数组操作和指针操作可以转换成相同的底层机器代码.

My response to that is that you very likely won't see an efficiency in using pointers. Modern compilers are more than up to the task of figuring out that array operations and pointer operations can be turned into the same underlying machine code.

事实上,如果没有开启优化,指针代码的效率会降低.考虑以下翻译:

In fact, without optimisation turned on, pointer code can be less efficient. Consider the following translations:

int *pa, i, a[10];

for (i = 0; i < 10; i++)
    a[i] = 100;
/*
    movl    $0, -16(%ebp)              ; this is i, init to 0
L2:
    cmpl    $9, -16(%ebp)              ; from 0 to 9
    jg      L3
    movl    -16(%ebp), %eax            ; load i into register
    movl    $100, -72(%ebp,%eax,4)     ; store 100 based on array/i
    leal    -16(%ebp), %eax            ; get address of i
    incl    (%eax)                     ; increment
    jmp     L2                         ; and loop
L3:
*/

for (pa = a; pa < a + 10; pa++)
    *pa = 100;
/*
    leal    -72(%ebp), %eax
    movl    %eax, -12(%ebp)            ; this is pa, init to &a[0]
L5:
    leal    -72(%ebp), %eax
    addl    $40, %eax
    cmpl    -12(%ebp), %eax            ; is pa at &(a[10])
    jbe     L6                         ; yes, stop
    movl    -12(%ebp), %eax            ; get pa
    movl    $100, (%eax)               ; store 100
    leal    -12(%ebp), %eax            ; get pa
    addl    $4, (%eax)                 ; add 4 (sizeof int)
    jmp     L5                         ; loop around
L6:
*/

从该示例中,您实际上可以看到指针示例更长,并且不必要.它多次将 pa 加载到 %eax 中而不改变它,并且确实在 pa 之间交替了 %eax&(a[10]).这里的默认优化基本没有.

From that example, you can actually see that the pointer example is longer, and unnecessarily so. It loads pa into %eax multiple times without it changing and indeed alternates %eax between pa and &(a[10]). The default optimisation here is basically none at all.

当你切换到优化级别 2 时,你得到的代码是:

When you switch up to optimisation level 2, the code you get is:

    xorl    %eax, %eax
L5:
    movl    $100, %edx
    movl    %edx, -56(%ebp,%eax,4)
    incl    %eax
    cmpl    $9, %eax
    jle     L5

对于阵列版本,以及:

    leal    -56(%ebp), %eax
    leal    -16(%ebp), %edx
    jmp     L14
L16:
    movl    $100, (%eax)
    addl    $4, %eax
L14:
    cmpl    %eax, %edx
    ja      L16

对于指针版本.

我不会在这里对时钟周期进行分析(因为工作量太大而且我基本上很懒惰)但我会指出一件事.两个版本的代码在汇编指令方面没有太大区别,而且鉴于现代 CPU 的实际运行速度,除非您执行数十亿操作.我总是倾向于编写代码以提高可读性,并且只在出现问题时才担心性能.

I'm not going to do an analysis on clock cycles here (since it's too much work and I'm basically lazy) but I will point out one thing. There's not a huge difference in the code for both versions in terms of assembler instructions and, given the speeds that modern CPUs actually run at, you won't notice a difference unless you're doing billions of these operations. I always tend to prefer writing code for readability and only worrying about performance if it becomes an issue.

顺便说一句,您引用的那句话:

As an aside, that statement you reference:

5.3 指针和数组:指针版本通常会更快,但至少对于初学者来说,有点难以立即掌握.

5.3 Pointers and Arrays: The pointer version will in general be faster but, at least to the uninitiated, somewhat harder to grasp immediately.

可以追溯到 K&R 的最早版本,包括我在 1978 年的古老版本,该版本仍然编写函数:

dates back to the earliest versions of K&R, including my ancient 1978 one where functions are still written:

getint(pn)
int *pn;
{
    ...
}

从那时起,编译器已经走了很长一段路.

Compilers have come an awfully long way since back then.

这篇关于效率:数组与指针的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆