What are the differences between using array offsets vs pointer incrementation?

Question

Given these two functions, which should be faster, if there is any difference at all? Assume that the input data is very large.

void iterate1(const char* pIn, int Size)
{
   for ( int offset = 0; offset < Size; ++offset )
   {
      doSomething( pIn[offset] );
   }
}

vs

void iterate2(const char* pIn, int Size)
{
   const char* pEnd = pIn+Size;
   while(pIn != pEnd)
   {
      doSomething( *pIn++ );
   }
}

Are there other issues to be considered with either approach?

Recommended answer

Boojum is correct - IF your compiler has a good optimizer and you have it enabled. If that's not the case, or your use of arrays isn't sequential and amenable to optimization, using array offsets can be far, far slower.
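
For anyone who wants to check this on their own compiler rather than take it on faith, here is a minimal sketch of the two loops from the question in compilable form. The byte-summing doSomething and the names sum_by_offset and sum_by_pointer are my stand-ins, not part of the original question. Compiling it with and without optimization and comparing the generated assembly (for example with GCC's -S option) is one way to see whether the two forms end up as the same machine code on a given toolchain.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the question's doSomething():
   adds each byte to a running total so the loops have an observable result. */
static unsigned long total;
static void doSomething(char c) { total += (unsigned char)c; }

/* Array-offset form, as in iterate1. */
unsigned long sum_by_offset(const char *pIn, int Size)
{
    assert(pIn != NULL && Size >= 0);
    total = 0;
    for (int offset = 0; offset < Size; ++offset)
        doSomething(pIn[offset]);
    return total;
}

/* Pointer-increment form, as in iterate2. */
unsigned long sum_by_pointer(const char *pIn, int Size)
{
    assert(pIn != NULL && Size >= 0);
    total = 0;
    const char *pEnd = pIn + Size;
    while (pIn != pEnd)
        doSomething(*pIn++);
    return total;
}
```

Both functions return the same total for the same input. One difference that has nothing to do with speed: if Size were negative, the offset loop would simply run zero times, while the pointer loop would start with pEnd behind pIn and never reach its != exit condition. That is why the sketch asserts Size >= 0.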

Here's an example. Back about 1988, we were implementing a window with a simple teletype interface on a Mac II. This consisted of 24 lines of 80 characters. When you got a new line in from the ticker, you scrolled up the top 23 lines and displayed the new one on the bottom. When there was something on the teletype, which wasn't all the time, it came in at 300 baud, which with the serial protocol overhead was about 30 characters per second. So we're not talking something that should have taxed a 16 MHz 68020 at all!

But the guy who wrote this did it like:

char screen[24][80];

and used 2-D array offsets to scroll the characters like this:

int i, j;
for (i = 0; i < 23; i++)
  for (j = 0; j < 80; j++)
    screen[i][j] = screen[i+1][j];

Six windows like this brought the machine to its knees!

Why? Because compilers were stupid in those days, so in machine language, every instance of the inner loop assignment, screen[i][j] = screen[i+1][j], looked kind of like this (Ax and Dx are CPU registers):

Fetch the base address of screen from memory into the A1 register
Fetch i from stack memory into the D1 register
Multiply D1 by a constant 80
Fetch j from stack memory and add it to D1
Add D1 to A1
Fetch the base address of screen from memory into the A2 register
Fetch i from stack memory into the D1 register
Add 1 to D1
Multiply D1 by a constant 80
Fetch j from stack memory and add it to D1
Add D1 to A2
Fetch the value from the memory address pointed to by A2 into D1
Store the value in D1 into the memory address pointed to by A1

So we're talking 13 machine language instructions for each of the 23x80=1840 inner loop iterations, for a total of 23920 instructions, including 3680 CPU-intensive integer multiplies.
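
Roughly the same work written out in C, with the address arithmetic made explicit. This is only a sketch of what the generated code amounted to, not the actual compiler output; scroll_naive, ROWS, COLS and the multiply counter are names I made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum { ROWS = 24, COLS = 80 };

/* Scroll the screen up one row, recomputing both full addresses on every
   inner iteration. Returns the number of integer multiplies performed. */
int scroll_naive(char screen[ROWS][COLS])
{
    assert(screen != NULL);
    char *base = (char *)screen;   /* byte view of the whole 24x80 array */
    int multiplies = 0;

    for (int i = 0; i < ROWS - 1; i++) {
        for (int j = 0; j < COLS; j++) {
            char *dst = base + i * COLS + j;        /* multiply #1 */
            char *src = base + (i + 1) * COLS + j;  /* multiply #2 */
            *dst = *src;
            multiplies += 2;
        }
    }
    return multiplies;
}
```

Called on a 24x80 screen, the function reports 2 x 1840 = 3680 multiplies, matching the count above.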

We made a few changes to the C source code, so then it looked like this:

int i, j;
register char *a, *b;
for (i = 0; i < 23; i++)
{
  a = screen[i];
  b = screen[i+1];
  for (j = 0; j < 80; j++)
    *a++ = *b++;
}

There are still two machine-language multiplies, but they're in the outer loop, so there are only 46 integer multiplies instead of 3680. And the inner loop *a++ = *b++ statement consisted of only two machine-language operations:

Fetch the value from the memory address pointed to by A2 into D1, and post-increment A2
Store the value in D1 into the memory address pointed to by A1, and post-increment A1.

Given there are 1840 inner loop iterations, that's a total of 3680 CPU-cheap instructions - 6.5 times fewer - and NO integer multiplies. After this, instead of dying at six teletype windows, we never were able to pull up enough to bog the machine down - we ran out of teletype data sources first. And there are ways to optimize this much, much further, as well.
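
The answer does not say which further optimizations it has in mind. One obvious candidate, offered here as my assumption rather than as what was actually done, is to move the whole 23-row block with a single standard memmove call, since the rows of a 2-D array are contiguous in memory. The sketch below puts it next to the row-by-row pointer loop wrapped in a function; scroll_pointers and scroll_memmove are hypothetical names.

```c
#include <assert.h>
#include <string.h>

enum { ROWS = 24, COLS = 80 };

/* Row-by-row pointer copy, following the revised loop in the answer. */
void scroll_pointers(char screen[ROWS][COLS])
{
    assert(screen != NULL);
    for (int i = 0; i < ROWS - 1; i++) {
        char *a = screen[i];            /* destination row */
        const char *b = screen[i + 1];  /* source row */
        for (int j = 0; j < COLS; j++)
            *a++ = *b++;
    }
}

/* Same effect with one library call. memmove is used rather than memcpy
   because the source and destination regions overlap. */
void scroll_memmove(char screen[ROWS][COLS])
{
    assert(screen != NULL);
    memmove(screen, screen + 1, (ROWS - 1) * sizeof screen[0]);
}
```

Both functions leave the screen in the same state: rows 1 through 23 end up in rows 0 through 22, and the last row is left untouched for the new line.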

Now, modern compilers will do that kind of optimization for you - IF you ask them to do it, and IF your code is structured in a way that permits it.

But there are still circumstances where compilers can't do that for you - for instance, if you're doing non-sequential operations in the array.
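
As one reading of "non-sequential", take walking down a single column of the screen, where consecutive accesses are 80 bytes apart. The sketch below shows the indexed form and a pointer form that steps a row pointer instead of recomputing the row offset each time. The function names are hypothetical, and whether a particular compiler already turns the first form into the second is something to check rather than assume.

```c
#include <assert.h>
#include <stddef.h>

enum { ROWS = 24, COLS = 80 };

/* Index form: every access names both the row and the column. */
void fill_column_indexed(char screen[ROWS][COLS], int col, char ch)
{
    assert(screen != NULL && col >= 0 && col < COLS);
    for (int i = 0; i < ROWS; i++)
        screen[i][col] = ch;
}

/* Pointer form: a pointer-to-row advances one whole row (COLS bytes)
   per step, so no per-iteration row multiply is written in the source. */
void fill_column_pointer(char screen[ROWS][COLS], int col, char ch)
{
    assert(screen != NULL && col >= 0 && col < COLS);
    char (*row)[COLS] = screen;
    char (*end)[COLS] = screen + ROWS;  /* one past the last row */
    for (; row != end; ++row)
        (*row)[col] = ch;
}
```

Stepping a pointer-to-row rather than a plain char pointer keeps the end pointer exactly one past the array, which is the furthest a pointer may legally go.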

So I've found it's served me well to use pointers instead of array references whenever possible. The performance is certainly never worse, and frequently much, much better.
