C loop optimization help for final assignment (with compiler optimization disabled)
Question
So for my final assignment in my Computer Systems class, we need to optimize these for loops to be faster than the original.
The basic grade is under 7 seconds and the full grade is under 5 seconds on our Linux server. The code I have here takes about 5.6 seconds. I am thinking I may need to use pointers in some way to get it to go faster, but I'm not really sure. Could anyone offer any tips or options?
The file must remain 50 lines or fewer, and I am ignoring the comment lines the instructor has included.
    #include <stdio.h>
    #include <stdlib.h>
    // You are only allowed to make changes to this code as specified by the comments in it.
    // The code you submit must have these two values.
    #define N_TIMES 600000
    #define ARRAY_SIZE 10000
    int main(void)
    {
        double *array = calloc(ARRAY_SIZE, sizeof(double));
        double sum = 0;
        int i;
        // You can add variables between this comment ...
        register double sum1 = 0, sum2 = 0, sum3 = 0, sum4 = 0, sum5 = 0, sum6 = 0, sum7 = 0, sum8 = 0, sum9 = 0;
        register int j;
        // ... and this one.
        printf("CS201 - Asgmt 4 - \n");
        for (i = 0; i < N_TIMES; i++)
        {
            // You can change anything between this comment ...
            for (j = 0; j < ARRAY_SIZE; j += 10)
            {
                sum += array[j];
                sum1 += array[j + 1];
                sum2 += array[j + 2];
                sum3 += array[j + 3];
                sum4 += array[j + 4];
                sum5 += array[j + 5];
                sum6 += array[j + 6];
                sum7 += array[j + 7];
                sum8 += array[j + 8];
                sum9 += array[j + 9];
            }
            // ... and this one. But your inner loop must do the same
            // number of additions as this one does.
        }
        // You can add some final code between this comment ...
        sum += sum1 + sum2 + sum3 + sum4 + sum5 + sum6 + sum7 + sum8 + sum9;
        // ... and this one.
        return 0;
    }
Recommended answer
You may be on the right track, though you'll need to measure it to be certain (my normal advice to measure, not guess, seems a little superfluous here since the whole point of the assignment is to measure).
With an optimising compiler you will probably not see much of a difference, since compilers are pretty clever about this sort of thing. But since we don't know what optimisation level it will be compiled at, you may get a substantial improvement.
Using pointers in the inner loop is a simple matter of first adding a pointer variable:
    register double *pj;
and then changing the loop to:
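    // Walk the array with pj; nine post-increments in the body plus the
    // one in the loop header advance it by ten elements per iteration.
    for (pj = &(array[0]); pj < &(array[ARRAY_SIZE]); pj++) {
        sum += *pj++;
        sum1 += *pj++;
        sum2 += *pj++;
        sum3 += *pj++;
        sum4 += *pj++;
        sum5 += *pj++;
        sum6 += *pj++;
        sum7 += *pj++;
        sum8 += *pj++;
        sum9 += *pj;
    }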
This keeps the number of additions within the loop the same (assuming you count += and ++ as addition operators, of course) but uses pointers rather than array indexes.
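Before timing anything, it may be worth checking that the pointer version really produces the same result as the indexed one. The following is a minimal sketch, not part of the assignment: the helper names sum_indexed and sum_pointer are hypothetical, and both assume the element count is a multiple of 10, as ARRAY_SIZE is in the question.

```c
#include <stddef.h>

/* Hypothetical helper (not from the assignment): indexed version.
   Assumes n is a multiple of 10, as ARRAY_SIZE is in the question. */
double sum_indexed(const double *a, size_t n)
{
    double s0 = 0, s1 = 0, s2 = 0, s3 = 0, s4 = 0,
           s5 = 0, s6 = 0, s7 = 0, s8 = 0, s9 = 0;
    size_t j;

    for (j = 0; j < n; j += 10) {
        s0 += a[j];
        s1 += a[j + 1];
        s2 += a[j + 2];
        s3 += a[j + 3];
        s4 += a[j + 4];
        s5 += a[j + 5];
        s6 += a[j + 6];
        s7 += a[j + 7];
        s8 += a[j + 8];
        s9 += a[j + 9];
    }
    return s0 + s1 + s2 + s3 + s4 + s5 + s6 + s7 + s8 + s9;
}

/* Hypothetical helper: pointer version of the same loop.
   Nine post-increments in the body plus the one in the loop header
   move p forward by ten elements per iteration. Same assumption on n. */
double sum_pointer(const double *a, size_t n)
{
    double s0 = 0, s1 = 0, s2 = 0, s3 = 0, s4 = 0,
           s5 = 0, s6 = 0, s7 = 0, s8 = 0, s9 = 0;
    const double *end = a + n;   /* one past the last element */
    const double *p;

    for (p = a; p < end; p++) {
        s0 += *p++;
        s1 += *p++;
        s2 += *p++;
        s3 += *p++;
        s4 += *p++;
        s5 += *p++;
        s6 += *p++;
        s7 += *p++;
        s8 += *p++;
        s9 += *p;
    }
    return s0 + s1 + s2 + s3 + s4 + s5 + s6 + s7 + s8 + s9;
}
```

Calling both helpers on the same small array of known values and comparing the results is a quick sanity check that the rewrite did not skip or double-count an element.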
With no optimisation [1] on my system, this drops the run time from 9.868 seconds (CPU time) to 4.84 seconds. Your mileage may vary.
[1] With optimisation level -O3, both versions are reported as taking 0.001 seconds so, as mentioned, the optimisers are pretty clever. However, given you're seeing 5+ seconds, I'd suggest it wasn't compiled with optimisation on.
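The answer does not say how its CPU times were obtained. One possible way to take such a measurement from inside a program is the standard clock() function, sketched below. The names cpu_seconds, sum_job and run_sum_job are hypothetical and exist only for illustration. The result field is declared volatile on the assumption that, at higher optimisation levels, a compiler may otherwise drop work whose result is never used.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical workload description: sum `n` doubles `repeats` times. */
struct sum_job {
    const double *data;
    size_t n;
    int repeats;
    volatile double result;   /* volatile so the final store is kept */
};

/* Hypothetical workload: a plain repeated summation loop. */
void run_sum_job(void *arg)
{
    struct sum_job *job = arg;
    double total = 0;
    int r;
    size_t i;

    for (r = 0; r < job->repeats; r++)
        for (i = 0; i < job->n; i++)
            total += job->data[i];
    job->result = total;
}

/* Hypothetical helper: CPU seconds consumed by one call to work(arg),
   measured with clock() from <time.h>. */
double cpu_seconds(void (*work)(void *), void *arg)
{
    clock_t start = clock();
    work(arg);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Timing the indexed and pointer variants with the same helper, on the same machine and with the same compiler flags, keeps the comparison like for like.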
As an aside, this is a good reason why it's usually advisable to write your code in a readable manner and let the compiler take care of making it run faster. While my meager attempts at optimisation roughly doubled the speed, using -O3 made it run some ten thousand times faster :-)