C loop optimization help for final assignment (with compiler optimization disabled)


Question

For my final assignment in my Computer Systems class, we need to optimize these for loops to run faster than the original.

On our Linux server, the basic grade requires a run time under 7 seconds and the full grade requires under 5 seconds. The code I have here takes about 5.6 seconds. I am thinking I may need to use pointers in some way to make it faster, but I'm not really sure. Could anyone offer any tips or options?

The file must remain 50 lines or fewer, and I am ignoring the commented lines the instructor has included.

#include <stdio.h>
#include <stdlib.h>

// You are only allowed to make changes to this code as specified by the comments in it.

// The code you submit must have these two values.
#define N_TIMES     600000
#define ARRAY_SIZE   10000

int main(void)
{
    double  *array = calloc(ARRAY_SIZE, sizeof(double));
    double  sum = 0;
    int     i;

    // You can add variables between this comment ...
    register double sum1 = 0, sum2 = 0, sum3 = 0, sum4 = 0, sum5 = 0, sum6 = 0, sum7 = 0, sum8 = 0, sum9 = 0;
    register int j;
    // ... and this one.

    printf("CS201 - Asgmt 4 - \n");

    for (i = 0; i < N_TIMES; i++)
    {
        // You can change anything between this comment ...
        for (j = 0; j < ARRAY_SIZE; j += 10)
        {
            sum += array[j];
            sum1 += array[j + 1];
            sum2 += array[j + 2];
            sum3 += array[j + 3];
            sum4 += array[j + 4];
            sum5 += array[j + 5];
            sum6 += array[j + 6];
            sum7 += array[j + 7];
            sum8 += array[j + 8];
            sum9 += array[j + 9];
        }
        // ... and this one. But your inner loop must do the same
        // number of additions as this one does.
    }                   

    // You can add some final code between this comment ...
    sum += sum1 + sum2 + sum3 + sum4 + sum5 + sum6 + sum7 + sum8 + sum9;
    // ... and this one.

    return 0;
}

Recommended answer

You may be on the right track, though you'll need to measure it to be certain (my usual advice to measure, not guess, seems a little superfluous here since the whole point of the assignment is to measure).

With an optimising compiler you probably won't see much of a difference, since they're pretty clever about this sort of thing. But since we don't know what optimisation level the code will be compiled at, you may get a substantial improvement.

Using pointers in the inner loop is a simple matter of first adding a pointer variable:

register double *pj;

and then changing the loop to:

for (pj = &(array[0]); pj < &(array[ARRAY_SIZE]); pj++) {
    sum += *pj++;
    sum1 += *pj++;
    sum2 += *pj++;
    sum3 += *pj++;
    sum4 += *pj++;
    sum5 += *pj++;
    sum6 += *pj++;
    sum7 += *pj++;
    sum8 += *pj++;
    sum9 += *pj;    // the pj++ in the loop header supplies the tenth increment
}

This keeps the number of additions within the loop the same (assuming you're counting += and ++ as addition operators, of course) but uses pointers rather than array indexes.

With no optimisation [1] on my system, this drops the run time from 9.868 seconds (CPU time) to 4.84 seconds. Your mileage may vary.

[1] With optimisation level -O3, both versions are reported as taking 0.001 seconds so, as mentioned, the optimisers are pretty clever. However, given you're seeing 5+ seconds, I'd suggest it hasn't been compiled with optimisation on.

As an aside, this is a good reason why it's usually advisable to write your code in a readable manner and let the compiler take care of making it run faster. While my meagre attempts at optimisation roughly doubled the speed, using -O3 made it run some ten thousand times faster :-)
