C loop optimization help for final assignment


Problem description

For my final assignment in my Computer Systems class, we need to optimize these for loops to run faster than the original. The basic grade is under 7 seconds and the full grade is under 5 seconds on our Linux server. The code I have here takes about 5.6 seconds. I think I may need to use pointers in some way to make it faster, but I'm not really sure. Could anyone offer any tips or options? Thank you so much!

Quick edit: The file must remain 50 lines or less, and I am ignoring the comment lines the instructor has included.

#include <stdio.h>
#include <stdlib.h>

// You are only allowed to make changes to this code as specified by the comments in it.

// The code you submit must have these two values.
#define N_TIMES     600000
#define ARRAY_SIZE   10000

int main(void)
{
    double  *array = calloc(ARRAY_SIZE, sizeof(double));
    double  sum = 0;
    int     i;

    // You can add variables between this comment ...
    register double sum1 = 0, sum2 = 0, sum3 = 0, sum4 = 0, sum5 = 0, sum6 = 0, sum7 = 0, sum8 = 0, sum9 = 0;
    register int j;
    // ... and this one.

    printf("CS201 - Asgmt 4 - \n");

    for (i = 0; i < N_TIMES; i++)
    {
        // You can change anything between this comment ...
        for (j = 0; j < ARRAY_SIZE; j += 10)
        {
            sum += array[j];
            sum1 += array[j + 1];
            sum2 += array[j + 2];
            sum3 += array[j + 3];
            sum4 += array[j + 4];
            sum5 += array[j + 5];
            sum6 += array[j + 6];
            sum7 += array[j + 7];
            sum8 += array[j + 8];
            sum9 += array[j + 9];
        }
        // ... and this one. But your inner loop must do the same
        // number of additions as this one does.
    }

    // You can add some final code between this comment ...
    sum += sum1 + sum2 + sum3 + sum4 + sum5 + sum6 + sum7 + sum8 + sum9;
    // ... and this one.

    return 0;
}

Solution

You may be on the right track, though you'll need to measure it to be certain (my usual advice to measure, not guess, seems a little superfluous here since the whole point of the assignment is to measure).

Optimising compilers will probably not see much of a difference since they're pretty clever about that sort of stuff but, since we don't know what optimisation level it will be compiled at, you may get a substantial improvement.

Using pointers in the inner loop is a simple matter of first adding a pointer variable:

register double *pj;

then changing the loop to:

for (pj = &(array[0]); pj < &(array[ARRAY_SIZE]); pj++) {
    sum += *pj++;
    sum1 += *pj++;
    sum2 += *pj++;
    sum3 += *pj++;
    sum4 += *pj++;
    sum5 += *pj++;
    sum6 += *pj++;
    sum7 += *pj++;
    sum8 += *pj++;
    sum9 += *pj;    /* the loop's own pj++ steps past this tenth element */
}

This keeps the number of additions within the loop the same (assuming you're counting += and ++ as addition operators, of course) but uses pointers rather than array indexes.
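As a sanity check, the two styles can be wrapped in functions and compared on the same data. This is only a sketch: the names sum_indexed and sum_pointer are made up for illustration, and both assume the element count is a multiple of 10, as ARRAY_SIZE is.

```c
#include <stddef.h>

/* Indexed version: ten independent accumulators, as in the question.
   Assumes n is a multiple of 10 (ARRAY_SIZE is 10000). */
double sum_indexed(const double *array, size_t n)
{
    double s0 = 0, s1 = 0, s2 = 0, s3 = 0, s4 = 0;
    double s5 = 0, s6 = 0, s7 = 0, s8 = 0, s9 = 0;
    size_t j;

    for (j = 0; j < n; j += 10) {
        s0 += array[j];
        s1 += array[j + 1];
        s2 += array[j + 2];
        s3 += array[j + 3];
        s4 += array[j + 4];
        s5 += array[j + 5];
        s6 += array[j + 6];
        s7 += array[j + 7];
        s8 += array[j + 8];
        s9 += array[j + 9];
    }
    return s0 + s1 + s2 + s3 + s4 + s5 + s6 + s7 + s8 + s9;
}

/* Pointer version: the same ten additions per pass, but a pointer is
   advanced through the array instead of computing array[j + k]. */
double sum_pointer(const double *array, size_t n)
{
    const double *pj = array;
    const double *end = array + n;  /* one past the last element; never dereferenced */
    double s0 = 0, s1 = 0, s2 = 0, s3 = 0, s4 = 0;
    double s5 = 0, s6 = 0, s7 = 0, s8 = 0, s9 = 0;

    while (pj < end) {
        s0 += *pj++;
        s1 += *pj++;
        s2 += *pj++;
        s3 += *pj++;
        s4 += *pj++;
        s5 += *pj++;
        s6 += *pj++;
        s7 += *pj++;
        s8 += *pj++;
        s9 += *pj++;
    }
    return s0 + s1 + s2 + s3 + s4 + s5 + s6 + s7 + s8 + s9;
}
```

Calling both on the same array should give the same total, which confirms that the pointer rewrite did not skip or double-count any element.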

With no optimisation [1] on my system, this drops it from 9.868 seconds (CPU time) to 4.84 seconds. Your mileage may vary.


[1] With optimisation level -O3, both versions are reported as taking 0.001 seconds so, as mentioned, the optimisers are pretty clever. However, given you're seeing 5+ seconds, I'd suggest it wasn't compiled with optimisation on.
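The answer does not say which tool produced the CPU-time figures. One way to take a similar measurement from inside a program is the standard clock() function; the sketch below assumes that approach, and the names time_sum, sum_fn and plain_sum are hypothetical helpers, not part of the assignment.

```c
#include <stddef.h>
#include <time.h>

/* Signature shared by the summing routines being compared. */
typedef double (*sum_fn)(const double *, size_t);

/* A plain reference implementation, used here only as something to time. */
static double plain_sum(const double *a, size_t n)
{
    double s = 0;
    size_t i;

    for (i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Run fn over the data `repeats` times and return the CPU time used,
   in seconds. The running total goes through a volatile variable so the
   calls cannot be discarded as dead code, and is handed back through
   *total when total is not NULL. */
double time_sum(sum_fn fn, const double *data, size_t n,
                long repeats, double *total)
{
    volatile double sink = 0;
    clock_t start, stop;
    long r;

    start = clock();
    for (r = 0; r < repeats; r++)
        sink += fn(data, n);
    stop = clock();

    if (total != NULL)
        *total = sink;
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

A call such as time_sum(plain_sum, array, ARRAY_SIZE, N_TIMES, &total) would then report the CPU seconds for one variant; running it for each variant at the same optimisation level gives a like-for-like comparison.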

As an aside, this is a good reason why it's usually advisable to write your code in a readable manner and let the compiler take care of getting it running faster. While my meager attempts at optimisation roughly doubled the speed, using -O3 made it run some ten thousand times faster :-)
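For comparison, the readable form that this advice points to might look like the sketch below. It is only an illustration, since the assignment fixes how many additions the inner loop must do; sum_simple is a hypothetical name.

```c
#include <stddef.h>

/* Straightforward summation: one accumulator, one element per pass.
   Any unrolling or other speed-up is left to the compiler. */
double sum_simple(const double *array, size_t n)
{
    double sum = 0;
    size_t j;

    for (j = 0; j < n; j++)
        sum += array[j];
    return sum;
}
```

What a compiler does with this loop at -O3 depends on the compiler and its flags, so it is still worth measuring rather than assuming.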
