C: Improving performance of function with heavy sin() usage


Question

I have a C function that computes the values of 4 sines based on elapsed time. Using gprof, I found that this function uses 100% (100.07% to be exact, lol) of the CPU time.

void
update_sines(void)
{
    clock_gettime(CLOCK_MONOTONIC, &spec);
    s = spec.tv_sec;
    ms = spec.tv_nsec * 0.0000001;
    etime = concatenate((long)s, ms);

    int k;
    for (k = 0; k < 799; ++k)
    {
        double A1 = 145 * sin((RAND1 * k + etime) * 0.00333) + RAND5;           // Amplitude
        double A2 = 100 * sin((RAND2 * k + etime) * 0.00333) + RAND4;           // Amplitude
        double A3 = 168 * sin((RAND3 * k + etime) * 0.00333) + RAND3;           // Amplitude
        double A4 = 136 * sin((RAND4 * k + etime) * 0.00333) + RAND2;           // Amplitude

        double B1 = 3 + RAND1 + (sin((RAND5 * k) * etime) * 0.00216);           // Period
        double B2 = 3 + RAND2 + (sin((RAND4 * k) * etime) * 0.002);         // Period
        double B3 = 3 + RAND3 + (sin((RAND3 * k) * etime) * 0.00245);           // Period
        double B4 = 3 + RAND4 + (sin((RAND2 * k) * etime) * 0.002);         // Period

        double x = k;                                   // Current x

        double C1 = 0.6 * etime;                            // X axis move
        double C2 = 0.9 * etime;                            // X axis move
        double C3 = 1.2 * etime;                            // X axis move
        double C4 = 0.8 * etime + 200;                          // X axis move

        double D1 = RAND1 + sin(RAND1 * x * 0.00166) * 4;               // Y axis move
        double D2 = RAND2 + sin(RAND2 * x * 0.002) * 4;                 // Y axis move
        double D3 = RAND3 + cos(RAND3 * x * 0.0025) * 4;                // Y axis move
        double D4 = RAND4 + sin(RAND4 * x * 0.002) * 4;                 // Y axis move

        sine1[k] = A1 * sin((B1 * x + C1) * 0.0025) + D1;
        sine2[k] = A2 * sin((B2 * x + C2) * 0.00333) + D2 + 100;
        sine3[k] = A3 * cos((B3 * x + C3) * 0.002) + D3 + 50;
        sine4[k] = A4 * sin((B4 * x + C4) * 0.00333) + D4 + 100;
    }

}

And this is the output from gprof:

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  Ts/call  Ts/call  name    
100.07      0.04     0.04  

I'm currently getting a frame rate of roughly 30-31 fps using this. I figure there has to be a more efficient way to do it.

As you may have noticed, I already changed all the divisions to multiplications, but that had very little effect on performance.

How can I improve the performance of this math-heavy function?

Recommended answer

Besides the advice given in the other answers, here is a purely algorithmic optimization.

In most cases, you're computing something of the form sin(k * a + b), where a and b are constant within the loop and k is the loop variable. If you also compute cos(k * a + b), you can use a 2D rotation matrix to form a recurrence relation (in matrix form):

|cos(k*a + b)| = |cos(a)  -sin(a)| * |cos((k-1)*a + b)|
|sin(k*a + b)|   |sin(a)   cos(a)|   |sin((k-1)*a + b)|

In other words, you can calculate the values for the current iteration from the values of the previous iteration. Thus, you only need to do the full trig calculation for k == 0; the rest can be calculated via this recurrence (once you have calculated cos(a) and sin(a), which are constants). This eliminates 75% of the trig function calls, i.e. 12 of the 16 calls per iteration (it's not clear whether the same trick can be applied to the final set of trig calls, since their coefficients B1..B4 change with k).
