Can I optimize code that has 3 for loops and 4 ifs?


Question


In another post I asked how to create the 26 neighbors of a cubic voxel node in 3-D space. I got a very good answer and implemented it.

To that I added some min/max position checking.

I would like to know if there is a way, given the 3 for loops and 4 ifs, to improve the execution time of this code. I read in another post that while loops can be faster, but that post was not language-specific.

Is this true? If yes, could you please help me apply it to my code, since I lack experience? Is there a way to implement this recursively that would make it faster?

Here is my code:

...
std::vector<Pos> Create26Neighbor(Pos somePos, double resol)
{
    std::vector<Pos> vect1;
    Pos m_MinPos(0.0, 0.0, 0.0);
    Pos m_MaxPos(5.0, 4.0, 5.0);

    for (double dz = somePos.m_pPos[2] - resol; dz <= somePos.m_pPos[2] + resol; dz += resol)
    {
        if (dz > m_MinPos.m_pPos[2] && dz < m_MaxPos.m_pPos[2])
        {
            for (double dy = somePos.m_pPos[1] - resol; dy <= somePos.m_pPos[1] + resol; dy += resol)
            {
                if (dy > m_MinPos.m_pPos[1] && dy < m_MaxPos.m_pPos[1])
                {
                    for (double dx = somePos.m_pPos[0] - resol; dx <= somePos.m_pPos[0] + resol; dx += resol)
                    {
                        if (dx > m_MinPos.m_pPos[0] && dx < m_MaxPos.m_pPos[0])
                        {
                            // all 27 candidates; skip the center itself
                            if ((dx != somePos.m_pPos[0]) || (dy != somePos.m_pPos[1]) || (dz != somePos.m_pPos[2]))
                            {
                                Pos tempPos(dx, dy, dz);
                                vect1.push_back(tempPos);
                            }
                        }
                    }
                }
            }
        }
    }
    return vect1;
}
....

Solution

First, get rid of the if statements. There's no need for them. You can merge them into the loop condition. Second, avoid recomputing the loop condition every iteration. Yes, the compiler may optimize it away, but it's generally very conservative with floating-point optimizations (and it may treat fp values read from memory differently from ones read from a register, which means it can't eliminate your array lookups from the loop conditions), so it's often best to do even simple optimizations manually:

std::vector<Pos> Create26Neighbor(Pos somePos, double resol)
{
    std::vector<Pos> vect1(27); // Room for all 27 candidates; trimmed to the real count below.
    Pos m_MinPos(0.0, 0.0, 0.0);
    Pos m_MaxPos(5.0, 4.0, 5.0);

    const double cx = somePos.m_pPos[0];
    const double cy = somePos.m_pPos[1];
    const double cz = somePos.m_pPos[2];

    // Compute every loop bound once, before the loops. Each axis starts one step
    // below the center if that is still above the minimum, and ends one step above
    // the center if that is still below the maximum (same strict > and < as in the
    // question). This assumes somePos itself lies strictly inside the bounds.
    const double minx = (cx - resol > m_MinPos.m_pPos[0]) ? cx - resol : cx;
    const double maxx = (cx + resol < m_MaxPos.m_pPos[0]) ? cx + resol : cx;
    const double miny = (cy - resol > m_MinPos.m_pPos[1]) ? cy - resol : cy;
    const double maxy = (cy + resol < m_MaxPos.m_pPos[1]) ? cy + resol : cy;
    const double minz = (cz - resol > m_MinPos.m_pPos[2]) ? cz - resol : cz;
    const double maxz = (cz + resol < m_MaxPos.m_pPos[2]) ? cz + resol : cz;

    int i = 0;
    for (double dz = minz; dz <= maxz; dz += resol)
    {
        for (double dy = miny; dy <= maxy; dy += resol)
        {
            for (double dx = minx; dx <= maxx; dx += resol)
            {
                // Construct Pos on the spot; this *might* save a copy compared to
                // building a local variable and then copying it into the vector.
                vect1[i] = Pos(dx, dy, dz);
                // Advance the index only for non-center points, so the center is
                // overwritten by the next point (or trimmed by the resize below).
                i += (dx != cx || dy != cy || dz != cz) ? 1 : 0;
            }
        }
    }
    vect1.resize(i); // Keep only the neighbors that were actually stored.
    return vect1;
}

The last point I'd consider looking at is the inner if-statement. Branches in a tight loop can be more costly than you might expect. I can think of a number of ways to eliminate it:

  • As I sketched in the code, the ?: operator could be coaxed into calculating the vector index so that the center value lands in a slot the next iteration overwrites. This would eliminate the branch, but may or may not be faster overall.
  • Split up the loops so you have separate loops before and after the center point. That gets a bit awkward, with a whole lot of smaller loops, and may be less efficient overall. But it would eliminate the inner if-statement, so it may also be faster.
  • Allow the center point to be added to the vector, and either ignore it afterwards or remove it after the loops. That would be a somewhat expensive operation, and may or may not pay off. It may be cheaper if you use a deque instead of a vector.
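
For the third option, one way to keep the removal cheap without switching containers is to move the last element into the hole and shrink the vector by one. This is only a sketch: `EraseUnordered` is a hypothetical helper, not part of the original answer, and it assumes the order of the neighbors does not matter. If all 27 points are stored with no bounds clipping, the center is the 14th point written (index 13); with clipping its index would have to be tracked separately.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: removes element `index` in O(1) by moving the last
// element into its place. The relative order of the elements is NOT preserved.
template <typename T>
void EraseUnordered(std::vector<T>& v, std::size_t index)
{
    assert(index < v.size());
    if (index + 1 != v.size()) {
        v[index] = std::move(v.back()); // overwrite the hole with the last element
    }
    v.pop_back(); // drop the now-redundant last slot
}
```

Whether this beats simply skipping the center inside the loop is something only a measurement can tell.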

And make sure the compiler unrolls the inner loop. Manually unrolling it may help too.

Finally, a lot depends on how Pos is defined.

Note that most of what I suggested is qualified by "it may not be faster, but...". You have to constantly profile and benchmark every change you make, to ensure you're actually improving performance.
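
A minimal timing sketch for that kind of comparison, using `std::chrono::steady_clock` from the standard library. `AverageNanoseconds` is a hypothetical helper name, and this is a rough wall-clock measurement, not a substitute for a real profiler.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: calls `work` `repeats` times and returns the average
// wall-clock time per call in nanoseconds.
template <typename F>
double AverageNanoseconds(F&& work, int repeats)
{
    assert(repeats > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < repeats; ++r) {
        work();
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / repeats;
}
```

When timing something like Create26Neighbor, the callable should do something observable with the result (for example, add the returned vector's size to a running total that is printed afterwards). Otherwise an optimizing compiler may remove the call being measured.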

Depending on how far you're willing to go, you may be able to merge everything into one loop (running on integers) and just compute the Pos coordinates on the fly in each iteration.
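
A sketch of what that single integer loop might look like. The `Pos` struct here is a hypothetical stand-in (the real definition is not shown in the question), `Create26NeighborFlat` is an illustrative name, and the bounds are passed as parameters rather than hard-coded. Because each coordinate is computed as center + offset * resol, no rounding error accumulates from repeated `+=`. The bounds check is still an if; this version only removes the loop nesting.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the question's Pos type.
struct Pos {
    double m_pPos[3];
    Pos(double x = 0.0, double y = 0.0, double z = 0.0) : m_pPos{x, y, z} {}
};

// Visits the 27 grid offsets with one integer loop and keeps the neighbors
// that lie strictly inside (minPos, maxPos), as in the question's checks.
std::vector<Pos> Create26NeighborFlat(const Pos& center, double resol,
                                      const Pos& minPos, const Pos& maxPos)
{
    assert(resol > 0.0);
    std::vector<Pos> result;
    result.reserve(26);
    for (int n = 0; n < 27; ++n) {
        if (n == 13) continue; // n == 13 decodes to offset (0, 0, 0): the center
        // Decode n into per-axis offsets in {-1, 0, +1} (x varies fastest).
        const int off[3] = { n % 3 - 1, (n / 3) % 3 - 1, n / 9 - 1 };
        double c[3];
        bool inside = true;
        for (int a = 0; a < 3; ++a) {
            c[a] = center.m_pPos[a] + off[a] * resol;
            inside = inside && c[a] > minPos.m_pPos[a] && c[a] < maxPos.m_pPos[a];
        }
        if (inside) {
            result.push_back(Pos(c[0], c[1], c[2]));
        }
    }
    return result;
}
```

With the bounds from the question, (0, 0, 0) to (5, 4, 5), a center of (2, 2, 2) and resol = 1 keeps all 26 neighbors, while a center of (0.5, 0.5, 0.5) keeps only 7.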
