递归方法比交互方法慢 10 倍 [英] Recursive method 10x slower than interative

查看:65
本文介绍了递归方法比交互方法慢 10 倍的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

尽可能清理代码以显示我的问题.基本上它是一个八叉树搜索+相交.intersect 函数来自一个 SDK(整个项目是一个犀牛插件).如果我使用交叉调用进行循环,它比通过八叉树进行递归搜索快 10 倍.甚至更奇怪 - 我隔离了相交调用的时间 - 在递归内部它比循环慢 8 倍.它的行为可能有 1000 种原因,但我希望我犯了一些明显的错误,有人可以通过查看代码发现.

The code is cleaned as far as i could to show my problem. Basically its an octree search+intersect. the intersect function comes from an SDK (the whole project is a plugin for rhino). If i make a loop with intersection calls, its 10 times faster than the recursive search through the octree. Stranger even - i isolated the timing of the intersect call - and inside the recursive it was 8 times slower than in a loop. There could be probably 1000 reasons why it behaves like this, but i hope i made some obvious mistake someone can spot by looking over the code.

有一个缓慢的递归片段:

there is the slow recusive piece:

public void NewRayCast()
{            
    int runs = 500000;                          //how many rays we cast
    Point3d raypos = new Point3d(0, 0, 0);      //raystart
    Ray3d ray = new Ray3d();                

    Random r = new Random();                    //here we create targets to scatter the ray directions
    Vector3d[] targets = new Vector3d[runs];
    for (int i = 0; i < runs; i++)
    {
        targets[i] = new Vector3d(r.NextDouble() * 200 - 100, 500, r.NextDouble() * 200 - 100);
    }

    for (int i = 0; i < runs; i++)
    {
        boxes = new List<int>();            // this collects the octree leaves the rays hits
        rayline.From = ray.Position;
        rayline.To = ray.Position + ray.Direction;                
        LineBoxer(starter);                 // this starts the raycasting - starter is a array with 1 value (the scene bounding box)
    }            
}

public void LineBoxer(int[] check)    // Cast a ray against boxes
{            
    for (int i = 0; i < check.Length; i++)  // check only because the first call is with the scene box (1 index)
    {
        if (octmanB.node[check[i]].closed == false) // if node is a dead end > empty we skip it
        {                                       
            isect = Rhino.Geometry.Intersect.Intersection.LineBox(rayline, octmanB.node[check[i]].bbox, 0, out ival); // intersection function, changing to an arbitrary bounding box doesnt speed it up either                    
            if (isect == true)
            {                        
                if (octmanB.node[check[i]].leaf == false)   // continue with recursion
                {
                    LineBoxer(octmanB.node[check[i]].child);
                }
                else
                {
                    boxes.Add(check[i]);    // here we have a leaf
                }
            }
        }
    }
}

这里是快速的:

public void FasterTestRun()
{
    int runs = 500000;                       
    Line line = new Line(new Point3d(1, 1, 1), new Point3d(0, 1000, 0));
    BoundingBox box = new BoundingBox(new Point3d(-50, 50, -50), new Point3d(50, 150, 50));

    bool isect;
    Interval v = new Interval();

    Random r = new Random();
    Point3d[] targets = new Point3d[runs];
    for (int i = 0; i < runs; i++)
    {
        targets[i] = new Point3d(r.NextDouble() * 20 - 10, 1000, r.NextDouble() * 20 - 10);
    }

    for (int i = 0; i < runs; i++)
    {
        line.To = targets[i];                
        isect = Rhino.Geometry.Intersect.Intersection.LineBox(line, box, 0, out v);                
    }            
}

谢谢!

推荐答案

现在我在家我终于可以回答你的问题了,那么让我们开始吧...

Now that I am at home I can finally answer your question, so let's begin...

首先:递归总是比迭代慢,如果您使用结构化编程语言.你不能概括这一点,因为函数式编程语言中的函数调用速度更快(函数在那里定义不同).如需更多信息,维基百科是一个很好的来源.

First of all: Recursion is always slower than iteration, if you are working with structured programming languages. You cannot generalize this, since function calls in functional programming languages are faster (functions are differently defined there). For more information Wikipedia is a good source.

详细地,递归调用将函数(或方法)分配的所有局部变量推送到堆栈上,等待直到内部调用返回(这包括相同的过程……),最后从中弹出值调用堆栈并继续使用它们.这不仅是沉重的内存负载,这也是垃圾收集器的痛苦:您的函数必须等待的时间越长,您的对象在内存中的持续时间就越长,老化并最终到达 gen1gen2.这意味着它们实际上需要很长时间才能被释放.

In detail a recursive call pushes all local variables the function (or method) allocated onto the stack, wait's until the inner call returns (this includes the same procedure on and on and on...) and finally pops the values from the call-stack and continues working with them. This is not only heavy memory load, this is also pain for the garbage collector: the longer your function must wait, the longer your objects last in memory, aging and finally reaching gen1 or gen2. Which means it takes actually long until they get released.

我能看到的另一个问题如下:

Another problem I can see is the following:

public void LineBoxer(int[] check)
{
    // ...
    LineBoxer(octmanB.node[check[i]].child);
    // ...
}

递归地传递数组意味着所有数组的值在堆栈上驻留很长时间.即使大部分元素都准备好发布!

Passing arrays recursively means that all of the values of the array reside on the stack for that long time. Even if most of the elements are ready to be released!

如果你迭代地做同样的事情,那么堆栈就没有压力.分配的变量通常是临时变量,可以很快释放.内存分配是这里的瓶颈.这(并且因为您在评论中要求)就是我将继续更详细地介绍的原因.

If you are doing the same thing iteratively there is no stressing for the stack. The variables allocated are pretty often temporary variables and can get released quite quickly. And memory-allocation is the bottleneck in here. This,(and because you asked for it in the comments) is why I will continue going a little bit more into detail.

但是(在评论中)您也在询问如何更有效地处理光线追踪.基本上你是正确的使用八叉树.您要执行的基本操作是简单的搜索.这就是你的问题.据我了解您的代码,您正在测试每个八叉树的叶子是否被击中:

However (in the comments) you are also asking on how to handle the raytracing more efficently. Basicly you are right with using an octree. The basic operation you want to perform is a simple search. And there's your problem. From as far as I understand your code, you are testing each and every octree leaf if it got hit or not:

public void LineBoxer(int[] check)    // Cast a ray against boxes
{            
    for (int i = 0; i < check.Length; i++)
    {
        // ...
    }
}

这只是搜索所有与您的射线相交的框.但这不是引入树木的动机.你可以想象一个八叉树,比如扩展到 3 个维度的 二叉搜索树.二叉搜索树可以搜索一维的条目(例如列表或数组).要在二维结构中搜索信息,您可以使用 quadtrees.现在我们需要添加另一个维度(因为我们现在处于 3D 状态),所以我们使用 八叉树.

That's simply searching all for all boxes that intersect with your ray. But that's not the motivation of introducing trees. You can imagine an octree like an binary search tree that's extented to 3 dimensions. A binary search tree can search entries in 1 dimension (e.g. a list or array). To search for information in two-dimensional constructs you can use quadtrees. And now we need to add another dimension (because we are in 3D now), so we use octrees.

到目前为止一切都很好,但是树如何帮助我们更好地"执行搜索操作?

So far so good, but how can trees help us to perform search operations "better"?

这是因为分而治之的原则.如果我们在更大的信息集中搜索特定的东西,我们会将这组信息分成小块.我们一直这样做,直到找到我们正在寻找的特定事物.

That's because of the divide and conquer priciple. If we are searching for something specific in a larger set of information we divide the set into small pieces. We do this just as long until we found the specific thing we are looking for.

对于我们的八叉树,这意味着我们将一个立方体分成 8 个较小的立方体.现在我们测试每个盒子是否与光线相交.在最好的情况下,它正好与一个盒子相交.这是进一步检查的那个.但是,如果每个盒子包含 1000 个盒子,我们只需增加一张支票就可以去掉 7000 个支票!

For our octree this means that we divide a cube into 8 smaller cubes. Now we test each box if our ray intersects with it. In the best case it intersects with exactly one box. That's the one to check further. But if each box contains 1000 boxes we simply get rid of 7000 checks by one additional check!

现在我们一次又一次地这样做,直到找到一片或多片叶子.递归执行此操作不应比迭代执行此操作慢得多.想象一个有 100.000 个节点的八叉树.第一个立方体可以存储 8 个立方体,那 8 个立方体可以存储 64 个立方体(深度:2!)等等......

Now we do this again and again until we found one or more leafs. Doing this recursively should not be much slower than doing this iteratively. Imagine an octree with 100.000 nodes. The first cube can store 8 cubes, those 8 cubes can store 64 cubes (depth: 2!) and so on...

  • 深度 = 3:512 个立方体
  • 深度 = 4:4.096 个立方体
  • 深度 = 5:32.768 个立方体
  • 深度 = 6:262.144 个立方体
  • 深度 = 7:2.097.152 个立方体
  • ...

如果我们正在搜索一个特定的框,我们的最大检查数量永远不会超过 8 x Depth 元素.因此,我们将算法性能从 O(n) 提高到 O(log(n)).1

And our maximum number of checks is never more than 8 x Depth elements if we are searching for one specific box. So we increased our algorithm performance from O(n) to O(log(n)). 1

首先:您正在使用 C# - 一种面向对象的语言.所以使用对象!如果您将所有内容都封装在一个对象结构中,那么将分而治之原则应用于您的树结构非常简单.

First of all: You are working with C# - an object orientated language. So use objects! It is pretty simple to apply the divide and conquer principle to your tree structure if you are encapsulating everything within a object structure.

在您的具体情况下,这意味着:

In your specific case this means:

public class OctreeNode
{
    public bool IsLeaf { get; private set; }
    public OctreeNode[8] Children { get; private set; }

    public OctreeNode()
    {
        this.IsLeaf = true;
        this.Children = null;
    }

    // Don't forget to implement methods to build up the tree and split the node when required.
    // Splitting results in a tree structure. Inside the split-method 
    // you need to allocate the Children-Array and set the IsLeaf-Property to false.

    public bool Intersects(Line rayline, ref ICollection<OctreeNodes> Nodes)
    {
        Interval interval;

        // If the current node does not intersect the ray, then we do not need to
        // investigate further on from here.
        if (!Rhino.Geometry.Intersect.Intersection.LineBox(rayline, this.GetBoundingBox(), 0, out interval))
        {
            return false;
        }

        // If this is a leaf (in which we are interested in) we add it to 
        // the nodes-collection.
        if (this.IsLeaf)
        {
            Nodes.Add(this);
            return true;
        }

        // Not a leaf, but our box intersects with the ray, so we need to investigate further.

        for (int i = 0; i < 8; ++i)
        {
            // Recursive call here, but the result doesn't actually matter.
            this.Children[i].Intersects(rayline, Nodes)
        }

        // The ray intersects with our current node.
        return true;
    }
}

这将为您带来所有魔力!它只测试树直到测试失败的深度,并继续直到你拥有与光线相交的所有叶子.您还可以以谁获得对数交叉间隔"的方式对它们进行排序,以便在流式传输它们时将对象置于更高的优先级,但由于我现在完全失败了该主题,我将继续:

This will do all the magic for you! It tests the tree only until the depth where the test fails and continues until you have all leafs that the ray intersects with. You can also sort them in a way on "who got the logest intersection interval", to bring the objects inside into a higher priority when streaming them, but since I am totally failing the topic now I will continue:

您甚至可以通过应用并行性来进一步加速此算法.这很容易使用 TPL,这里很简单:

You can even further speed up this algorithm by applying parallelism. This is pretty easy using TPL, which is pretty simple in here:

Parallel.For<int> (0; 8; i =>
{
    this.Children[i].Intersects(rayline, Nodes)
});

好的,我想暂时就这些.我希望这对你和周围的更多人有帮助.:)

Okay, that's enought for the moment, I think. I hope this helps you and some more people around there. :)

注意:我对 rhino3d 不是很熟悉.也许有内置功能可以帮助您更好地解决问题!

Note: I am not very familiar with rhino3d. Maybe there is built-in functionality that can help you solving your problem even better!

注意 2: 原谅我,当我不是 100% 正确时,我已经有一段时间没有做这些理论考虑了...

Note 2: Forgive me, when I am not 100% correct on all points, I haven't done those theoretical considerations for a while...

1 在最好的情况下,我们在一个完全平衡的树中寻找一个特定的叶子.

1 In best case, where we are searching for one specific leaf in a completely balanced tree.

这篇关于递归方法比交互方法慢 10 倍的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆