实现用于检测自相交多边形的蛮力算法 [英] Implementing a brute force algorithm for detecting a self-intersecting polygon

查看：33 发布时间：2021/11/17 1:35:47 c# algorithm arcgis shapefile polygons

本文介绍了实现用于检测自相交多边形的蛮力算法的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我最初实现了 Hoey-Shamos 算法，但是它对于未来的可维护性来说太复杂了(我对此没有发言权)，并且它没有正确报告，所以我将要使用优化的蛮力算法使用.

I initially implemented the Hoey-Shamos algorithm, however it is too complex for future maintainability (I have no say in this), and it wasn't reporting correctly, so an optimized brute force algorithm is what I'm going to use.

我的问题是:如何优化此代码以使其可用?

My question is: How can I optimize this code to be usable?

就目前而言，我的代码包含一个嵌套的 for 循环，将同一个列表迭代两次.

As it stands, my code contains a nested for loop, iterating the same list twice.

将行转换为 HashSet 并使用两个 foreach 循环......扫描 10,000 条代码大约需要 45 秒.还是不够.

Turned lines into a HashSet and used two foreach loops... shaved about 45 seconds off scanning 10,000. It's still not enough.

foreach (Line2D g in lines)
{                   
foreach (Line2D h in lines)
{
    if (g.intersectsLine(h))
    {
        return false;
    }
}                  

 } // end 'lines' for each loop

如果我强制我的intersectsLine()"方法返回 false(用于测试目的)，扫描 10,000 条记录(我有 700,000 条记录)仍然需要 8 秒.太长了，需要优化一下这段代码.

If I force my "intersectsLine()" method to return false (for testing purposes) it still takes 8 seconds to scan 10,000 records (I have 700,000 records). This is too long, so I need to optimize this piece of code.

在与所有其他行进行比较后，我尝试从列表中删除行，但是存在准确性问题(不知道为什么)并且速度增加几乎不明显.

I tried removing lines from the List after it's been compared to all the other lines, however there's an accuracy issue (no idea why) and the speed increase is barely noticeable.

这是我的 intersectsLine 方法.我在 here 找到了另一种方法，但它看起来像所有方法调用和诸如此类的东西都会变慢.计算斜率在我看来并不需要太多计算(如果我错了，请纠正我?)

Here is my intersectsLine method. I found an alternate approach here but it looks like it'd be slower with all the method calls and whatnot. Calculating the slope doesn't seem to me like it'd take too much computing (Correct me if I'm wrong?)

public bool intersectsLine(Line2D comparedLine)
{

//tweakLine(comparedLine);
if (this.Equals(comparedLine) ||
    P2.Equals(comparedLine.P1) ||
    P1.Equals(comparedLine.P2))
{
    return false;
}

double firstLineSlopeX, firstLineSlopeY, secondLineSlopeX, secondLineSlopeY;

firstLineSlopeX = X2 - X1;
firstLineSlopeY = Y2 - Y1;

secondLineSlopeX = comparedLine.X2 - comparedLine.X1;
secondLineSlopeY = comparedLine.Y2 - comparedLine.Y1;

double s, t;
s = (-firstLineSlopeY * (X1 - comparedLine.X1) + firstLineSlopeX * (Y1 - comparedLine.Y1)) / (-secondLineSlopeX * firstLineSlopeY + firstLineSlopeX * secondLineSlopeY);
t = (secondLineSlopeX * (Y1 - comparedLine.Y1) - secondLineSlopeY * (X1 - comparedLine.X1)) / (-secondLineSlopeX * firstLineSlopeY + firstLineSlopeX * secondLineSlopeY);

if (s >= 0 && s <= 1 && t >= 0 && t <= 1)
{
    //console.WriteLine("s = {0}, t = {1}", s, t);
    //console.WriteLine("this: " + this);
    //console.WriteLine("other: " + comparedLine);
    return true;
}

return false; // No collision */

}

主要瓶颈是大多边形！我排除了超过 100 行的运行多边形，它在 5:10 中运行了所有 700,000 多个多边形.这在可接受的范围内！在运行此代码之前，肯定有一种方法可以查看多边形是否值得计算?如果有帮助，我可以访问 XMin、Xmax、YMin 和 YMax 属性吗?

The major bottleneck is the big polygons! I excluded running polygons with more than 100 lines, and it ran all 700,000+ polygons in 5:10. Which is in the acceptable range! Surely there's a way to see if a polygon is worth calculating before running this code? I have access to the XMin, Xmax, YMin, and YMax properties if that helps?

进行另一项测试，将每个多边形限制在 1000 行以下.花了一个多小时.

Ran another test, limiting polygons to under 1000 lines each. It took a little over an hour.

我删除了多边形的所有限制，现在它已经运行了 36 小时……仍然没有结果.

I removed all limiting of polygons, and it's been running for 36 hours now... still no results.

我有几个想法:

- 当我生成我的行哈希集时，有另一个哈希集/列表，其中包含交叉候选.如果有交叉的机会，我只会在此列表中添加行.但是我如何消除/增加可能性?如果只有 3 条线可能与其他线相交，我会将 4000 条线与 3 条线进行比较，而不是 4000 条线.仅此一项就可以产生巨大的差异.

-When I generate my lines hashset, have another hashset/List that has candidates for intersection. I would only add lines to this list if there's a chance for intersection. But how do I eliminate/add possibilities? If there's only three lines that could possibly intersect with others, I'd be comparing 4000 lines against 3 instead of 4000. This alone could make a HUGE difference.

-如果同一个点出现两次，除了第一个和最后一个点，省略运行嵌套的 for 循环.

-If the same point occurs twice, except the first and last point, omit running the nested for loop.

关于多边形的信息:总共 700,000 个

Information about the polygons: 700,000 total

具有 1,000 或更多点的四千多个多边形

There is over four thousand polygons with 1,000 or more points

有 2 个多边形具有 70,000 或更多点

There is 2 polygons with 70,000 or more points

我认为可以通过一点创意将这段时间缩短到 15 分钟左右.

I think it's possible to get this down to fifteen or so minutes with a bit of creativity.

推荐答案

您当前的 Brute-Force 算法是 O(n^2).仅对于您的两个 70,000 线多边形来说，这是近 100 亿次操作的某个因素，更不用说其他 700,000 个多边形了.显然，仅仅代码优化是不够的，因此您需要某种算法优化来降低 O(n^2) 而不会过于复杂.

Your current Brute-Force algorithm is O(n^2). For just your two 70,000 line polygons that's some factor of almost 10 Billion operations, to say nothing of the 700,000 other polygons. Obviously, no amount of mere code optimization is going to be enough, so you need some kind algorithmic optimization that can lower that O(n^2) without being unduly complicated.

O(n^2) 来自蛮力算法中的嵌套循环，每个循环都以 n 为边界，使其成为 O(n*n).改善这一点的最简单方法是找到某种方法来减少内部循环，使其不受n 的约束或依赖.因此，我们需要找到某种方法来对内部行列表进行排序或重新排序以检查外部行，以便只需要扫描完整列表的一部分.

The O(n^2) comes from the nested loops in the brute-force algorithm that are each bounded by n, making it O(n*n). The simplest way to improve this would be to find some way to reduce the inner loop so that it is not bound by or dependent on n. So what we need to find is some way to order or re-order the inner list of lines to check the outer line against so that only a part of the full list needs to be scanned.

我采用的方法利用了这样一个事实:如果两条线段相交，那么它们的 X 值的范围必须相互重叠.请注意，这并不意味着它们确实相交，但如果它们的 X 范围不重叠，则它们不能相交，因此无需检查它们彼此.(Y 值范围也是如此，但您一次只能利用一个维度).

The approach that I am taking takes advantage of the fact that if two line segments intersect, then the range of their X values must overlap each other. Mind you, this doesn't mean that they do intersect, but if their X ranges don't overlap, then they cannot be intersecting so theres no need to check them against each other. (this is true of the Y value ranges also, but you can only leverage one dimension at a time).

这种方法的优点是这些 X 范围可用于对线的端点进行排序，而这些端点又可用作检查线是否相交的起点和终点.

The advantage of this approach is that these X ranges can be used to order the endpoints of the lines which can in turn be used as the starting and stopping points for which lines to check against for intersection.

因此，我们具体要做的是定义一个自定义类 (endpointEntry)，用于表示线的两个点的高或低 X 值.这些端点都被放入同一个 List 结构中，然后根据它们的 X 值进行排序.

So specifically what we do is to define a custom class (endpointEntry) that represents the High or Low X values of the line's two points. These endpoints are all put into the same List structure and then sorted based on their X values.

然后我们实现一个外循环，在那里我们扫描整个列表，就像在蛮力算法中一样.然而，我们的内部循环有很大不同.我们不是重新扫描整个列表的行以检查交叉点，而是开始从外部循环的行的高 X 值端点开始向下扫描排序的端点列表，并在我们通过它时结束它同一行的低 X 值端点.这样，我们只检查 X 值范围与外循环的行重叠的行.

Then we implement an outer loop where we scan the entire list just as in the brute-force algorithm. However our inner loop is considerably different. Instead of re-scanning the entire list for lines to check for intersection, we rather start scanning down the sorted endpoint list from the high X value endpoint of the outer loop's line and end it when we pass below that same line's low X value endpoint. In this way, we only check the lines whose range of X values overlap the outer loop's line.

好的，这是一个演示我的方法的 c# 类:

OK, here's a c# class demonstrating my approach:

class CheckPolygon2
{
    // internal supporting classes
    class endpointEntry
    {
        public double XValue;
        public bool isHi;
        public Line2D line;
        public double hi;
        public double lo;
        public endpointEntry fLink;
        public endpointEntry bLink;
    }
    class endpointSorter : IComparer<endpointEntry>
    {
        public int Compare(endpointEntry c1, endpointEntry c2)
        {
            // sort values on XValue, descending
            if (c1.XValue > c2.XValue) { return -1; }
            else if (c1.XValue < c2.XValue) { return 1; }
            else // must be equal, make sure hi's sort before lo's
                if (c1.isHi && !c2.isHi) { return -1; }
                else if (!c1.isHi && c2.isHi) { return 1; }
                else { return 0; }
        }
    }

    public bool CheckForCrossing(List<Line2D> lines)
    {
        List<endpointEntry> pts = new List<endpointEntry>(2 * lines.Count);

        // Make endpoint objects from the lines so that we can sort all of the
        // lines endpoints.
        foreach (Line2D lin in lines)
        {
            // make the endpoint objects for this line
            endpointEntry hi, lo;
            if (lin.P1.X < lin.P2.X)
            {
                hi = new endpointEntry() { XValue = lin.P2.X, isHi = true, line = lin, hi = lin.P2.X, lo = lin.P1.X };
                lo = new endpointEntry() { XValue = lin.P1.X, isHi = false, line = lin, hi = lin.P1.X, lo = lin.P2.X };
            }
            else
            {
                hi = new endpointEntry() { XValue = lin.P1.X, isHi = true, line = lin, hi = lin.P1.X, lo = lin.P2.X };
                lo = new endpointEntry() { XValue = lin.P2.X, isHi = false, line = lin, hi = lin.P2.X, lo = lin.P1.X };
            }
            // add them to the sort-list
            pts.Add(hi);
            pts.Add(lo);
        }

        // sort the list
        pts.Sort(new endpointSorter());

        // sort the endpoint forward and backward links
        endpointEntry prev = null;
        foreach (endpointEntry pt in pts)
        {
            if (prev != null)
            {
                pt.bLink = prev;
                prev.fLink = pt;
            }
            prev = pt;
        }

        // NOW, we are ready to look for intersecting lines
        foreach (endpointEntry pt in pts)
        {
            // for every Hi endpoint ...
            if (pt.isHi)
            {
                // check every other line whose X-range is either wholly 
                // contained within our own, or that overlaps the high 
                // part of ours.  The other two cases of overlap (overlaps 
                // our low end, or wholly contains us) is covered by hi 
                // points above that scan down to check us.

                // scan down for each lo-endpoint below us checking each's 
                // line for intersection until we pass our own lo-X value
                for (endpointEntry pLo = pt.fLink; (pLo != null) && (pLo.XValue >= pt.lo); pLo = pLo.fLink)
                {
                    // is this a lo-endpoint?
                    if (!pLo.isHi)
                    {
                        // check its line for intersection
                        if (pt.line.intersectsLine(pLo.line))
                            return true;
                    }
                }
            }
        }

        return false;
    }
}

我不确定这个算法的真正执行复杂度是多少，但我怀疑对于大多数非病理多边形，它会接近 O(n*SQRT(n))，这应该足够快.

I am not certain what the true execution complexity of this algorithm is, but I suspect that for most non-pathological polygons it will be close to O(n*SQRT(n)) which should be fast enough.

内循环只是按照与外循环相同的排序顺序扫描端点列表.但它会开始从外循环当前在列表中的位置开始扫描(这是某行的hiX点)，并且只会扫描直到endPoint值低于相应的值同一行的 loX 值.

The Inner Loop simply scans the endPoints list in the same sorted order as the outer loop. But it will start scanning from where the outer loop from where the outer loop currently is in the list (which is the hiX point of some line), and will only scan until the endPoint values drop below the corresponding loX value of that same line.

这里的棘手之处在于，这不能用枚举器(外循环的 foreach(..in pts))来完成，因为无法枚举列表的子列表，也无法根据另一个枚举的当前位置开始枚举.因此，我所做的是使用 Forward 和 Backward Links(fLink 和 bLink)属性来创建一个双向链表结构，该结构保留列表的排序顺序，但我可以在不枚举列表的情况下进行增量扫描:

What's tricky here, is that this cannot be done with an Enumerator (the foreach(..in pts) of the outer loop) because there's no way to enumerate a sublist of a list, nor to start the enumeration based on another enumerations current position. So instead what I did was to use the Forward and Backward Links (fLink and bLink) properties to make a doubly-linked list structure that retains the sorted order of the list, but that I can incrementally scan without enumerating the list:

for (endpointEntry pLo = pt.fLink; (pLo != null) && (pLo.XValue >= pt.lo); pLo = pLo.fLink)

分解一下，旧式 for 循环说明符包含三个部分:初始化、条件和增量-减量.初始化表达式 endpointEntry pLo = pt.fLink; 用列表中当前点的前向链接初始化 pLo.也就是说，列表中的下一个点，按降序排列.

Breaking this down, the old-style for loop specifier has three parts: initialization, condition, and increment-decrement. The initialization expression, endpointEntry pLo = pt.fLink; initializes pLo with the forward Link of the current point in the list. That is, the next point in the list, in descending sorted order.

然后内循环的主体被执行.然后应用增量减量 pLo = pLo.fLink，它只是使用它的前向链接(pLo)将内循环的当前点(pLo)设置到下一个较低点code>pLo.fLink)，从而推进循环.

Then the body of the inner loop gets executed. Then the increment-decrement pLo = pLo.fLink gets applied, which simply sets the inner loop's current point (pLo) to the next lower point using it's forward-link (pLo.fLink), thus advancing the loop.

最后，它在测试循环条件后循环(pLo != null) &&(pLo.XValue >= pt.lo) 只要新点不为空(这意味着我们在列表的末尾)并且只要新点的 循环>XValue 仍然大于等于外循环当前点的低X值.第二个条件确保内循环只查看与外循环端点的线重叠的线.

Finally, it loops after testing the loop condition (pLo != null) && (pLo.XValue >= pt.lo) which loops so long as the new point isn't null (which would mean that we were at the end of the list) and so long as the new point's XValue is still greater than or equal to the low X value of the outer loop's current point. This second condition insures that the inner loop only looks at lines that are overlapping the line of the outer loop's endpoint.

现在对我来说更清楚的是，我可能可以通过将端点列表视为数组来解决整个 fLink-bLink 的笨拙之处:

What is clearer to me now, is that I probably could have gotten around this whole fLink-bLink clumsiness by treating the endPoints List as an array instead:

用端点填充列表
按 XValue 对列表进行排序
使用索引迭代器 (i)，而不是 foreach 枚举器，以降序对列表进行外循环
(A) 内部循环遍历列表，使用第二个迭代器 (j)，从 i 开始，在它低于 pt.Lo<时结束/code>.



Fill up the list with endPoints
Sort the List by XValue
Outer Loop through the list in descending order, using an index iterator (i), instead of a foreach enumerator
(A) Inner Loop through the list, using a second iterator (j), starting at i and ending when it passed below pt.Lo.

我认为那会简单得多.如果你愿意，我可以发布一个类似的简化版本.
That I think would be much simpler.  I can post a simplified version like that, if you want.

                        这篇关于实现用于检测自相交多边形的蛮力算法的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！


                    
                        查看全文

实现用于检测自相交多边形的蛮力算法 [英] Implementing a brute force algorithm for detecting a self-intersecting polygon

问题描述

推荐答案

相关文章

C#/.NET最新文章

热门教程

热门工具

登录关闭

实现用于检测自相交多边形的蛮力算法 [英] Implementing a brute force algorithm for detecting a self-intersecting polygon

问题描述

推荐答案

相关文章

C#/.NET最新文章

热门教程

热门工具

登录 关闭

登录关闭