算法:使用联合查找计算岛数 [英] Algorithm: use union find to count number of islands

查看:93
本文介绍了算法:使用联合查找计算岛数的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

假设您需要计算矩阵上的孤岛数量

Suppose you need to count the number of islands on a matrix

                    {1, 1, 0, 0, 0},
                    {0, 1, 0, 0, 1},
                    {1, 0, 0, 1, 1},
                    {0, 0, 0, 0, 0},
                    {1, 0, 1, 0, 1}

我们可以简单地使用DFS或BFS当输入矩阵的大小适合内存时。

We could simply use DFS or BFS when the input matrix size can be fitting into the memory.

但是,如果输入矩阵很大而又无法适合内存时该怎么办?

However, what do we do if the input matrix is really large which could not be fitting into the memory?

我可以将输入矩阵分块/分割成不同的小文件,然后分别读取它们。

I could chunk/split the input matrix into different small files and read them respectively.

但是如何合并

我被困在合并它们。我的想法是,合并它们时,我们必须阅读一些重叠的部分。但是,这样做的具体方法是什么?

I got stuck at how to merge them. I have the idea that when merging them we have to read some overlapped portion. But what is a concrete way to do so?

白板上的以下样品,并逐行处理。
合并然后合并顶部,似乎无法正常工作。

When I drew the below sample on the whiteboard and process it row by row. Merge left then merge top and it seems won't work.

从Matt的解决方案中。

From Matt's solution.

            int topidx = col * 2;
            int botidx = topidx + 1;

推荐答案

使用联合查找,基本算法(无需担心内存)是:

Using union-find, the basic algorithm (without worrying about memory) is:


  1. 为每个 1
  2. 创建一个集合
  3. 为每对相邻的 1合并集合 s。

容易,并且只需稍加注意,就可以使用对矩阵的顺序访问和仅2行的内存来做到这一点:

Easy, and with a little care, you can do this using sequential access to the matrix and only 2 rows worth of memory:


  1. 将岛计数初始化为0

  2. 读取第一行,为每个创建一个集合1 ,并合并相邻列中的集合。

  3. 每增加一行:

  1. Initialize the island count to 0
  2. Read the first row, create a set for each 1, and merge sets in adjacent columns.
  3. For each additional row:


  1. 读取该行,为每个 1 创建一个集合,然后合并相邻列中的集合;

  2. 合并设置在新行中,而在上一行中有相邻的设置。始终将链接指向下方,这样您就永远不会在新行中找到与旧行中的父级链接的集。

  3. 计算上一行中的剩余根集,并将数字添加到您的岛屿计数中。这些将永远无法与其他任何对象合并。

  4. 丢弃上一行中的所有集合-您再也不需要它们了,因为您已经计算了它们并且没有任何链接

  1. Read the row, create a set for each 1, and merge sets in adjacent columns;
  2. Merge sets in the new row with adjacent sets in the previous row. ALWAYS POINT THE LINKS DOWNWARD, so that you never end up with a set in the new row linked to a parent in the old row.
  3. Count the remaining root sets in the previous row, and add the number to your island count. These will never be able to merge with anything else.
  4. Discard all the sets in the previous row -- you're never going to need them again, because you already counted them and nothing links to them.


  • 最后,计数最后一行中的根集并将它们添加到岛计数中。

  • Finally, count the root sets in the last row and add them to your island count.

    当然,关键是总是在指向不同行的链接集时将链接指向下方。这不会损害算法的复杂性,而且如果您使用自己的联合查找,则很容易实现。如果您使用的是库数据结构,则可以仅对每一行使用它,并自己跟踪不同行中的根集之间的链接。

    The key to this, of course, is always pointing the links downward whenever you link sets in different rows. This will not hurt the complexity of the algorithm, and if you're using your own union-find, then it is easy to accomplish. If you're using a library data structure then you can use it just for each row, and keep track of the links between root sets in different rows yourself.

    实际上是我最喜欢的算法之一,这是Java的实现。这不是最易读的实现,因为它涉及一些低级技巧,但是它效率极高且简短-在性能非常重要的情况下,我会写这种东西:

    Since this is actually one of my favorite algorithms, here is an implementation in Java. This is not the most readable implementation since it involves some low-level tricks, but is super-efficient and short -- the kind of thing I'd write where performance is very important:

    import java.util.Arrays;
    
    public class Islands
    {
        private static final String[] matrix=new String[] {
            "  #############  ###   ",
            "  #      #####   ##    ",
            "  #  ##  ##   #   #    ",
            "    ###      ##   #  # ",
            "  #   #########  ## ## ",
            "          ##       ##  ",
            "          ##########   ",
        };
    
        // find with path compression.
        // If sets[s] < 0 then it is a link to ~sets[s].  Otherwise it is size of set
        static int find(int[] sets, int s)
        {
            int parent = ~sets[s];
            if (parent>=0)
            {
                int root = find(sets, parent);
                if (root != parent)
                {
                    sets[s] = ~root;
                }
                return root;
            }
            return s;
        }
    
        // union-by-size
        // If sets[s] < 0 then it is a link to ~sets[s].  Otherwise it is size of set
        static boolean union(int[] sets, int x, int y)
        {
            x = find(sets,x);
            y = find(sets,y);
            if (x!=y)
            {
                if ((sets[x] < sets[y]))
                {
                    sets[y] += sets[x];
                    sets[x] = ~y;
                }
                else
                {
                    sets[x] += sets[y];
                    sets[y] = ~x;
                }
                return true;
            }
            return false;
        }
    
        // Count islands in matrix
    
        public static void main(String[] args)
        {
            // two rows of union-find sets.
            // top row is at even indexes, bottom row is at odd indexes.  This arrangemnt is chosen just
            // to make resizing this array easier.
            // For each value x:
            // x==0 => no set. x>0 => root set of size x. x<0 => link to ~x
            int cols=4;
            int[] setrows= new int[cols*2];
    
            int islandCount = 0;
    
            for (String s : matrix)
            {
                System.out.println(s);
                //Make sure our rows are big enough
                if (s.length() > cols)
                {
                    cols=s.length();
                    if (setrows.length < cols*2)
                    {
                        int newlen = Math.max(cols,setrows.length)*2;
                        setrows = Arrays.copyOf(setrows, newlen);
                    }
                }
                //Create sets for land in bottom row, merging left
                for (int col=0; col<s.length(); ++col)
                {
                    if (!Character.isWhitespace(s.charAt(col)))
                    {
                        int idx = col*2+1;
                        setrows[idx]=1; //set of size 1
                        if (idx>=2 && setrows[idx-2]!=0)
                        {
                            union(setrows, idx, idx-2);
                        }
                    }
                }
                //merge up
                for (int col=0; col<cols; ++col)
                {
                    int topidx = col*2;
                    int botidx = topidx+1;
                    if (setrows[topidx]!=0 && setrows[botidx]!=0)
                    {
                        int toproot=find(setrows,topidx);
                        if ((toproot&1)!=0)
                        {
                            //top set is already linked down
                            union(setrows, toproot, botidx);
                        }
                        else
                        {
                            //link top root down.  It does not matter that we aren't counting its size, since
                            //we will shortly throw it aaway
                            setrows[toproot] = ~botidx;
                        }
                    }
                }
                //count root sets, discard top row, and move bottom row up while fixing links
                for (int col=0; col<cols; ++col)
                {
                    int topidx = col * 2;
                    int botidx = topidx + 1;
                    if (setrows[topidx]>0)
                    {
                        ++islandCount;
                    }
                    int v = setrows[botidx];
                    setrows[topidx] = (v>=0 ? v : v|1); //fix up link if necessary
                    setrows[botidx] = 0;
                }
            }
    
            //count remaining root sets in top row
            for (int col=0; col<cols; ++col)
            {
                if (setrows[col*2]>0)
                {
                    ++islandCount;
                }
            }
    
            System.out.println("\nThere are "+islandCount+" islands there");
        }
    
    }
    

    这篇关于算法:使用联合查找计算岛数的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

  • 查看全文
    登录 关闭
    扫码关注1秒登录
    发送“验证码”获取 | 15天全站免登陆