二维数组在创建组相似的元素 [英] Creating sets of similar elements in a 2D array

查看:205
本文介绍了二维数组在创建组相似的元素的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我试图解决一个基于二维数组的一个问题。此数组包含不同种类的元素(从总共3种可能的)。让我们假设一种为X,Y,Z。

I am trying to solve a problem that is based on a 2D array. This array contains different kinds of elements (from a total of 3 possible kinds). Lets assume the kind as X, Y, Z.

该阵列看起来是这样的。需要注意的是它总是被完全填充。该图是用于说明

The array appears to be something like this. Note that it would always be completely filled. The diagram is for illustration.

7 | | | | | | |
6 | | | | | | |
5 | | | | | | |
4 | |X|Z|Y|X| |
3 | |Y|X|Y|Y|X|
2 |Y|Y|X|Z|Z|X|
1 |X|X|Y| |X|X|
0 | | | |Z| | |
   0 1 2 3 4 5

我试图创建集被放置彼此相邻元素。例如,设置1可以包括X型的元件位于的:(0,1),(1,1),(2,2),(2,3),(1,4)。同样,设置2可包括Y型的元件位于的:(3,4),(3,3),4,3)

I am trying to create sets of elements that are placed adjacent to each other. For example, set1 may comprise of elements of type X located at: (0,1), (1,1), (2,2), (2,3), (1,4). Similarly, set2 may comprise of elements of type Y located at: (3,4), (3,3), 4,3).

问题:考虑到阵列中的任何一点上,它必须能够将所有元素的适当设置,并确保有包含相同的元素没有两套。请注意,如果遇到超过2个相邻的相同类型的元素是仅创建了一组

Problem: Given any point in the array, it must be capable of adding all elements to the appropriate set and ensuring that there are no two sets that contain the same element. Note that a set is only created if more than 2 adjacent elements of the same kind are encountered.

此外,如果在一定的子集的元素被除去,加入更多的元素,以取代除去的。该数组必须被重新迭代做出新设置或修改现有的。

Moreover, if a certain subset of elements is removed, more elements are added to replace the removed ones. The array must then be re-iterated over to make new sets or modify the existing ones.

解决方法:我实现了一个递归的解决方案,例如,它会遍历过所有相邻的元素,例如,元素X(0,1)。然后,在遍历的8种可能的相邻的元素,它会递归调用自身每当一个X型的发生。

Solution: I implemented a recursive solution such that it would iterate over all the adjacent elements of, for example, element X (0,1). Then, while iterating over the 8 possible adjacent elements, it would call itself recursively whenever a type X occurred.

这类型的解决方案是太多蛮力和效率不高,尤其是在一些元素被替换可能具有不同类型的新情况。在这样的情况下,几乎整个阵列必须被重新迭代到使/修改集和确保没有相同的元素在多于一组的存在。

This kind of solution is too much brute-force and inefficient, especially in the case where some elements are replaced with new ones of possibly different types. In such a case, almost the whole array has to be re-iterated to make/modify sets and ensuring that no same element exists in more than one set.

有没有什么算法有这样那样的问题有效地处理?我需要一些想法/建议或伪codeS帮助。

Is there any algorithm to deal efficiently with this kind of problem? I need help with some ideas/suggestions or pseudo codes.

推荐答案

在下文中,通过连接组件我说的是一套彼此可到达,使具有相同类型的元素的相邻位置之间只有水平,垂直或对角线移动路径的所有位置。例如。您的例子 {(0,1),(1,1),(2,2),(2,3),(1,4)} 是一个连接组件在你的例子输入。每个位置属于一个连接的组件。

In the following, by "connected component" I mean the set of all positions that are reachable from each other by a path that allows only horizontal, vertical or diagonal moves between neighbouring positions having the same kind of element. E.g. your example {(0,1), (1,1), (2,2), (2,3), (1,4)} is a connected component in your example input. Each position belongs to exactly one connected component.

我们将建立一个联盟/查找数据结构将用于给每位置(x,y)的一个数字标记具有这样的性质:当且仅当任何两个位置(X,Y)和(x',y')的属于相同的组件那么他们有相同的标号。特别是这种数据结构支持三种操作:

We will build a union/find data structure that will be used to give every position (x, y) a numeric "label" having the property that if and only if any two positions (x, y) and (x', y') belong to the same component then they have the same label. In particular this data structure supports three operations:

  • 设置(X,Y,I)将标签的位置(X,Y)设置为我。
  • 找到(X,Y)返回分配到位置(X,Y)的标签。
  • 工会(Z),对一些组标签Z,将结合所有标签Z中到一个单一的标签K,在某种意义上说,将来调用找到(X,Y)上的任何位置(X,Y)是previously了Z轴的标签现在复位K。 (在广义k将已经在Z上的标签之一,虽然这不是真正重要的。)工会(Z)也返回了新的主人的标签,K。
  • set(x, y, i) will set the label for position (x, y) to i.
  • find(x, y) will return the label assigned to the position (x, y).
  • union(Z), for some set of labels Z, will combine all labels in Z into a single label k, in the sense that future calls to find(x, y) on any position (x, y) that previously had a label in Z will now return k. (In general k will be one of the labels already in Z, though this is not actually important.) union(Z) also returns the new "master" label, k.

如果有n =宽*总高度位置,这可以为O完成(N * A(n))的时间,其中()是极其缓慢增长的反阿克曼函数。 对于所有的实际输入尺寸,这是一样的O(N)。

If there are n = width * height positions in total, this can be done in O(n*a(n)) time, where a() is the extremely slow-growing inverse Ackermann function. For all practical input sizes, this is the same as O(n).

注意,每当两个顶点是彼此相邻,有四种可能的情况:

Notice that whenever two vertices are adjacent to each other, there are four possible cases:

  1. 在一个高于其他的(由垂直边缘连接)
  2. 之一是另一个的左侧(由水平边连接)
  3. 一个是上面和其他的左侧(由 \ 对角边连接)
  4. 其一是上述和其他的权利(由 / 对角线边缘连接)
  1. One is above the other (connected by a vertical edge)
  2. One is to the left of the other (connected by a horizontal edge)
  3. One is above and to the left of the other (connected by a \ diagonal edge)
  4. One is above and to the right of the other (connected by a / diagonal edge)

我们可以使用下面的通,以确定每个位置标签(X,Y):

We can use the following pass to determine labels for each position (x, y):

  • 设置nextLabel为0。
  • 对于每一个y行递增的顺序:
    • 对于每一个x列的递增顺序:
      • 检查的W,NW,N和NE的邻居(X,Y)。令z为这4个邻居是相同的种类为(x,y)的的所述子集。
      • 如果Z是空集,那么我们姑且假设(X,Y)将启动一个全新的组件,所以调用set(X,Y​​,nextLabel)和递增nextLabel。
      • 否则,调用找到(Z [I])z的每个元素上找到自己的标签,并呼吁工会()本组标签将它们结合在一起。分配新的标签(这个工会()调用的结果),以K,然后还要求集(X,Y,K)增加(X,Y),以该组件。
      • Set nextLabel to 0.
      • For each row y in increasing order:
        • For each column x in increasing order:
          • Examine the W, NW, N and NE neighbours of (x, y). Let Z be the subset of these 4 neighbours that are of the same kind as (x, y).
          • If Z is the empty set, then we tentatively suppose that (x, y) starts a brand new component, so call set(x, y, nextLabel) and increment nextLabel.
          • Otherwise, call find(Z[i]) on each element of Z to find their labels, and call union() on this set of labels to combine them together. Assign the new label (the result of this union() call) to k, and then also call set(x, y, k) to add (x, y) to this component.

          在此,呼吁找到(X,Y)上的任何位置(X,Y)有效地告诉你,它属于哪个组件。如果您希望能够快速回答形式的查询哪些岗位属于连接的成分含有位置(X,Y)?然后创建列表的一个哈希表 posInComp 键,进行第二次传过来的输入数组,追加每一个(X,Y)到列表 posInComp [找到( X,Y)] 。这可以在线性时间和空间都进行。现在来回答一些特定的位置(X,Y),只需拨打一个查询实验室=查找(X,Y)找到那个位置的标签,然后列表中的位置 posInComp [实验室]

          After this, calling find(x, y) on any position (x, y) effectively tells you which component it belongs to. If you want to be able to quickly answer queries of the form "Which positions belong to the connected component containing position (x, y)?" then create a hashtable of lists posInComp and make a second pass over the input array, appending each (x, y) to the list posInComp[find(x, y)]. This can all be done in linear time and space. Now to answer a query for some given position (x, y), simply call lab = find(x, y) to find that position's label, and then list the positions in posInComp[lab].

          要解决过小的组件,就看的大小posInComp [实验室] 。如果它是1或2,那么(X,Y)不属于任何足够大的组件。

          To deal with "too-small" components, just look at the size of posInComp[lab]. If it's 1 or 2, then (x, y) does not belong to any "large-enough" component.

          最后,所有这些工作有效地花费线性时间,所以这将是快如闪电,除非你输入数组是巨大的。所以这是完全合理的修改输入数组后从头开始重新计算了。

          Finally, all this work effectively takes linear time, so it will be lightning fast unless your input array is huge. So it's perfectly reasonable to recompute it from scratch after modifying the input array.

          这篇关于二维数组在创建组相似的元素的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆