在二维数组中创建相似元素集 [英] Creating sets of similar elements in a 2D array

查看:16
本文介绍了在二维数组中创建相似元素集的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试解决基于二维数组的问题.该数组包含不同种类的元素(共有 3 种可能的种类).让我们假设类型为 X、Y、Z.

I am trying to solve a problem that is based on a 2D array. This array contains different kinds of elements (from a total of 3 possible kinds). Lets assume the kind as X, Y, Z.

数组看起来是这样的.请注意,它总是会被完全填满.该图用于说明.

The array appears to be something like this. Note that it would always be completely filled. The diagram is for illustration.

7 | | | | | | |
6 | | | | | | |
5 | | | | | | |
4 | |X|Z|Y|X| |
3 | |Y|X|Y|Y|X|
2 |Y|Y|X|Z|Z|X|
1 |X|X|Y| |X|X|
0 | | | |Z| | |
   0 1 2 3 4 5

我正在尝试创建彼此相邻放置的元素集.例如,set1 可能包含位于以下位置的 X 类型元素:(0,1)、(1,1)、(2,2)、(2,3)、(1,4).类似地,set2 可能包含位于以下位置的 Y 类型元素:(3,4), (3,3), 4,3).

I am trying to create sets of elements that are placed adjacent to each other. For example, set1 may comprise of elements of type X located at: (0,1), (1,1), (2,2), (2,3), (1,4). Similarly, set2 may comprise of elements of type Y located at: (3,4), (3,3), 4,3).

问题:给定数组中的任何点,它必须能够将所有元素添加到适当的集合中,并确保没有两个集合包含相同的元素.请注意,仅当遇到超过 2 个相同类型的相邻元素时才会创建集合.

Problem: Given any point in the array, it must be capable of adding all elements to the appropriate set and ensuring that there are no two sets that contain the same element. Note that a set is only created if more than 2 adjacent elements of the same kind are encountered.

此外,如果某个元素的子集被移除,则会添加更多元素来替换被移除的元素.然后必须重新迭代该数组以创建新集合或修改现有集合.

Moreover, if a certain subset of elements is removed, more elements are added to replace the removed ones. The array must then be re-iterated over to make new sets or modify the existing ones.

解决方案:我实现了一个递归解决方案,它会遍历所有相邻元素,例如元素 X (0,1).然后,在遍历 8 个可能的相邻元素时,它会在出现类型 X 时递归调用自身.

Solution: I implemented a recursive solution such that it would iterate over all the adjacent elements of, for example, element X (0,1). Then, while iterating over the 8 possible adjacent elements, it would call itself recursively whenever a type X occurred.

这种解决方案过于暴力且效率低下,尤其是在某些元素被可能不同类型的新元素替换的情况下.在这种情况下,几乎整个数组都必须重新迭代以创建/修改集合,并确保在多个集合中不存在相同的元素.

This kind of solution is too much brute-force and inefficient, especially in the case where some elements are replaced with new ones of possibly different types. In such a case, almost the whole array has to be re-iterated to make/modify sets and ensuring that no same element exists in more than one set.

是否有任何算法可以有效地处理此类问题?我需要一些想法/建议或伪代码的帮助.

Is there any algorithm to deal efficiently with this kind of problem? I need help with some ideas/suggestions or pseudo codes.

推荐答案

在下文中,连接组件"是指通过仅允许在具有相同类型元素的相邻位置之间水平、垂直或对角线移动的路径彼此可到达的所有位置的集合.例如.您的示例 {(0,1), (1,1), (2,2), (2,3), (1,4)} 是示例输入中的连接组件.每个位置恰好属于一个连通分量.

In the following, by "connected component" I mean the set of all positions that are reachable from each other by a path that allows only horizontal, vertical or diagonal moves between neighbouring positions having the same kind of element. E.g. your example {(0,1), (1,1), (2,2), (2,3), (1,4)} is a connected component in your example input. Each position belongs to exactly one connected component.

我们将构建一个 union/find 数据结构,用于为每个position (x, y) 一个数字标签",具有以下属性:当且仅当任意两个位置 (x, y) 和 (x', y') 属于同一组件时,它们具有相同的标签.特别是这个数据结构支持三种操作:

We will build a union/find data structure that will be used to give every position (x, y) a numeric "label" having the property that if and only if any two positions (x, y) and (x', y') belong to the same component then they have the same label. In particular this data structure supports three operations:

  • set(x, y, i) 将位置 (x, y) 的标签设置为 i.
  • find(x, y) 将返回分配给位置 (x, y) 的标签.
  • union(Z),对于某些标签集 Z,会将 Z 中的所有标签组合成单个标签 k,在未来调用 find(x, y) 的意义上之前在 Z 中具有标签的任何位置 (x, y) 上的 现在将返回 k.(通常 k 将是 Z 中已经存在的标签之一,尽管这实际上并不重要.)union(Z) 还返回新的主"标签 k.
  • set(x, y, i) will set the label for position (x, y) to i.
  • find(x, y) will return the label assigned to the position (x, y).
  • union(Z), for some set of labels Z, will combine all labels in Z into a single label k, in the sense that future calls to find(x, y) on any position (x, y) that previously had a label in Z will now return k. (In general k will be one of the labels already in Z, though this is not actually important.) union(Z) also returns the new "master" label, k.

如果总共有 n = 宽度 * 高度位置,这可以在 O(n*a(n)) 时间内完成,其中 a() 是增长极其缓慢的阿克曼反函数.对于所有实际输入大小,这与 O(n) 相同.

If there are n = width * height positions in total, this can be done in O(n*a(n)) time, where a() is the extremely slow-growing inverse Ackermann function. For all practical input sizes, this is the same as O(n).

请注意,只要两个顶点彼此相邻,就有四种可能的情况:

Notice that whenever two vertices are adjacent to each other, there are four possible cases:

  1. 一个在另一个之上(由垂直边缘连接)
  2. 一个在另一个的左侧(由水平边缘连接)
  3. 一个在另一个的上方和左侧(由 对角线连接)
  4. 一个在另一个的上方和右侧(由 / 对角线连接)
  1. One is above the other (connected by a vertical edge)
  2. One is to the left of the other (connected by a horizontal edge)
  3. One is above and to the left of the other (connected by a diagonal edge)
  4. One is above and to the right of the other (connected by a / diagonal edge)

我们可以使用以下 pass 来确定每个位置 (x, y) 的标签:

We can use the following pass to determine labels for each position (x, y):

  • 将 nextLabel 设置为 0.
  • 对于每行 y,按升序排列:
    • 对于按升序排列的每一列 x:
      • 检查 (x, y) 的 W、NW、N 和 NE 邻居.令 Z 为这 4 个与 (x, y) 同类的邻居的子集.
      • 如果 Z 是空集,那么我们假设 (x, y) 开始一个全新的组件,因此调用 set(x, y, nextLabel) 并递增 nextLabel.
      • 否则,对 Z 的每个元素调用 find(Z[i]) 以找到它们的标签,并在这组标签上调用 union() 将它们组合在一起.将新标签(此 union() 调用的结果)分配给 k,然后还调用 set(x, y, k) 将 (x, y) 添加到此组件.

      在此之后,在任意位置 (x, y) 调用 find(x, y) 可以有效地告诉您它属于哪个组件.如果您希望能够快速回答哪些位置属于包含位置 (x, y) 的连通分量?"形式的查询?然后创建列表 posInComp 的哈希表,并对输入数组进行第二次传递,将每个 (x, y) 附加到列表 posInComp[find(x, y)].这一切都可以在线性时间和空间内完成.现在要回答某个给定位置 (x, y) 的查询,只需调用 lab = find(x, y) 来查找该位置的标签,然后在 posInComp[lab].

      After this, calling find(x, y) on any position (x, y) effectively tells you which component it belongs to. If you want to be able to quickly answer queries of the form "Which positions belong to the connected component containing position (x, y)?" then create a hashtable of lists posInComp and make a second pass over the input array, appending each (x, y) to the list posInComp[find(x, y)]. This can all be done in linear time and space. Now to answer a query for some given position (x, y), simply call lab = find(x, y) to find that position's label, and then list the positions in posInComp[lab].

      要处理太小"的组件,只需查看 posInComp[lab] 的大小即可.如果是 1 或 2,则 (x, y) 不属于任何足够大"的组件.

      To deal with "too-small" components, just look at the size of posInComp[lab]. If it's 1 or 2, then (x, y) does not belong to any "large-enough" component.

      最后,所有这些工作实际上都需要线性时间,所以除非你的输入数组很大,否则它会快如闪电.所以修改输入数组后从头开始重新计算是完全合理的.

      Finally, all this work effectively takes linear time, so it will be lightning fast unless your input array is huge. So it's perfectly reasonable to recompute it from scratch after modifying the input array.

      这篇关于在二维数组中创建相似元素集的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆