将数字数组按接近度划分为集合 [英] Partition an array of numbers into sets by proximity

查看:92
本文介绍了将数字数组按接近度划分为集合的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

假设我们有一个像这样的数组

Let's say we have an array like

[37, 20, 16, 8, 5, 5, 3, 0]

我可以使用什么算法来指定分区数并将数组划分为

What algorithm can I use so that I can specify the number of partitions and have the array broken into them.

对于2个分区,应该是

[37] and [20, 16, 8, 5, 5, 3, 0]

对于3,应该是

[37],[20, 16] and [8, 5, 5, 3, 0]

我能够通过简单地减去带有左右数字的元素来按接近度分解它们,但这并不能确保正确的分区数量。
有什么想法吗?

I am able to break them down by proximity by simply subtracting the element with right and left numbers but that doesn't ensure the correct number of partitions. Any ideas?

我的代码是红宝石,但是任何语言/算法/伪代码都足够。

My code is in ruby but any language/algo/pseudo-code will suffice.

这是Vikram算法的红宝石代码

Here's the ruby code by Vikram's algorithm

def partition(arr,clusters)

# Return same array if clusters are less than zero or more than array size
return arr if (clusters >= arr.size) || (clusters < 0) 

edges = {}

# Get weights of edges
arr.each_with_index do |a,i|
  break if i == (arr.length-1)
  edges[i] = a - arr[i+1]
end

# Sort edge weights in ascending order
sorted_edges =  edges.sort_by{|k,v| v}.collect{|k| k.first}

# Maintain counter for joins happening. 

prev_edge = arr.size+1
joins = 0


sorted_edges.each do |edge|
    # If join is on right of previous, subtract the number of previous joins that happened on left
    if (edge > prev_edge)

        edge -= joins   
    end
            joins += 1
    # Join the elements on the sides of edge. 
    arr[edge] = arr[edge,2].flatten
    arr.delete_at(edge+1)

    prev_edge = edge

    # Get out when right clusters are done
    break if arr.size == clusters
end
end


推荐答案

我认为,使用kruskal算法使用k聚类可以解决您的问题。使用Kruskal算法查找聚类,以使它们之间具有最大间距。

I think your problem can be solved using k-clustering using kruskal's algorithm . Kruskal algorithm is used to find the clusters such that there is maximum spacing between them.

算法:-

从您的数据集构造路径图,如下所示:-

Construct path graph from your data set like following : -


[37,20,16,16,8,5,5,3,0]

[37, 20, 16, 8, 5, 5, 3, 0]

路径图:-0-> 1-> 2-> 3-> 4-> 5-> 6-> 7

path graph: - 0 -> 1 -> 2 -> 3 -> 4 -> 5 -> 6 -> 7

然后每个边缘的权重将是它们的值之间的差异

then weight for each edge will be difference between their values

  edge(0,1) = abs(37-20) = 17
  edge(1,2) = abs(20-16) = 4
  edge(2,3) = abs(16-8) = 8
  edge(3,4) = abs(8-5) = 3
  edge(4,5) = abs(5-5) = 0
  edge(5,6) = abs(5-3) = 2
  edge(6,7) = abs(3-0) = 3


在此图上使用kruskal,直到仅剩k个簇为止:-

Use kruskal on this graph till there are only k clusters remaining : -


Sort the edges first according to weights in ascending order:-

(4,5),( 5,6),(6,7),(3,4),(1,2),(2,3),(0,1)

(4,5),(5,6),(6,7),(3,4),(1,2),(2,3),(0,1)

在其上使用krushkal可以找到准确的k = 3个簇:-

Use krushkal on it find exactly k = 3 clusters : -

   iteration 1 :  join (4,5) clusters = 7 clusters: [37,20,16,8,(5,5),3,0]
   iteration 2 :  join (5,6) clusters = 6 clusters: [37,20,16,8,(5,5,3),0]
   iteration 3 :  join (6,7) clusters = 5 clusters: [37,20,16,8,(5,5,3,0)]
   iteration 4 :  join (3,4) clusters = 4 clusters: [37,20,16,(8,5,5,3,0)]
   iteration 5 :  join (1,2) clusters = 3 clusters: [37,(20,16),(8,5,5,3,0)]
   stop as clusters = 3

重新构造的解决方案:[(37),(20,16),(8,5,5,3,0)]是
u所需

reconstrusted solution : [(37), (20, 16), (8, 5, 5, 3, 0)] is what u desired

这篇关于将数字数组按接近度划分为集合的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆