Dividing connected nodes into groups of three

This post describes an approach to dividing connected nodes into groups of three; it may be a useful reference for anyone facing a similar problem.

Problem description


I have a collection of objects (I'll call them nodes) that are connected to each other. Every node is connected to at least one other node, and the entire collection of nodes is one big blob (no outliers). The number of nodes is a multiple of three.


What I want to do is divide the nodes into groups of three connected nodes (with no overlapping between groups).


Every node has a unique ID. The nodes are stored in a document wherein every line is the ID of a node followed by the nodes it is connected to. It looks something like this:

30:240
40:210/240/230/60
50:80/220/230/70/210/190/200
60:240/40/230/220
70:90/200/50/220
80:50/170/190/210
90:200/70/180
100:110/160/190/170/200/180
110:100/180
160:170/100
170:80/160/190/100
180:90/200/110/100
190:50/80/200/170/100
200:90/70/100/50/190/180
210:50/80/40/230
220:50/230/70/60
230:50/210/60/40/220
240:40/30/60
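
The guarantees stated above (every node connected, one big blob, node count a multiple of three) can be checked directly against this listing; a minimal sketch, assuming the nodes adjacency dictionary built by the parsing code further down:

# sanity-check sketch: confirm the stated guarantees, i.e. the node count is a
# multiple of three and the graph forms a single connected blob.
# Assumes `nodes` maps each node ID to a list of connected IDs.
from collections import deque

def is_one_blob(nodes):
    # treat every listed connection as undirected
    adjacency = {n: set() for n in nodes}
    for n, links in nodes.items():
        for m in links:
            adjacency.setdefault(n, set()).add(m)
            adjacency.setdefault(m, set()).add(n)
    start = next(iter(adjacency))
    seen, queue = {start}, deque([start])
    while queue:
        for m in adjacency[queue.popleft()]:
            if m not in seen:
                seen.add(m)
                queue.append(m)
    return len(seen) == len(adjacency)

# expected for the listing above: 18 nodes, 18 % 3 == 0, one blob
# print(len(nodes) % 3 == 0, is_one_blob(nodes))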


I've made some images to help visualize it:


I want to divide my nodes into groups of three, but they must be connected. There are many possible combinations of grouping the nodes. One possibility is the following image:


However, I am struggling to come up with an algorithm that can ensure that every node is in a group of three connected nodes (and no overlapping). In order to avoid overlapping, I "claim" nodes while making groups, which is my way of checking if a node already belongs to another group.

I came up with the following code:

nodes = dict()
with open("input.txt") as f:
    for line in f.readlines():
        split = line.split(":")
        node = int(split[0])
        nodes[node] = []
        for connection in split[1].split("/"):
            nodes[node].append(int(connection.strip()))
groups = dict()
claimed = []
for node in nodes:
    if node in claimed:
        continue
    connections = nodes[node]
    if len(connections) > 1:
        for i in range(len(connections)):
            if node in groups and len(groups[node]) == 2:
                break
            if not connections[i] in claimed:
                if not node in groups:
                    groups[node] = []
                groups[node].append(connections[i])
                claimed.append(connections[i])
                if not node in claimed:
                    claimed.append(node)
f = open("output.txt",'w')
for group in groups:
    line = str(group) + ":"
    for node in groups[group]:
        line += str(node) + "/"
    line = line[:-1] + "\n"
    f.write(line)
f.close()


However, this produces the following:

160:170/100
80:190
230:50/210
70:90/200
110:180
240:40/30
220:60

Which can be visualized like this:


As you can see, there are four groups of three (in blue), and three groups of two (in orange). This is because I claim nodes as I am making the groups, causing some nodes to no longer have two unclaimed connected nodes for a group.
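
A quick way to spot these incomplete groups programmatically is to count each group's members; a minimal diagnostic sketch, assuming the groups dictionary built by the code above (the key node itself counts as a member):

# diagnostic sketch: flag groups that ended up with fewer than three members
for key, members in groups.items():
    size = len(members) + 1   # the key node plus its claimed connections
    if size < 3:
        print("incomplete group:", key, members)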


So my question is: How can I have my nodes claim connected nodes in such a way that other nodes can still form their own groups of three?


Or more basically: How can I divide connected nodes into groups of three connected nodes with no overlapping? Every node needs to be in a group of three.


My question is more about the algorithm than code. I am also open to an entirely different algorithm.
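
For a graph this small, one entirely different approach is an exhaustive backtracking search over connected triples; a minimal sketch (not from the original post), assuming the nodes adjacency dictionary built by the parsing code above:

# backtracking sketch: exhaustively partition the graph into connected triples,
# or return None if no such partition exists.

def connected_triples_with(node, remaining, nodes):
    # every connected triple containing `node` that uses only still-free nodes:
    # `node` is adjacent to `a`, and the third member is adjacent to `node` or `a`
    for a in nodes[node]:
        if a not in remaining:
            continue
        for b in set(nodes[node]) | set(nodes[a]):
            if b in remaining and b != node and b != a:
                yield frozenset((node, a, b))

def partition(remaining, nodes):
    if not remaining:
        return []
    node = next(iter(remaining))          # this node must be in some triple
    for triple in set(connected_triples_with(node, remaining, nodes)):
        rest = partition(remaining - triple, nodes)
        if rest is not None:
            return [tuple(sorted(triple))] + rest
    return None                           # dead end: backtrack and try another triple

# usage: partition(frozenset(nodes), nodes)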

Recommended answer


data.py

node_string = """
30:240
40:210/240/230/60
50:80/220/230/70/210/190/200
60:240/40/230/220
70:90/200/50/220
80:50/170/190/210
90:200/70/180
100:110/160/190/170/200/180
110:100/180
160:170/100
170:80/160/190/100
180:90/200/110/100
190:50/80/200/170/100
200:90/70/100/50/190/180
210:50/80/40/230
220:50/230/70/60
230:50/210/60/40/220
240:40/30/60
        """

node_string_1 = """
(238,140,141):(208,134,191)/(117,119,222)/(229,134,102)/(167,234,229)
(176,120,113):(150,201,147)/(111,237,177)
(113,171,164):(237,128,198)/(159,172,151)/(202,171,225)/(240,168,211)/(111,218,139)
(208,134,191):(238,140,141)/(229,134,102)
(131,150,110):(212,209,198)/(155,133,217)/(168,140,187)/(173,204,120)/(147,210,106)/(207,149,101)/(228,172,108)
(229,134,102):(208,134,191)/(238,140,141)/(167,234,229)
(212,209,198):(109,129,191)/(155,133,217)/(147,210,106)/(131,150,110)
(105,199,125):(150,201,147)/(175,118,238)/(240,168,211)
(175,118,238):(105,199,125)/(240,168,211)
(114,178,129):(164,233,166)/(227,123,219)/(135,156,161)/(224,183,104)/(228,172,108)
(147,210,106):(212,209,198)/(173,204,120)/(131,150,110)/(143,147,114)
(222,128,188):(135,156,161)/(237,128,198)/(159,172,151)/(111,237,177)/(150,201,147)
(166,218,221):(207,149,101)/(143,114,198)/(159,172,151)/(202,171,225)
(143,147,114):(173,204,120)/(107,184,168)/(147,210,106)
(111,218,139):(132,138,234)/(202,171,225)/(113,171,164)
(117,119,222):(109,129,191)/(155,133,217)/(238,140,141)/(167,234,229)
(107,184,168):(164,233,166)/(173,204,120)/(143,147,114)/(227,123,219)
(224,183,104):(164,233,166)/(114,178,129)/(111,237,177)/(135,156,161)
(228,172,108):(227,123,219)/(173,204,120)/(135,156,161)/(159,172,151)/(207,149,101)/(131,150,110)/(114,178,129)
(227,123,219):(173,204,120)/(164,233,166)/(107,184,168)/(228,172,108)/(114,178,129)
(168,140,187):(155,133,217)/(207,149,101)/(176,204,213)/(143,114,198)/(131,150,110)/(159,166,127)
(207,149,101):(168,140,187)/(159,172,151)/(166,218,221)/(143,114,198)/(131,150,110)/(228,172,108)
(159,172,151):(207,149,101)/(135,156,161)/(237,128,198)/(202,171,225)/(113,171,164)/(166,218,221)/(222,128,188)/(228,172,108)
(143,114,198):(168,140,187)/(207,149,101)/(202,171,225)/(166,218,221)/(184,163,168)/(159,166,127)
(164,233,166):(227,123,219)/(114,178,129)/(107,184,168)/(224,183,104)
(150,201,147):(176,120,113)/(240,168,211)/(237,128,198)/(105,199,125)/(111,237,177)/(222,128,188)
(111,237,177):(135,156,161)/(224,183,104)/(222,128,188)/(176,120,113)/(150,201,147)
(159,166,127):(184,163,168)/(168,140,187)/(176,204,213)/(143,114,198)
(155,133,217):(212,209,198)/(168,140,187)/(167,234,229)/(176,204,213)/(109,129,191)/(117,119,222)/(131,150,110)
(109,129,191):(212,209,198)/(117,119,222)/(155,133,217)
(132,138,234):(184,163,168)/(202,171,225)/(111,218,139)
(240,168,211):(150,201,147)/(237,128,198)/(105,199,125)/(175,118,238)/(113,171,164)
(167,234,229):(155,133,217)/(117,119,222)/(238,140,141)/(176,204,213)/(229,134,102)
(173,204,120):(227,123,219)/(147,210,106)/(143,147,114)/(107,184,168)/(131,150,110)/(228,172,108)
(135,156,161):(114,178,129)/(224,183,104)/(159,172,151)/(111,237,177)/(222,128,188)/(228,172,108)
(237,128,198):(150,201,147)/(159,172,151)/(222,128,188)/(240,168,211)/(113,171,164)
(176,204,213):(155,133,217)/(168,140,187)/(159,166,127)/(167,234,229)
(202,171,225):(132,138,234)/(184,163,168)/(159,172,151)/(113,171,164)/(166,218,221)/(143,114,198)/(111,218,139)
(184,163,168):(143,114,198)/(132,138,234)/(159,166,127)/(202,171,225)
"""




node_grouper.py

import itertools
import math
from collections import deque
from ast import literal_eval as totuple 

import data

class NodeGrouper():

    finished = False

    steps = 0

    nodes = {}
    groups = {}
    weights  = {}

    best_group = []
    returned_results_cache = []

    # parse node string 
    # works with two different conventions: 
    # - node names that are numbers
    # - node names that are tuples
    def parse_node(self, string):
        try:
            return totuple(string)
        except:
            return int(string)

    # Get a dictionary from the source string.
    # receives node_string (str): the node string passed in settings
    # returns {160: [170, 100], 240: [40, 30, 60] ... }
    def parse_string(self, node_string):
        nodes = {}
        for line in node_string.split():
            split = line.split(":")
            node = self.parse_node(split[0])
            if node not in nodes:
                nodes[node] = set()
            for connection in split[1].split("/"):
                connection = self.parse_node(connection.strip())
                if connection not in nodes:
                    nodes[connection] = set()
                nodes[node].add(connection)
                nodes[connection].add(node)
        self.nodes = nodes

        for node in self.nodes:
            self.nodes[node] = sorted(self.nodes[node])
        return nodes


    # Get all possible combinations.
    # Build a tuple with every possible 3 node connection.
    # receives nodes (dict) as created by self.parse_string()
    # returns ((90, 180, 200), (80, 100, 190), (60, 70, 220) ... )
    def possible_combinations(self, nodes):
        possible_links = {}
        for node, links in nodes.items():
            link_possibilities = itertools.combinations(links, self.nodes_per_group - 1)
            for link_possibility in link_possibilities:
                if node not in possible_links:
                    possible_links[node] = []
                possible_links[node] += [link_possibility]
        groups = set()
        for node, possible_groups in possible_links.items():
            for possible_group in possible_groups:
                groups.add(tuple(sorted(possible_group + (node,))))
        return tuple(groups)



    # Handle status messages (if settings['log_status'] == True).
    def log_status(self):
        # log how many steps were needed
        if self.settings['log_status']:
            print('Error margin (unused nodes): %s' % (len(self.nodes) - (len(self.best_group) * 3)))
            print('%s steps' % self.steps)
            print(self.format_output(self.best_group))


    # Create an output string in the format of the Stack Overflow ticket.
    def format_output(self, group_combination):

        out_dict = {}

        for group in group_combination:
            for node in group:
                links = set([n for n in group if n != node])

                if set(links).issubset(set(self.nodes[node])):
                    if node not in out_dict:
                        out_dict[node] = set()
                    out_dict[node] = links
                    break

        output = ''
        for node, links in out_dict.items():
            output += '%s:' % str(node)
            output += '/'.join([str(link) for link in links])
            output += '\n'

        return output



    # Start with groups with nodes harder to match.
    def sort_possible_groups(self, queried_group):
        gid = self.groups.index(queried_group) + 1

        for node in queried_group:
            if node in self.preferred_nodes:
                self.weights[gid - 1] -= 3

        return self.weights[gid - 1]


    def initial_weights(self):
        weights = []
        groups_with_this_node = {}
        for group_id, group in enumerate(self.groups):
            ordinarity = 0
            for node in group:
                if node not in groups_with_this_node:
                    groups_with_this_node[node] = sum(x.count(node) for x in self.groups)
                ordinarity += groups_with_this_node[node] * 100
            weights += [ordinarity]
        return weights


    def selected_group_available(self, group_1, group_pool):
        for group_2 in group_pool:
            for node in group_1:

                self.steps += 1

                if self.settings['log_status']:

                    # log status every log_period steps, if specified in the settings
                    if self.steps % self.settings['log_period'] == 0:
                        self.log_status()

                if node in group_2:
                    return False

        return True                                  

    def get_results(self, i = None, weights = None):

        group_length = len(self.groups)
        group_length_iterator = range(group_length)

        groups = deque(sorted(self.groups, key = self.sort_possible_groups))

        output = [groups[i]]
        for j in group_length_iterator:

            if i == j:
                continue

            selected_group = tuple(groups[j])

            if self.selected_group_available(selected_group, output):
                output += [selected_group]


        match = sorted(tuple(output))

        if len(output) > len(self.best_group):
            self.best_group = output


        if len(output) >= self.maximum_groups or \
            (
                self.settings['maximum_steps'] > 0 and \
                self.steps > self.settings['maximum_steps']
            ):
            self.finished = True


        # print len(output)
        if match not in self.returned_results_cache:
            self.returned_results_cache += [match]
            self.preferred_nodes = [n for n in self.nodes]
            for g in match:
                for n in g:
                    if n in self.preferred_nodes:
                        self.preferred_nodes.remove(n)

        return len(output)




    def __init__(self, **settings):

        # save settings in a property
        self.settings = {}
        for key, value in settings.items():
            self.settings[key] = value

        self.nodes_per_group = 3
        self.nodes = self.parse_string(self.settings['node_string'])
        self.preferred_nodes = [node for node in self.nodes]
        self.groups = self.possible_combinations(self.nodes)        


        self.groupable_nodes = list(set(sum(self.groups, ())))
        self.groupable_nodes_total = len(self.groupable_nodes)

        self.maximum_groups = int(math.floor(self.groupable_nodes_total / self.nodes_per_group))

        self.weights = self.initial_weights()


    def calculate(self):

        for i in range(len(self.groups)):
            self.get_results(i)
            if self.finished:
                break


if __name__ == "__main__":

    ng = NodeGrouper(

        # 0 is for unlimited
        # (try until all nodes are used)
        maximum_steps = 10000000,

        # Log number of iterations
        log_status = False,
        # Log each x iterations
        log_period = 5000000,

        # the larger example from data.py above
        node_string = data.node_string_1,

    )

    ng.calculate()

    print(ng.format_output(ng.best_group))


It handled this example in 1,285,000,000 steps:
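
If needed, the resulting grouping can be sanity-checked against the adjacency data; a minimal verification sketch (the verify helper below is illustrative, not part of the answer's code):

# verification sketch: after ng.calculate(), every group in ng.best_group should
# be a triple of distinct nodes in which at least one node links to the other
# two, and no node should appear in more than one group.
def verify(best_group, nodes):
    seen = set()
    for group in best_group:
        assert len(group) == 3, group
        assert not seen & set(group), group      # no overlap between groups
        seen.update(group)
        # connected: some node in the triple is adjacent to both of the others
        assert any(all(m in nodes[n] for m in group if m != n) for n in group), group
    return len(seen), len(nodes)                 # grouped nodes vs. total nodes

# print(verify(ng.best_group, ng.nodes))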

