How do I turn a list of interconnected pairs of ids into a cluster of ids?


Problem description

I have a table with pairs (and sometimes triples) of ids, which act as links in a chain:

+------+-----+
| from | to  |
+------+-----+
| id1  | id2 |
| id2  | id3 |
| id4  | id5 |
+------+-----+

I want to create a new table where all the links are clustered into chains/families:

+-----+----------+
| id  | familyid |
+-----+----------+
| id1 |        1 |
| id2 |        1 |
| id3 |        1 |
| id4 |        2 |
| id5 |        2 |
+-----+----------+

That is, join all the links in a chain into a single family and give it an id. In the example above, the first two rows of the first table form one family, and the last row forms another.

Solution

I will use Node.js to query big batches of rows (a few thousand per batch), process them, and insert them into my own table with a family id.

Problem

The problem is that I have a few tens of thousands of id pairs, and after the initial creation of the families table I will also need to add new ids over time and attach them to existing families.

Are there good algorithms for clustering pairs of data into families/clusters, keeping this requirement in mind?

Recommended answer

This looks a lot like clustering over a graph dataset, where 'familyid' is the cluster center number.
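One way to read the graph view, offered as a sketch and not as the method the answer links to: treat every row as an undirected edge and give each connected group of ids one family number. The version below uses a union-find (disjoint-set) structure in plain Node.js. The function name `clusterPairs` and the row shape `{ from, to }` are assumptions made for illustration; a triple could be fed in as two pairs.

```javascript
// Sketch: group ids into families by treating each row as an undirected edge.
// `clusterPairs` and the row shape { from, to } are illustrative assumptions.
function clusterPairs(rows) {
  const parent = new Map(); // id -> parent id in the union-find forest

  // Find the representative (root) of an id, registering unseen ids.
  function find(id) {
    if (!parent.has(id)) parent.set(id, id);
    let root = id;
    while (parent.get(root) !== root) root = parent.get(root);
    // Path compression: point every visited node straight at the root.
    let cur = id;
    while (parent.get(cur) !== root) {
      const next = parent.get(cur);
      parent.set(cur, root);
      cur = next;
    }
    return root;
  }

  // Merge the families of two ids.
  function union(a, b) {
    const ra = find(a);
    const rb = find(b);
    if (ra !== rb) parent.set(rb, ra);
  }

  for (const { from, to } of rows) union(from, to);

  // Number the families 1, 2, 3, ... in order of first appearance.
  const familyOfRoot = new Map();
  const result = [];
  for (const id of parent.keys()) {
    const root = find(id);
    if (!familyOfRoot.has(root)) familyOfRoot.set(root, familyOfRoot.size + 1);
    result.push({ id, familyid: familyOfRoot.get(root) });
  }
  return result;
}

// The three rows from the question's first table.
const families = clusterPairs([
  { from: "id1", to: "id2" },
  { from: "id2", to: "id3" },
  { from: "id4", to: "id5" },
]);
console.log(families); // id1, id2, id3 share familyid 1; id4, id5 share familyid 2
```

Because each batch only calls `union` on its rows, the same `parent` map could in principle be kept alive across batches, although this sketch rebuilds it on every call.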

Here is a question that I think is related.

Here is the algorithm description. You will need to implement it under the conditions you described.
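For the condition about adding ids after the families table already exists, a minimal in-memory sketch follows. It is not the linked algorithm; it only illustrates the three cases a new link can hit: both ids unseen (start a family), one id known (join its family), or both known in different families (merge them). The class `FamilyIndex`, its methods, and the assumption that `familyid` is a number are all hypothetical, and writing the changes back to the database is left out.

```javascript
// Sketch: incremental family assignment, seeded from an existing table.
// `FamilyIndex` and its method names are hypothetical, not from any library.
class FamilyIndex {
  constructor(existingRows = []) {
    this.familyOf = new Map(); // id -> familyid
    this.members = new Map();  // familyid -> Set of ids
    this.nextFamilyId = 1;
    for (const { id, familyid } of existingRows) this.assign(id, familyid);
  }

  // Record that `id` belongs to `familyid` (assumed to be a number).
  assign(id, familyid) {
    this.familyOf.set(id, familyid);
    if (!this.members.has(familyid)) this.members.set(familyid, new Set());
    this.members.get(familyid).add(id);
    this.nextFamilyId = Math.max(this.nextFamilyId, familyid + 1);
  }

  // Add one link and return the family id both ends now share.
  addPair(from, to) {
    const fa = this.familyOf.get(from);
    const fb = this.familyOf.get(to);
    if (fa === undefined && fb === undefined) {
      const fresh = this.nextFamilyId;
      this.assign(from, fresh);
      this.assign(to, fresh);
      return fresh;
    }
    if (fa === undefined) { this.assign(from, fb); return fb; }
    if (fb === undefined) { this.assign(to, fa); return fa; }
    if (fa === fb) return fa;
    // Both ends already have families: fold the smaller one into the larger.
    const [keep, drop] =
      this.members.get(fa).size >= this.members.get(fb).size ? [fa, fb] : [fb, fa];
    for (const id of this.members.get(drop)) this.assign(id, keep);
    this.members.delete(drop);
    return keep;
  }
}

// Seed with the families table from the question, then add new links.
const index = new FamilyIndex([
  { id: "id1", familyid: 1 }, { id: "id2", familyid: 1 }, { id: "id3", familyid: 1 },
  { id: "id4", familyid: 2 }, { id: "id5", familyid: 2 },
]);
index.addPair("id6", "id7"); // two unseen ids start a new family
index.addPair("id5", "id8"); // id8 joins the family of id5
index.addPair("id3", "id4"); // links two existing families, so they merge
console.log(index.familyOf);
```

The merge case is the one to watch when persisting: every row of the dropped family would need its `familyid` updated in the table, so folding the smaller family into the larger keeps that update small.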
