Clustering words based on their char set

Views: 100 Published: 2015/11/30 21:29:57 Tags: algorithm, anagram

Problem description

Say there is a set of words and I would like to cluster them based on their char bag (multiset). For example:

{tea, eat, abba, aabb, hello}

will be clustered into

{{tea, eat}, {abba, aabb}, {hello}}.

abba and aabb are clustered together because they have the same char bag, i.e. two a's and two b's.

To make it efficient, a naive way I can think of is to convert each word into a char-count series; for example, abba and aabb will both be converted to a2b2, and tea/eat will be converted to a1e1t1. That way I can build a dictionary and group words with the same key.
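The naive approach described above can be sketched roughly as follows. This is only an illustration in Python; the function names `char_count_key` and `cluster_by_key` are made up for this example and do not come from the question.

```python
from collections import Counter, defaultdict


def char_count_key(word):
    """Build a key such as 'a2b2' from the word's character counts.

    The distinct characters are sorted so that every word with the
    same char bag produces the same string.
    """
    counts = Counter(word)
    return "".join(f"{ch}{counts[ch]}" for ch in sorted(counts))


def cluster_by_key(words):
    """Group words whose char-count keys are equal (hypothetical helper)."""
    groups = defaultdict(list)
    for word in words:
        groups[char_count_key(word)].append(word)
    # Dictionaries keep insertion order, so clusters appear in the
    # order their first member was seen.
    return list(groups.values())


clusters = cluster_by_key(["tea", "eat", "abba", "aabb", "hello"])
```

With the example words, `char_count_key("abba")` and `char_count_key("aabb")` both give `a2b2`, so the two words land in the same cluster.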

Two issues here: first, I have to sort the chars to build the key; second, the string key looks awkward and its performance is not as good as that of char/int keys.

Is there a more efficient way to solve the problem?
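One possible direction, offered only as a sketch, is to skip sorting by counting into a fixed-size array and using the resulting tuple of integers as the dictionary key. The code below assumes the words contain only lowercase letters 'a' to 'z'; that restriction, and the names `count_vector_key` and `cluster_by_count_vector`, are assumptions for illustration and are not stated in the question.

```python
from collections import defaultdict

# Assumption: words consist only of lowercase ASCII letters 'a'..'z'.
ALPHABET_SIZE = 26


def count_vector_key(word):
    """Return a 26-element tuple of per-letter counts, built without sorting."""
    counts = [0] * ALPHABET_SIZE
    for ch in word:
        counts[ord(ch) - ord("a")] += 1
    # Tuples are hashable, so the count vector can serve as a dict key.
    return tuple(counts)


def cluster_by_count_vector(words):
    """Group words that share the same count vector (hypothetical helper)."""
    groups = defaultdict(list)
    for word in words:
        groups[count_vector_key(word)].append(word)
    return list(groups.values())


vector_clusters = cluster_by_count_vector(["tea", "eat", "abba", "aabb", "hello"])
```

Building the key takes one pass over the word plus a fixed amount of work for the 26 slots, so no sort is needed. Whether this is actually faster than the string key depends on word lengths and on the alphabet size, so it would need to be measured on real data.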
