如何找到所有的兄弟字符串? [英] How to find all brotherhood strings?

查看:129
本文介绍了如何找到所有的兄弟字符串?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个字符串,而另外一个文本文件,它包含一个字符串列表。

I have a string, and another text file which contains a list of strings.

我们称之为2串兄弟字符串时,他们完全按字母顺序排序后相同。

We call 2 strings "brotherhood strings" when they're exactly the same after sorting alphabetically.

例如,ABC和CBA将被分类到ABC和ABC,所以原来的两个是兄弟。但ABC和AAA都没有。

For example, "abc" and "cba" will be sorted into "abc" and "abc", so the original two are brotherhood. But "abc" and "aaa" are not.

那么,有没有挑选出从文本文件中的所有兄弟串的有效途径,按照提供的一根弦?

So, is there an efficient way to pick out all brotherhood strings from the text file, according to the one string provided?

例如,我们有ABC并写这样一个文本文件:

For example, we have "abc" and a text file which writes like this:

abc
cba
acb
lalala

然后ABCCBAACB的答案。

当然,排序和放大器,比较是一个不错的尝试,但高效,我的意思是,如果有一种方法,我们能够确定候选字符串与否后的一个通处理原来的兄弟情谊。

Of course, "sort & compare" is a nice try, but by "efficient", i mean if there is a way, we can determine a candidate string is or not brotherhood of the original one after one pass processing.

这是最有效的方式,我想。毕竟,你不能告诉出去,甚至没有阅读的候选串的答案。对于排序,最多的时候,我们需要做1个多传球给考生字符串。因此,哈希表可能是一个很好的解决方案,但我不知道是什么散列函数来选择。

This is the most efficient way, i think. After all, you can not tell out the answer without even reading candidate strings. For sorting, most of the time, we need to do more than 1 pass to the candidate string. So, hash table might be a good solution, but i've no idea what hash function to choose.

推荐答案

最有效的算法,我能想到的:

Most efficient algorithm I can think of:

  • 设置了一个哈希表的原始字符串。让每个字母是键,字母出现在字符串中是值的次数。调用此哈希表inputStringTable
  • 解析输入字符串,你可以看到一个字符每次,一个
  • 递增哈希项的值
  • 在该文件中的每个字符串
  • 创建一个新的哈希表。称它是brotherStringTable。
  • 的字符串中的每个字符,添加一个新的哈希表。如果brotherStringTable [人物]> inputStringTable【性状】,此字符串不是一个弟弟(一个字符显示了太多次了)
  • 在一次字符串进行分析,比较各inputStringTable值与相应的brotherStringTable值。如果一个人是不同的,那么这个字符串不是兄弟字符串。如果所有的匹配,则该字符串是一个哥哥的字符串。
  • Set up a hash table for the original string. Let each letter be the key, and the number of times the letter appears in the string be the value. Call this hash table inputStringTable
  • Parse the input string, and each time you see a character, increment the value of the hash entry by one
  • for each string in the file
  • create a new hash table. Call this one brotherStringTable.
  • for each character in the string, add one to a new hash table. If brotherStringTable[character] > inputStringTable[character], this string is not a brother (one character shows up too many times)
  • once string is parsed, compare each inputStringTable value with the corresponding brotherStringTable value. If one is different, then this string is not a brother string. If all match, then the string is a brother string.

这将是为O(n * k),其中n是输入字符串的长度(比输入字符串不再有任何字符串可以被立即丢弃)和k是在该文件中的字符串的数目。任何基于排序算法将是为O(n * K LG n)的,因此,在某些情况下,这种算法是比基于排序算法更快。

This will be O(n*k), where n is the length of the input string (any strings longer than the input string can be discarded immediately) and k is the number of strings in the file. Any sort based algorithm will be O(n*k lg n), so in certain cases, this algorithm is faster than a sort based algorithm.

这篇关于如何找到所有的兄弟字符串?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆