How to find a repeating sequence of characters in a given array?
Problem description
My problem is to find the repeating sequence of characters in a given array. Put simply, I want to identify the pattern in which the characters appear.
.---.---.---.---.---.---.---.---.---.---.---.---.---.---.
1: | J | A | M | E | S | O | N | J | A | M | E | S | O | N |
'---'---'---'---'---'---'---'---'---'---'---'---'---'---'
.---.---.---.---.---.---.---.---.---.---.---.---.---.---.---.
2: | R | O | N | R | O | N | R | O | N | R | O | N | R | O | N |
'---'---'---'---'---'---'---'---'---'---'---'---'---'---'---'
.---.---.---.---.---.---.---.---.---.---.---.---.
3: | S | H | A | M | I | L | S | H | A | M | I | L |
'---'---'---'---'---'---'---'---'---'---'---'---'
.---.---.---.---.---.---.---.---.---.---.---.---.---.---.---.---.---.---.
4: | C | A | R | P | E | N | T | E | R | C | A | R | P | E | N | T | E | R |
'---'---'---'---'---'---'---'---'---'---'---'---'---'---'---'---'---'---'
Given the previous data, the result should be:
"JAMESON"
"RON"
"SHAMIL"
"CARPENTER"
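Question
- How to deal with this problem efficiently?
Recommended answer
For your examples, my first approach would be to
- get the first character of the array (for your last example, that would be C)
- get the index of the next appearance of that character in the array (e.g. 9, counting from 0)
- if it is found, search for the next appearance of the substring between the two appearances of the character (in this case CARPENTER)
- if it is found, you are done (and the result is this substring).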
Of course, this works only for a very limited subset of possible arrays: the same word is repeated over and over again, starting from the beginning, without stray characters in between, and its first character is not repeated within the word. But all your examples fall into this category, and I prefer the simplest solution which could possibly work :-)
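A minimal Python sketch of the simple approach above might look like this. The function name find_repeating_word is made up for illustration, and the sketch assumes the array holds single-character strings that can be joined into one string; it is not meant as a general solution.

```python
def find_repeating_word(chars):
    """Return the repeated word at the start of chars, or None if none is found.

    chars is assumed to be a string or a list of single-character strings.
    """
    text = "".join(chars)
    if not text:
        return None

    # Step 1: take the first character of the array.
    first = text[0]

    # Step 2: find the index of its next appearance.
    next_index = text.find(first, 1)
    if next_index == -1:
        return None

    # Step 3: the candidate is the substring between the two appearances.
    candidate = text[:next_index]

    # Step 4: search for the candidate again, starting where it ended.
    if text.find(candidate, next_index) != -1:
        return candidate
    return None


if __name__ == "__main__":
    for example in ("JAMESONJAMESON", "RONRONRONRONRON",
                    "SHAMILSHAMIL", "CARPENTERCARPENTER"):
        print(find_repeating_word(list(example)))
```

For the four example arrays this returns JAMESON, RON, SHAMIL and CARPENTER. It returns None when the first character never appears again or the candidate does not show up a second time.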
If the repeated word contains the first character multiple times (e.g. CACTUS), the algorithm can be extended to look for subsequent occurrences of that character too, not only the first one (so that it finds the whole repeated word, not only a substring of it).
Note that this extended algorithm would give a different result for your second example, namely RONRON instead of RON.
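One possible reading of this extension, sketched in Python: treat every later occurrence of the first character as the end of a candidate word, and keep the longest candidate that appears again after it. Preferring the longest candidate is an assumption (the description above does not pin down which candidate wins), and find_longest_repeating_word is a hypothetical name.

```python
def find_longest_repeating_word(chars):
    """Return the longest prefix that ends just before an occurrence of the
    first character and appears again later, or None if there is none.

    Assumption: when several candidates repeat, the longest one wins.
    """
    text = "".join(chars)
    if not text:
        return None

    first = text[0]
    best = None

    # Visit every later occurrence of the first character, left to right.
    end = text.find(first, 1)
    while end != -1:
        candidate = text[:end]
        # Accept the candidate only if it is found again at or after its end.
        if text.find(candidate, end) != -1:
            best = candidate  # later candidates are longer, so overwrite
        end = text.find(first, end + 1)

    return best


if __name__ == "__main__":
    print(find_longest_repeating_word(list("CACTUSCACTUS")))
    print(find_longest_repeating_word(list("RONRONRONRONRON")))
```

Under that assumption CACTUSCACTUS gives CACTUS, while the second example gives RONRON rather than RON, which matches the note above.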