How to detect identical part(s) inside string?
Question
I am trying to break down the "decoding algorithm wanted" question into smaller questions. This is Part I.
Problem:
- Two strings: s1 and s2
- Part of s1 is identical to part of s2
- Space is the separator
- How can the identical parts be extracted?
Example 1:
s1 = "12 November 2010 - 1 visitor"
s2 = "6 July 2010 - 100 visitors"
the identical parts are "2010", "-", "1" and "visitor"
Example 2:
s1 = "Welcome, John!"
s2 = "Welcome, Peter!"
the identical parts are "Welcome," and "!"
Example 3: (to clarify the "!" example)
s1 = "Welcome, Sam!"
s2 = "Welcome, Tom!"
the identical parts are "Welcome," and "m!"
Python and Ruby preferred. Thanks
Recommended answer
Updated this example to work with all the examples, including #1:
def scan(s1, s2):
    # Find the longest common prefix of s1 and s2
    # Returns None if there is no match
    l = len(s1)
    while l:
        if s1[:l] == s2[:l]:
            return s1[:l]
        l -= 1
    return None

def contains(s1, s2):
    # Remove duplicates using a dict
    # (dicts keep insertion order in Python 3.7+)
    D = {}
    L1 = s1.split(' ')
    L2 = s2.split(' ')
    # Don't reuse words of s2 which have already
    # been matched, to satisfy example #1!
    DProcessed = set()
    for x in L1:
        for yy, y in enumerate(L2):
            if yy in DProcessed:
                continue
            # Scan from the start to the end of the words
            a = scan(x, y)
            if a:
                DProcessed.add(yy)
                D[a] = None
                break
            # Scan from the end to the start of the words
            a = scan(x[::-1], y[::-1])
            if a:
                DProcessed.add(yy)
                D[a[::-1]] = None
                break
    return list(D)

print(contains("12 November 2010 - 1 visitor",
               "6 July 2010 - 100 visitors"))
print(contains("Welcome, John!",
               "Welcome, Peter!"))
print(contains("Welcome, Sam!",
               "Welcome, Tom!"))

Output (Python 3.7+, where results appear in the order they were found):
['1', '2010', '-', 'visitor']
['Welcome,', '!']
['Welcome,', 'm!']
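The per-word comparison in the answer amounts to "longest shared prefix, otherwise longest shared suffix". As a minimal sketch of just that step, the standard library's os.path.commonprefix, which compares strings character by character, can stand in for the hand-written loop. The helper names common_prefix, common_suffix and word_overlap are hypothetical and do not appear in the original answer.

```python
import os.path


def common_prefix(a, b):
    # os.path.commonprefix works character by character,
    # so it is usable on arbitrary strings, not only paths.
    return os.path.commonprefix([a, b])


def common_suffix(a, b):
    # Reverse both words, take the shared prefix, reverse the result back.
    return common_prefix(a[::-1], b[::-1])[::-1]


def word_overlap(a, b):
    # Shared prefix if there is one, otherwise shared suffix, otherwise ''.
    return common_prefix(a, b) or common_suffix(a, b)


print(word_overlap("visitor", "visitors"))  # visitor
print(word_overlap("Sam!", "Tom!"))         # m!
print(word_overlap("John!", "Peter!"))      # !
```

This sketch covers only a single pair of words. Pairing the words of s1 with the words of s2, and skipping words of s2 that were already matched, would still need a loop like the one in the answer.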