Comparing 2 similar strings?
Question
How do you compare 2 strings, and determine how much they are "close" to
each other? Eg.
aqwerty
qwertyb
are similar to each other, except for first/last char. But, how do I
quantify that?
I guess you can say for the above 2 strings that
- at max, 6 chars out of 7 are same sequence --> 85% max
But, for
qawerty
qwerbty
max correlation is
- 3 chars out of 7 are the same sequence --> 42% max
(Crossposted to 3 of my favourite newsgroup.)
--
William Park <op**********@yahoo.ca>, Toronto, Canada
ThinFlash: Linux thin-client on USB key (flash) drive
http://home.eol.ca/~parkw/thinflash.html
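One way to read the "same sequence" measure in the question is as the longest run of consecutive characters that appears in both strings, divided by the string length. The following is only a sketch under that reading; the function names `longest_common_run` and `run_similarity` are made up for illustration and do not come from the thread.

```python
def longest_common_run(a: str, b: str) -> int:
    """Length of the longest run of consecutive characters found in both strings."""
    best = 0
    # prev[j] = length of the common run ending at the previous char of a and b[j-1]
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr[j] = prev[j - 1] + 1  # extend the run that ended one step back
                best = max(best, curr[j])
        prev = curr
    return best


def run_similarity(a: str, b: str) -> float:
    """Longest common run divided by the longer length (assumed normalisation)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return longest_common_run(a, b) / longest


print(longest_common_run("aqwerty", "qwertyb"))  # 6 ("qwerty")
print(longest_common_run("qawerty", "qwerbty"))  # 3 ("wer")
```

Under this reading the two pairs score 6/7 and 3/7, which matches the counts given in the question.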
Recommended answer
Hello William,
How do you compare 2 strings, and determine how much they are "close" to
each other? Eg.
aqwerty
qwertyb
are similar to each other, except for first/last char. But, how do I
quantify that?
This is a classic problem of computer science.
See:
http://odur.let.rug.nl/~kleiweg/lev/
http://www.levenshtein.net/
This solution has application in speech
recognition and also in DNA research.
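The links above are about Levenshtein (edit) distance: the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other. A minimal sketch of the standard dynamic-programming version follows; the `similarity` helper and its normalisation by the longer length are an assumption for illustration, not something the linked pages prescribe.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    # prev[j] = distance between the processed prefix of a and b[:j]
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(prev[j] + 1,          # delete ch_a
                            curr[j - 1] + 1,      # insert ch_b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """1.0 for identical strings, 0.0 for maximally different (assumed scale)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


print(levenshtein("aqwerty", "qwertyb"))  # 2
print(levenshtein("qawerty", "qwerbty"))  # 2
```

Both pairs from the question come out at distance 2 (delete one character, insert another), so this metric rates them as equally close, unlike the same-sequence count in the question.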
William Park wrote:
How do you compare 2 strings, and determine how much they are "close" to
each other? Eg.
aqwerty
qwertyb
are similar to each other, except for first/last char. But, how do I
quantify that?
I guess you can say for the above 2 strings that
- at max, 6 chars out of 7 are same sequence --> 85% max
But, for
qawerty
qwerbty
max correlation is
- 3 chars out of 7 are the same sequence --> 42% max
(Crossposted to 3 of my favourite newsgroup.)
"However you like" is probably the right answer, but one way might be to
compare their soundex encoding
(http://foldoc.doc.ic.ac.uk/foldoc/foldoc.cgi?soundex) and figure out
percentage difference based on comparing the numeric part.
Ed.
On Wed, 18 May 2005 15:06:53 -0500, Ed Morton <mo****@lsupcaemnt.com>
wrote:
William Park wrote:
How do you compare 2 strings, and determine how much they are "close" to
each other? Eg.
aqwerty
qwertyb
are similar to each other, except for first/last char. But, how do I
quantify that?
I guess you can say for the above 2 strings that
- at max, 6 chars out of 7 are same sequence --> 85% max
But, for
qawerty
qwerbty
max correlation is
- 3 chars out of 7 are the same sequence --> 42% max
(Crossposted to 3 of my favourite newsgroup.)
"However you like" is probably the right answer, but one way might be to
compare their soundex encoding
(http://foldoc.doc.ic.ac.uk/foldoc/foldoc.cgi?soundex) and figure out
percentage difference based on comparing the numeric part.
Fantastic suggestion. Here's a tiny piece of real-life test data:
compare the surnames "Mousaferiadis" and "McPherson".
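To see what the soundex suggestion does with that test pair, here is a rough sketch of the common American Soundex rules (keep the first letter, map consonants to digits, collapse repeats, treat h and w as transparent, pad to four characters). It assumes plain ASCII names, and the `digit_match` helper is a hypothetical way of turning "comparing the numeric part" into a percentage; neither is taken from the thread.

```python
# Consonant groups and their Soundex digits; vowels, h, w, and y have no digit.
CODES = {}
for group, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                     ("l", "4"), ("mn", "5"), ("r", "6")):
    for letter in group:
        CODES[letter] = digit


def soundex(name: str) -> str:
    """Four-character Soundex code, e.g. 'Robert' -> 'R163'."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    last = CODES.get(letters[0], "")  # digit of the previously coded letter
    for ch in letters[1:]:
        code = CODES.get(ch, "")
        if code:
            if code != last:          # adjacent letters with the same digit collapse
                result += code
            last = code
        elif ch not in "hw":          # vowels separate repeats; h and w do not
            last = ""
        if len(result) == 4:
            break
    return result.ljust(4, "0")


def digit_match(a: str, b: str) -> float:
    """Fraction of the three Soundex digits that agree position by position."""
    da, db = soundex(a)[1:], soundex(b)[1:]
    return sum(x == y for x, y in zip(da, db)) / 3


print(soundex("Mousaferiadis"), soundex("McPherson"))  # M216 M216
print(soundex("aqwerty"), soundex("qwertyb"))          # A263 Q631
```

The two surnames receive the same code, so a soundex comparison reports them as a perfect match, while the visibly similar pair aqwerty/qwertyb gets different codes.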