How can I compare two strings to find the number of characters that match in R, using substitution distance?
Question
In R, I have two character vectors, a and b.
a <- c("abcdefg", "hijklmnop", "qrstuvwxyz")
b <- c("abXdeXg", "hiXklXnoX", "Xrstuvwxyz")
I want a function that counts the character mismatches between each element of a and the corresponding element of b. Using the example above, such a function should return c(2,3,1). There is no need to align the strings.
I need to compare each pair of strings character-by-character and count matches and/or mismatches in each pair. Does any such function exist in R?
Or, to ask the question in another way, is there a function to give me the edit distance between two strings, where the only allowed operation is substitution (ignore insertions or deletions)?
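Recommended answer
Have some fun with mapply: split each string into single characters with strsplit, then count the positions where the two character vectors differ: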
mapply(function(x,y) sum(x!=y),strsplit(a,""),strsplit(b,""))
#[1] 2 3 1
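The one-liner above assumes both strings in each pair have the same length; if they differ, `x != y` recycles the shorter vector and the count is no longer a substitution-only distance. Below is a minimal sketch of a length-checked variant. The name `count_mismatches` is hypothetical, and returning NA for pairs of unequal length is an assumption, not something the question specifies.

```r
# Hypothetical helper: per-pair count of mismatching positions
# (substitution-only distance, also known as Hamming distance).
# Assumption: pairs of unequal length return NA instead of a recycled comparison.
count_mismatches <- function(a, b) {
  stopifnot(is.character(a), is.character(b), length(a) == length(b))
  chars_a <- strsplit(a, "")  # split every string into single characters
  chars_b <- strsplit(b, "")
  mapply(function(x, y) {
    if (length(x) != length(y)) return(NA_integer_)
    sum(x != y)               # TRUE counts as 1, so this is the mismatch count
  }, chars_a, chars_b, USE.NAMES = FALSE)
}

a <- c("abcdefg", "hijklmnop", "qrstuvwxyz")
b <- c("abXdeXg", "hiXklXnoX", "Xrstuvwxyz")

mismatches <- count_mismatches(a, b)
matches <- nchar(a) - mismatches  # matching positions in each pair
```

Since the question asks for matches and/or mismatches, the match count is derived here by subtracting the mismatches from the string length, which is only meaningful when the lengths agree.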