agrep: only return best match(es)

Question

I'm using the 'agrep' function in R, which returns a vector of matches. I would like a function similar to agrep that returns only the best match, or all of the best matches if there are ties. Currently, I do this by running the 'sdists()' function from the 'cba' package on each element of the resulting vector, but this seems very redundant.

Edit: here is the function I'm currently using. I'd like to speed it up, as it seems redundant to calculate the distance twice.

library(cba)

word <- 'test'
words <- c('Teest','teeeest','New York City','yeast','text','Test')

ClosestMatch <- function(string, StringVector) {
  # First pass: candidate approximate matches found by agrep
  matches <- agrep(string, StringVector, value = TRUE)
  # Second pass: recompute the edit distance to each candidate
  distance <- as.numeric(sdists(string, matches, method = "ow", weight = c(1, 0, 2)))
  # Keep every candidate at the minimum distance (ties included)
  matches[distance == min(distance)]
}

ClosestMatch(word, words)

Recommended answer

The RecordLinkage package was removed from CRAN; use stringdist instead:

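library(stringdist)

ClosestMatch2 <- function(string, stringVector) {
  # amatch() returns the index of the closest element of stringVector;
  # maxDist = Inf places no upper limit on the allowed distance
  stringVector[amatch(string, stringVector, maxDist = Inf)]
}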
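
Note that amatch() returns a single index, so when several strings are tied for the smallest distance only the first one comes back. If ties matter, as the question asks, one option is to compute the distances once and keep every element at the minimum. The sketch below is not part of the original answer: the helper name ClosestMatches is hypothetical, and it assumes that the plain generalized Levenshtein distance from base R's adist() is an acceptable metric, which may rank candidates differently from agrep or sdists.

```r
# Hypothetical helper (not from the original answer): return every string
# tied for the smallest edit distance, using base R's utils::adist().
ClosestMatches <- function(string, stringVector, ignore.case = FALSE) {
  # adist() returns a 1 x n matrix of generalized Levenshtein distances;
  # flatten it so it can be used to index stringVector directly
  d <- as.vector(adist(string, stringVector, ignore.case = ignore.case))
  stringVector[d == min(d)]
}

words <- c('Teest', 'teeeest', 'New York City', 'yeast', 'text', 'Test')
ClosestMatches('test', words)                      # "text" "Test"
ClosestMatches('test', words, ignore.case = TRUE)  # "Test"
```

Because the distance is computed only once, this also avoids the double calculation mentioned in the question.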
