stringdist_join results in NAs


Problem description

I am experimenting with the stringdist package to make fuzzy joins, and I have run into a problem that I do not understand and cannot find an answer for. I want to join these two data tables with the "dl" method, but the join produces an NA, and I do not see why. Maybe one of you has an explanation. The code:

library(fuzzyjoin)

# Two one-row data frames that share the join column "label"
test1 <- data.frame(label = "techniker", stringsAsFactors = FALSE)
test2 <- data.frame(label = "technician", stringsAsFactors = FALSE)

# Left join on Damerau-Levenshtein distance; label.y comes back as NA
x <- stringdist_join(test1, test2, by = "label", mode = "left", distance_col = "distance", method = "dl")

If I use the jaccard method, however, there is a match:

y <- stringdist_join(test1, test2, by = "label", mode = "left", distance_col = "distance", method = "jaccard", q = 4)

I hope someone can clarify.

Recommended answer

max_dist is set to 2 by default.

The dl distance between "techniker" and "technician" is 4, which is more than 2.

So there is no match, and the left join fills the unmatched row with NA.
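The distance can be checked directly, outside the join. The following is a minimal sketch that assumes the stringdist package (which fuzzyjoin builds on) is installed; the variable names are illustrative only.

```r
library(stringdist)

a <- "techniker"
b <- "technician"

# Damerau-Levenshtein distance between the two labels
d_dl <- stringdist(a, b, method = "dl")
print(d_dl)

# stringdist_join keeps a pair only if its distance is <= max_dist,
# and max_dist defaults to 2
max_dist <- 2
print(d_dl <= max_dist)  # FALSE, so the pair is dropped
```

Any max_dist of at least the printed distance would let the pair through.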

stringdist_join(test1, test2, by = "label", mode = "left", distance_col = "distance", method = "dl", max_dist = 5)
#     label.x    label.y distance
# 1 techniker technician        4
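The question also notes that method = "jaccard" with q = 4 does produce a match. A likely reason is that the Jaccard distance is bounded between 0 and 1, so with the default max_dist of 2 every pair passes the threshold. The sketch below recomputes the Jaccard distance on 4-grams in base R; qgrams and jaccard_dist are hypothetical helpers written for illustration, not functions from stringdist or fuzzyjoin, and they ignore edge cases such as strings shorter than q.

```r
# Hypothetical helper: the set of distinct q-grams of a string
qgrams <- function(s, q) {
  n <- nchar(s)
  unique(substring(s, 1:(n - q + 1), q:n))
}

# Hypothetical helper: Jaccard distance = 1 - |A ∩ B| / |A ∪ B|
jaccard_dist <- function(a, b, q) {
  A <- qgrams(a, q)
  B <- qgrams(b, q)
  1 - length(intersect(A, B)) / length(union(A, B))
}

d_jac <- jaccard_dist("techniker", "technician", q = 4)
print(d_jac)       # 3 shared 4-grams out of 10 distinct ones
print(d_jac <= 2)  # TRUE: a threshold of 2 can never reject a Jaccard distance
```

When joining with the jaccard method, a max_dist below 1 is therefore needed if the threshold is meant to filter anything.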
