Fuzzy string matching using Levenshtein algorithm in Elasticsearch


Problem description

I have just started exploring Elasticsearch. I created a document as follows:

curl -XPUT "http://localhost:9200/cities/city/1" -d '
{
    "name": "Saint Louis"
}'

I then tried to run a fuzzy search on the name field with a Levenshtein distance of 5, as follows:

curl -XGET "http://localhost:9200/_search" -d '
{
    "query": {
       "fuzzy": {
           "name" : {
               "value" : "St. Louis",
               "fuzziness" : 5
           }
       }
    }
}'

But it's not returning any match. I expect the Saint Louis record to be returned. How can I fix my query?

Thanks.

Solution

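The problem with your query is that Elasticsearch allows a maximum edit distance of only 2 in fuzzy queries, so a fuzziness of 5 is out of range. Turning "St." into "Saint" takes four single-character edits, which is beyond that limit.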
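
To check how far apart the terms in this question are, here is a minimal sketch of the classic dynamic-programming Levenshtein distance in Python. It is not Elasticsearch's own implementation, and the function name `levenshtein` is only illustrative. The inputs are lowercased on the assumption that the `name` field is indexed with an analyzer that lowercases terms.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the number of single-character insertions, deletions
    and substitutions needed to turn a into b."""
    # prev[j] holds the distance between the first i-1 chars of a
    # and the first j chars of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning i chars of a into "" takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb (or keep it)
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    # Assumption: terms are compared in lowercase.
    print(levenshtein("st.", "saint"))     # prints 4
    print(levenshtein("louiee", "louis"))  # prints 2
```

A distance of 4 is above the limit of 2, while a distance of 2 is within it.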

In the case above, what you probably want is a synonym that maps "St." to "Saint", so that the query matches. Of course, whether this is appropriate depends on your data, as "St" could also stand for "street".
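
As a rough sketch of the synonym approach, the Python snippet below builds an index settings body with a `synonym` token filter and a `match` query body. The names `city_synonyms` and `city_name` and the rule `st => saint` are hypothetical choices for illustration, not something given in the answer. Attaching the analyzer to the `name` field is done in the index mapping, whose syntax depends on the Elasticsearch version, so it is not shown. A `match` query is used here on the assumption that the query text should pass through the same analyzer as the indexed field.

```python
import json


def build_synonym_settings() -> dict:
    """Index settings with a custom analyzer that rewrites "st" to "saint".

    "city_synonyms" and "city_name" are hypothetical names.
    """
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "city_synonyms": {
                        "type": "synonym",
                        # Assumed rule: the standard tokenizer drops the
                        # trailing period, so "St." arrives here as "st".
                        "synonyms": ["st => saint"],
                    }
                },
                "analyzer": {
                    "city_name": {
                        "type": "custom",
                        "tokenizer": "standard",
                        # Lowercase first so the rule also covers "St" and "ST".
                        "filter": ["lowercase", "city_synonyms"],
                    }
                },
            }
        }
    }


def build_match_query(text: str) -> dict:
    """A match query on the name field; the query text is analyzed."""
    return {"query": {"match": {"name": text}}}


if __name__ == "__main__":
    # Print the JSON bodies that could be sent with curl -d.
    print(json.dumps(build_synonym_settings(), indent=2))
    print(json.dumps(build_match_query("St. Louis"), indent=2))
```

The settings body would be sent when the index is created, and the query body would be sent to `_search` in place of the fuzzy query.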

If you just want to test fuzzy searching, you could try this example:

curl -XGET "http://localhost:9200/_search" -d '
{
    "query": {
       "fuzzy": {
           "name" : {
               "value" : "Louiee",
               "fuzziness" : 2
           }
       }
    }
}'
