ElasticSearch edgeNGram for autocomplete\typeahead: is my search_analyzer being ignored?

Question

I've got three documents with a "userName" field:


  • 'briandilley'

  • 'briangumble'

  • 'briangriffen'

When I search for 'brian' I get all three back, as expected, but when I search for 'briandilley' I still get all three back. The analyze API tells me that the ngram filter is being applied to my search string, but I'm not sure why. Here's my setup:

Index settings:

{
    "analysis": {
        "analyzer": {
            "username_index": {
                "tokenizer": "keyword",
                "filter": ["lowercase", "username_ngram"]
            },
            "username_search": {
                "tokenizer": "keyword",
                "filter": ["lowercase"]
            }
        },
        "filter": {
            "username_ngram": {
                "type": "edgeNGram",
                "side" : "front",
                "min_gram": 1,
                "max_gram": 15
            }
        }
    }
}
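
To make the intent of these two analyzers concrete, here is a minimal Python sketch of the token streams they are expected to produce. It is a simulation, not Elasticsearch code: the function names are hypothetical, and it assumes the keyword tokenizer passes the whole input through as a single token and that the edge n-gram filter emits every front prefix between min_gram and max_gram characters.

```python
def edge_ngrams(token, min_gram=1, max_gram=15):
    """Front edge n-grams: every prefix from min_gram up to max_gram characters."""
    longest = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, longest + 1)]


def username_index(text):
    """Simulates username_index: keyword tokenizer, lowercase, edge n-grams."""
    return edge_ngrams(text.lower())


def username_search(text):
    """Simulates username_search: keyword tokenizer, lowercase, nothing else."""
    return [text.lower()]


print(username_index("BrianDilley"))
# ['b', 'br', 'bri', 'bria', 'brian', 'briand', 'briandi', 'briandil',
#  'briandill', 'briandille', 'briandilley']

print(username_search("BrianDilley"))
# ['briandilley']
```

Under these assumptions a query only matches when its single lowercase token equals one of the indexed prefixes. Because max_gram is 15, a search string longer than 15 characters would have no indexed prefix to match in this sketch.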

Mapping:

{
    "user_follow": {

        "properties": {
            "targetId": { "type": "string", "store": true },
            "followerId": { "type": "string", "store": true },
            "dateUpdated": { "type": "date", "store": true },

            "userName": {
                "type": "multi_field",
                "fields": {
                    "userName": {
                        "type": "string",
                        "index": "not_analyzed"
                    },
                    "autocomplete": {
                        "type": "string",
                        "index_analyzer": "username_index",
                        "search_analyzer": "username_search"
                    }
                }
            }
        }
    }
}

Search:

{
    "from" : 0,
    "size" : 50,
    "query" : {
        "bool" : {
            "must" : [ {
                "field" : {
                    "targetId" : "51888c1b04a6a214e26a4009"
                }
            }, {
                "match" : {
                    "userName.autocomplete" : {
                        "query" : "brian",
                        "type" : "boolean"
                    }
                }
            } ]
        }
    },
    "fields" : "followerId"
}

I've tried matchQuery, matchPhraseQuery, textQuery and termQuery (Java DSL API), and I get the same results every time.

Recommended answer

I think that you're not doing exactly what you think you're doing. This is why it is useful to present an actual test case with full curl statements, rather than an abbreviated one.

Your example above works for me (slightly modified):

Create the index with settings and mapping:

curl -XPUT 'http://127.0.0.1:9200/test/?pretty=1'  -d '
{
  "mappings" : {
     "test" : {
        "properties" : {
           "userName" : {
              "fields" : {
                 "autocomplete" : {
                    "search_analyzer" : "username_search",
                    "index_analyzer" : "username_index",
                    "type" : "string"
                 },
                 "userName" : {
                    "index" : "not_analyzed",
                    "type" : "string"
                 }
              },
              "type" : "multi_field"
           }
        }
     }
  },
  "settings" : {
     "analysis" : {
        "filter" : {
           "username_ngram" : {
              "max_gram" : 15,
              "min_gram" : 1,
              "type" : "edge_ngram"
           }
        },
        "analyzer" : {
           "username_index" : {
              "filter" : [
                 "lowercase",
                 "username_ngram"
              ],
              "tokenizer" : "keyword"
           },
           "username_search" : {
              "filter" : [
                 "lowercase"
              ],
              "tokenizer" : "keyword"
           }
        }
     }
  }
}
'

Index some data:

curl -XPOST 'http://127.0.0.1:9200/test/test?pretty=1'  -d '{
  "userName" : "briangriffen"
}
'

curl -XPOST 'http://127.0.0.1:9200/test/test?pretty=1'  -d '
{
  "userName" : "brianlilley"
}
'

curl -XPOST 'http://127.0.0.1:9200/test/test?pretty=1'  -d '
{
  "userName" : "briangumble"
}
'

Searching for brian finds all documents:

curl -XGET 'http://127.0.0.1:9200/test/test/_search?pretty=1'  -d '{
  "query" : {
     "match" : {
        "userName.autocomplete" : "brian"
     }
  }
}
'

# {
#    "hits" : {
#       "hits" : [
#          {
#             "_source" : {
#                "userName" : "briangriffen"
#             },
#             "_score" : 0.1486337,
#             "_index" : "test",
#             "_id" : "AWzezvEFRIykOAr75QbtcQ",
#             "_type" : "test"
#          },
#          {
#             "_source" : {
#                "userName" : "briangumble"
#             },
#             "_score" : 0.1486337,
#             "_index" : "test",
#             "_id" : "qIABuMOiTyuxLOiFOzcURg",
#             "_type" : "test"
#          },
#          {
#             "_source" : {
#                "userName" : "brianlilley"
#             },
#             "_score" : 0.076713204,
#             "_index" : "test",
#             "_id" : "fGgTITKvR6GJXI_cqA4Vzg",
#             "_type" : "test"
#          }
#       ],
#       "max_score" : 0.1486337,
#       "total" : 3
#    },
#    "timed_out" : false,
#    "_shards" : {
#       "failed" : 0,
#       "successful" : 5,
#       "total" : 5
#    },
#    "took" : 8
# }

Searching for brianlilley finds only that document:

curl -XGET 'http://127.0.0.1:9200/test/test/_search?pretty=1'  -d '
{
  "query" : {
     "match" : {
        "userName.autocomplete" : "brianlilley"
     }
  }
}
'

# {
#    "hits" : {
#       "hits" : [
#          {
#             "_source" : {
#                "userName" : "brianlilley"
#             },
#             "_score" : 0.076713204,
#             "_index" : "test",
#             "_id" : "fGgTITKvR6GJXI_cqA4Vzg",
#             "_type" : "test"
#          }
#       ],
#       "max_score" : 0.076713204,
#       "total" : 1
#    },
#    "timed_out" : false,
#    "_shards" : {
#       "failed" : 0,
#       "successful" : 5,
#       "total" : 5
#    },
#    "took" : 4
# }
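
The difference between the two outcomes (one hit here, three hits in the question) can be illustrated with a small Python simulation. It is only a sketch under stated assumptions: the helper names are hypothetical, scoring is ignored, and a match query is modelled as "the document matches if it shares at least one token with the analyzed query", which corresponds to the default OR operator.

```python
def edge_ngrams(token, min_gram=1, max_gram=15):
    """Front edge n-grams: every prefix from min_gram up to max_gram characters."""
    return [token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)]


def index_tokens(text):
    """Tokens stored for userName.autocomplete (lowercase + edge n-grams)."""
    return set(edge_ngrams(text.lower()))


def match(query_tokens, docs):
    """Return the documents that share at least one token with the query."""
    return sorted(d for d in docs if index_tokens(d) & set(query_tokens))


docs = ["briangriffen", "brianlilley", "briangumble"]

# Search analyzer applied: the query stays a single lowercase token.
print(match(["brianlilley"], docs))
# ['brianlilley']

# Search analyzer ignored: the query is n-grammed too, so 'b', 'br', ...
# match every indexed username that starts with the same letters.
print(match(edge_ngrams("brianlilley"), docs))
# ['briangriffen', 'briangumble', 'brianlilley']
```

If a real cluster behaves like the second case, that suggests the index analyzer is being used at search time, for example because the mapping was not applied to the index or type actually being queried.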
