ElasticSearch n-gram tokenfilter not finding partial words


Problem Description

I have been playing around with ElasticSearch for a new project of mine. I have set the default analyzers to use the nGram token filter. This is my elasticsearch.yml file:

index:
    analysis:
        analyzer:
            default_index:            # applied when indexing documents
                tokenizer: standard
                filter: [standard, stop, mynGram]
            default_search:           # applied to query strings
                tokenizer: standard
                filter: [standard, stop]

        filter:
            mynGram:                  # emit every 1- to 10-character gram of each token
                type: nGram
                min_gram: 1
                max_gram: 10
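
If the yml settings are being picked up (they only take effect after a node restart), the analyze API is a quick way to inspect what default_index actually emits. Assuming an index named test exists, a check along these lines should list every 1- to 10-character gram of three, including the partial term hree:

curl -XGET 'localhost:9200/test/_analyze?analyzer=default_index' -d 'three'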

I created a new index and added the following document to it:

$ curl -XPUT http://localhost:9200/test/newtype/3 -d '{"text": "one two three four five six"}'
{"ok":true,"_index":"test","_type":"newtype","_id":"3"}

However, when I search using the query text:hree or text:ive or any other partial terms, ElasticSearch does not return this document. It returns the document only when I search for the exact term (like text:two).
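
For reference, those searches expressed as curl commands (query-string syntax via the q parameter, against the index created above) look like:

curl -XGET 'localhost:9200/test/_search?q=text:hree'   # no hits
curl -XGET 'localhost:9200/test/_search?q=text:two'    # returns the document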

I have also tried changing the config file such that default_search also uses the ngram token filter, but the result was the same. What am I doing wrong here and how do I correct it?

Solution

I'm not sure about the default_* settings, but applying a mapping that specifies index_analyzer and search_analyzer works. The key is the asymmetry: n-grams are generated at index time only, while the standard search analyzer leaves query terms whole, so a partial term like sea matches a stored gram directly:

# drop any previous index, then create one whose settings define a custom nGram analyzer
curl -XDELETE localhost:9200/twitter
curl -XPOST localhost:9200/twitter -d '
{"index":
  { "number_of_shards": 1,
    "analysis": {
      "filter": {
        "mynGram": {"type": "nGram", "min_gram": 2, "max_gram": 10}
      },
      "analyzer": {
        "a1": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "mynGram"]
        }
      }
    }
  }
}'
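
Before applying the mapping, the settings endpoint is a quick check that the custom analyzer and filter were registered as expected:

curl -XGET 'localhost:9200/twitter/_settings?pretty'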

# map the tweet type: nGram analyzer at index time, plain standard analyzer at search time
curl -XPUT localhost:9200/twitter/tweet/_mapping -d '{
    "tweet" : {
        "index_analyzer" : "a1",
        "search_analyzer" : "standard",
        "date_formats" : ["yyyy-MM-dd", "dd-MM-yyyy"],
        "properties" : {
            "user": {"type": "string", "analyzer": "standard"},
            "message" : {"type" : "string"}
        }
    }}'

# index a sample document
curl -XPUT 'http://localhost:9200/twitter/tweet/1' -d '{
    "user" : "kimchy",
    "post_date" : "2009-11-15T14:12:12",
    "message" : "trying out Elastic Search"
}'

# both partial-term searches should now return the document
curl -XGET localhost:9200/twitter/_search?q=ear
curl -XGET localhost:9200/twitter/_search?q=sea
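
By default q searches the generated _all field; the same query-string syntax takes a field prefix to hit one field directly, e.g.:

curl -XGET 'localhost:9200/twitter/_search?q=message:sea'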

# inspect the applied mapping
curl -XGET localhost:9200/twitter/_mapping
