Elasticsearch Query Filter for Word Count


Question

I am currently looking for a way to return documents that have at most n words in a certain field.

For a result set containing documents with fewer than three words in the "name" field, the query could look like the one below, but as far as I know there is no such thing as word_count.

Does anyone know how to handle this, perhaps in a different way?

GET myindex/myobject/_search
{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            {
              "word_count": {
                "name": {
                  "lte": 3
                }
              }
            }
          ]
        }
      },
      "query": {
        "match_all" : { }
      }
    }
  }
}


Recommended answer

You can use the token_count data type to index the number of tokens in a given field, and then run a range query against that count.
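To get a feel for what a token_count sub-field would hold, here is a minimal Python sketch that approximates the count locally for the two sample names indexed further down in this answer. The helper name `approx_token_count` is made up for illustration, and the regular expression is only a rough stand-in for the standard analyzer, which follows Unicode text segmentation rules and may count some inputs differently.

```python
import re

# Rough approximation of the standard analyzer: each run of word characters
# (letters, digits, underscore) counts as one token. This is an assumption
# for illustration, not the analyzer's actual algorithm.
TOKEN_PATTERN = re.compile(r"\w+")


def approx_token_count(text: str) -> int:
    """Return an approximate number of tokens in `text`."""
    return len(TOKEN_PATTERN.findall(text))


# Sample names taken from the documents indexed in this answer.
docs = {1: "The quick brown fox", 2: "brown fox"}
counts = {doc_id: approx_token_count(name) for doc_id, name in docs.items()}
print(counts)  # {1: 4, 2: 2}
```

With these counts, a filter for fewer than three words should keep only the second name.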

# 1. create the index/mapping with a token_count sub-field
#    (mapping syntax as in Elasticsearch 2.x; 5.x and later use "text" instead of "string")
PUT myindex
{
  "mappings": {
    "myobject": {
      "properties": {
        "name": { 
          "type": "string",
          "fields": {
            "word_count": { 
              "type":     "token_count",
              "analyzer": "standard"
            }
          }
        }
      }
    }
  }
}

# 2. index some documents

PUT myindex/myobject/1
{
   "name": "The quick brown fox"
}
PUT myindex/myobject/2
{
   "name": "brown fox"
}

# 3. the following query will only return document 2
POST myindex/_search
{
  "query": {
    "range": {
      "name.word_count": { 
        "lt": 3  
      }
    }
  }
}
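The question asks for "at most n words" while the query above uses a strict "lt" bound, so the choice between "lt" and "lte" matters. The following Python sketch builds the request body for either case. The function name `word_count_query` and its parameters are hypothetical, and it only produces the JSON body; sending it to a cluster is left out.

```python
import json


def word_count_query(field: str, max_words: int, inclusive: bool = False) -> dict:
    """Build a range query body against a token_count sub-field.

    inclusive=False -> "lt"  (fewer than max_words)
    inclusive=True  -> "lte" (at most max_words)
    """
    if max_words < 0:
        raise ValueError("max_words must not be negative")
    operator = "lte" if inclusive else "lt"
    return {"query": {"range": {field: {operator: max_words}}}}


# Same body as the query in step 3: fewer than three words in "name".
body = word_count_query("name.word_count", 3)
print(json.dumps(body, indent=2))
```

Calling `word_count_query("name.word_count", 3, inclusive=True)` would instead produce an "lte" bound, which also matches names with exactly three words.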
