Find documents with empty string value on Elasticsearch


Question

I've been trying to use Elasticsearch to filter only those documents that contain an empty string in their body. So far I've had no luck.

Before I go on, I should mention that I've already tried the many "solutions" spread around the Interwebz and StackOverflow.

Below is the query that I'm trying to run, followed by the variants I have tried:

{
    "query": {
        "filtered":{
            "filter": {
                "bool": {
                    "must_not": [
                        {
                            "missing":{
                                "field":"_textContent"
                            }
                        }
                    ]
                }
            }
        }
    }
}

I also tried the following:

{
    "query": {
        "filtered":{
            "filter": {
                "bool": {
                    "must_not": [
                        {
                            "missing":{
                                "field":"_textContent",
                                "existence":true,
                                "null_value":true
                            }
                        }
                    ]
                }
            }
        }
    }
}

As well as the following:

{
    "query": {
        "filtered":{
            "filter": {
                "missing": {"field": "_textContent"}
            }
        }
    }
}

None of the above worked. I get an empty result set even though I know for sure that there are records containing an empty string in that field.

If anyone can provide any help at all, I'll be very grateful.

Thanks!

Recommended answer

If you are using the default analyzer (standard), an empty string gives it nothing to analyze: no tokens are produced, so no term is indexed for the field and there is nothing for a query to match against. You therefore need to index the field verbatim (not analyzed).
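To make that reasoning concrete, here is a toy model of an inverted index in Python. It is only a sketch: the `analyze` function is a simplified stand-in for the standard analyzer (a lowercase word splitter), and `keep_verbatim`, `build_index`, and the sample `docs` are names invented for this illustration, not anything from Elasticsearch itself.

```python
import re
from collections import defaultdict

def analyze(text):
    """Toy stand-in for an analyzer: lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def keep_verbatim(text):
    """Toy stand-in for a not_analyzed field: the whole value becomes a single term."""
    return [text]

def build_index(docs, to_terms):
    """Map each term to the set of document ids whose value produced it."""
    index = defaultdict(set)
    for doc_id, value in docs.items():
        for term in to_terms(value):
            index[term].add(doc_id)
    return index

docs = {1: "", 2: "some content"}

analyzed = build_index(docs, analyze)
verbatim = build_index(docs, keep_verbatim)

# The empty string yields no tokens, so nothing in the analyzed index points at document 1.
print(sorted(analyzed))          # ['content', 'some']
print(analyzed.get("", set()))   # set()

# Indexed verbatim, the empty string is itself a term that a lookup can find.
print(verbatim.get("", set()))   # {1}
```

In this model the analyzed index simply has no entry that leads to document 1, which mirrors why a lookup for an empty value finds nothing on an analyzed field.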

First, add a mapping that indexes the field untokenized. If you also need a tokenized copy of the field indexed, you can use a Multi Field type.

PUT http://localhost:9200/test/_mapping/demo
{
  "demo": {
    "properties": {
      "_content": {
        "type": "string",
        "index": "not_analyzed"
      }
    }
  }
}
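The answer mentions a Multi Field type without showing it. The sketch below builds one possible version of that mapping as a Python dictionary and prints it as JSON. It assumes the Elasticsearch 1.x syntax in which sub-fields are declared under `fields`; the sub-field name `raw` is a hypothetical choice made for this illustration. With such a mapping, the empty-string lookup would target `_content.raw` rather than `_content`.

```python
import json

# Assumption: Elasticsearch 1.x multi-field syntax (sub-fields under "fields").
# The sub-field name "raw" is hypothetical and not taken from the original answer.
mapping = {
    "demo": {
        "properties": {
            "_content": {
                "type": "string",  # analyzed copy, usable for full-text search
                "fields": {
                    # untokenized copy, usable for exact matches such as ""
                    "raw": {"type": "string", "index": "not_analyzed"}
                },
            }
        }
    }
}

# A term filter for the empty string would then address the untokenized sub-field.
query = {
    "query": {
        "filtered": {
            "filter": {"term": {"_content.raw": ""}}
        }
    }
}

print(json.dumps(mapping, indent=2))
print(json.dumps(query, indent=2))
```

The trade-off is that the field is indexed twice, once tokenized and once verbatim, in exchange for supporting both full-text search and exact matching.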

Next, index a couple of documents.

POST http://localhost:9200/test/demo/1
{
  "_content": ""
}

POST http://localhost:9200/test/demo/2
{
  "_content": "some content"
}

Run the search:

POST http://localhost:9200/test/demo/_search
{
  "query": {
    "filtered": {
      "filter": {
        "term": {
          "_content": ""
        }
      }
    }
  }
}

This returns the document with the empty string:

{
    "took": 2,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
    },
    "hits": {
        "total": 1,
        "max_score": 0.30685282,
        "hits": [
            {
                "_index": "test",
                "_type": "demo",
                "_id": "1",
                "_score": 0.30685282,
                "_source": {
                    "_content": ""
                }
            }
        ]
    }
}
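The same sequence of requests can also be scripted. The following is a minimal sketch using only the Python standard library; the helper names `make_request` and `send` are invented for this illustration. It assumes a node is listening on localhost:9200 and that the `test` index already exists. As written it only builds and lists the requests, and `send` has to be called explicitly to run them.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # same host and port as the requests above

def make_request(method, path, body):
    """Build (but do not send) a JSON request against the REST API."""
    data = json.dumps(body).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)

def send(request):
    """Send a prepared request and decode the JSON response (needs a running node)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))

not_analyzed = {"type": "string", "index": "not_analyzed"}

steps = [
    # 1. Map the field so that its value is indexed verbatim.
    make_request("PUT", "/test/_mapping/demo",
                 {"demo": {"properties": {"_content": not_analyzed}}}),
    # 2. Index one document with an empty string and one with content.
    make_request("POST", "/test/demo/1", {"_content": ""}),
    make_request("POST", "/test/demo/2", {"_content": "some content"}),
    # 3. Look up the documents whose field is exactly the empty string.
    make_request("POST", "/test/demo/_search",
                 {"query": {"filtered": {"filter": {"term": {"_content": ""}}}}}),
]

for step in steps:
    print(step.get_method(), step.full_url)
    # print(send(step))  # uncomment when a node is available at BASE_URL
```

If the search runs immediately after indexing, the new documents may not be visible until the index has refreshed, so a short wait or an explicit refresh may be needed in practice.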
