How to avoid cross-object search behavior with nested types in Elasticsearch


Question

I am trying to determine the best way to index a document in Elasticsearch. I have a document, Doc, which has some fields:

Doc
  created_at
  updated_at
  field_a
  field_b

But Doc will also have some fields specific to individual users. For example, field_x will have value 'A' for user 1 and value 'B' for user 2. Each doc will have a very limited number of users (typically 2, up to ~10). When a user searches on field_x, they must search only the value that belongs to them. I have been exploring nested types in ES.

Doc
  created_at
  updated_at
  field_x: [{
    user: 1
    field_x: A
  },{
    user: 2
    field_x: B
  }]

When user 1 searches on field_x for value 'A', this doc should be a hit. When user 1 searches for value 'B', it should not.

However, according to the docs:

One of the problems when indexing inner objects that occur several times in a doc is that "cross object" search match will occur
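
To make the quoted problem concrete, here is a toy model in plain Python. It is not Elasticsearch code. It assumes that arrays of inner objects under the default object mapping are flattened into one multi-valued field per name, while the nested type keeps each inner object separate. The function names are hypothetical and exist only for this illustration.

```python
# Toy model of inner-object matching; this is NOT Elasticsearch code.
doc = {
    "field_x": [
        {"user": "1", "field_x": "A"},
        {"user": "2", "field_x": "B"},
    ]
}


def flatten(inner_objects):
    """Pool values per field name, as assumed for the default object mapping."""
    flat = {}
    for obj in inner_objects:
        for key, value in obj.items():
            flat.setdefault(key, set()).add(value)
    return flat


def matches_flattened(document, user, value):
    """Each condition is checked against the pooled values independently."""
    flat = flatten(document["field_x"])
    return user in flat["user"] and value in flat["field_x"]


def matches_nested(document, user, value):
    """Both conditions must hold inside the same inner object."""
    return any(
        obj["user"] == user and obj["field_x"] == value
        for obj in document["field_x"]
    )


print(matches_flattened(doc, "1", "B"))  # True: the cross-object match
print(matches_nested(doc, "1", "B"))     # False
print(matches_nested(doc, "1", "A"))     # True
```

In the flattened model, user 1 "matches" value B because the two values come from different inner objects; this is the behavior the question wants to avoid.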

Is there a way to avoid this behavior with nested types, or should I explore another type?

Additional information regarding the performance of such queries would be very valuable. From reading the docs, they state that nested queries do not differ much from regular queries in terms of performance. If anyone has real-world experience with this, I would love to hear it.

Recommended answer

The nested type is what you are looking for, and don't worry too much about performance.

Before indexing your documents, you need to set the mapping for your documents:

curl -XDELETE localhost:9200/index
curl -XPUT localhost:9200/index
curl -XPUT localhost:9200/index/type/_mapping -d '{
    "type": {
        "properties": {
            "field_x": {
                "type": "nested",
                "include_in_parent": false,
                "include_in_root": false,
                "properties": {
                    "user": {
                        "type": "string"
                    },
                    "field_x": {
                        "type": "string",
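                        "index" : "not_analyzed"
                    }
                }
            }
        }
    }
}'

Note on the "index" : "not_analyzed" line: if A and B are only placeholders for this example and your real documents contain proper words that you search on, remove that line and let Elasticsearch analyze the field.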

Then, index your document:

curl -XPUT http://localhost:9200/index/type/1 -d '
{ 
    "field_a": "foo",
    "field_b": "bar",
    "field_x" : [{
        "user" : "1",
        "field_x" : "A"
    },
    {
        "user" : "2",
        "field_x" : "B"
    }]
}'

And run your query:

curl -XGET localhost:9200/index/type/_search -d '{ 
    "query": {
        "nested" : {
            "path" : "field_x",
            "score_mode" : "avg",
            "query" : {
                "bool" : {
                    "must" : [
                        {
                            "term": {
                                "field_x.user": "1"
                            }
                        },
                        {
                            "term": {
                                "field_x.field_x": "A"
                            }
                        }
                    ]
                }
            }
        }
    }
}';

This will result in:

{"took":13,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":1,"max_score":1.987628,"hits":[{"_index":"index","_type":"type","_id":"1","_score":1.987628, "_source" : 
{ 
    "field_a": "foo",
    "field_b": "bar",
    "field_x" : [{
        "user" : "1",
        "field_x" : "A"
    },
    {
        "user" : "2",
        "field_x" : "B"
    }]
}}]}}

However, the query

curl -XGET localhost:9200/index/type/_search -d '{ 
    "query": {
        "nested" : {
            "path" : "field_x",
            "score_mode" : "avg",
            "query" : {
                "bool" : {
                    "must" : [
                        {
                            "term": {
                                "field_x.user": "1"
                            }
                        },
                        {
                            "term": {
                                "field_x.field_x": "B"
                            }
                        }
                    ]
                }
            }
        }
    }
}';

won't return any results:

{"took":6,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":0,"max_score":null,"hits":[]}}
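
The two queries above differ only in the value being searched, so the request body can be generated rather than written by hand. The following is a minimal sketch using only the Python standard library. The helper name build_nested_query is hypothetical, and it assumes the mapping and field names from this answer. It only builds and prints the JSON body; it does not contact a cluster.

```python
import json


def build_nested_query(user, value, path="field_x"):
    """Build a search body in which ONE nested object must satisfy both terms."""
    return {
        "query": {
            "nested": {
                "path": path,
                "score_mode": "avg",
                "query": {
                    "bool": {
                        "must": [
                            {"term": {path + ".user": user}},
                            {"term": {path + ".field_x": value}},
                        ]
                    }
                },
            }
        }
    }


# Same body as the first curl query above (user 1, value A).
body = build_nested_query("1", "A")
print(json.dumps(body, indent=4))
```

The printed JSON can be passed as the -d argument of the curl search command shown earlier.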
