Treat child as field of parent in elastic search query


Problem description


I am reading the docs for elasticsearch and this [page][1] talks about mapping a child to a parent type using _parent.

If I have children of type email attached to parents of type account:

Fields in each type:

account (http://localhost:9200/myapp/account/1)
========
id
name
some_other_info
state

email (http://localhost:9200/myapp/email/1?parent=1)
========
id
email

  • How can I search on the name field of account and the email field of email provided that the state of account is active?

  • Is there a way to get all the children (of a certain type or of any type) a parent owns?

  • When indexing a child document, is it possible to pass the parent as an object property in the JSON data, as opposed to it being part of the query string?


After trying imotov's suggestion, I came up with this query:

This is executed on http://localhost:9200/myapp/account/_search

{
  "query": {
    "bool": {
      "must": [
        {
          "prefix": {
            "name": "a"
          }
        },
        {
          "term": {
            "statuses": "active"
          }
        }
      ],
      "should": [
        {
          "has_child": {
            "type": "emailaddress",
            "query": {
              "prefix": {
                "email": "a"
              }
            }
          }
        }
      ]
    }
  }
}

The problem is that the above does not give me any accounts where the email matches.

The effect I want is essentially this:

  • There is one search box
  • Users start typing and the search box autocompletes.
  • The user's query is checked against the name of the account or any of the emailaddress type.
  • If accounts were matched, just return them. If an emailaddress was matched, return its parent account.
  • Limit to a maximum of x (say 10) accounts for each search.

So, I basically need to be able to OR the search between 2 types and return the parent type of matches.


Test data:

curl -XPUT http://localhost:9200/test/account/1 -d '{
    "name": "John Smith",
    "statuses": "active"
}'

curl -XPUT http://localhost:9200/test/account/2 -d '{
    "name": "Peter Smith",
    "statuses": "active"
}'

curl -XPUT http://localhost:9200/test/account/3 -d '{
    "name": "Andy Smith",
    "statuses": "active"
}'

# Set up mapping for parent/child relationship

curl -XPUT 'http://localhost:9200/test/email/_mapping' -d '{
    "emails" : {
        "_parent" : {"type" : "account"}
    }
}'

curl -XPUT http://localhost:9200/test/email/1?parent=1 -d '{
    "email": "john@smith.com"
}'

curl -XPUT http://localhost:9200/test/email/2?parent=1 -d '{
    "email": "admin@mycompany.com"
}'

curl -XPUT http://localhost:9200/test/email/3?parent=1 -d '{
    "email": "abcd@efg.com"
}'

curl -XPUT http://localhost:9200/test/email/4?parent=2 -d '{
    "email": "peter@peter.com"
}'

curl -XPUT http://localhost:9200/test/email/5?parent=3 -d '{
    "email": "andy@yahoo.com"
}'

curl -XPUT http://localhost:9200/test/email/6?parent=3 -d '{
    "email": "support@mycompany.com"
}'


imotov's solution worked for me. Another solution I have found is to query accounts for status = active, then run a bool filter on the result and use has_child on the child type and prefix on name inside the bool filter.

Solution

An important difference between elasticsearch and relational databases is that elasticsearch cannot perform joins. In elasticsearch you are always searching a single index or a union of indices. But in the case of a parent/child relationship, it's possible to limit the results in the parent index using a query on the child index. For example, you can execute this query on the account type:

{
    "bool": {
        "must": [
            { 
                "text" : { "name": "foo" } 
            }, { 
                "term" : { "state": "active" } 
            }, {
                "has_child": {
                    "type": "email",
                    "query": {
                        "text": {"email": "bar" }
                    }
                }
            }
        ]
    }
}

This query will return only the parent documents (no child documents will be returned). You can use a parent id returned by this query to find all children of that parent using the field _parent, which is stored and indexed by default.

{
    "term" : { "_parent": "1" } 
}

Or you can limit your results only to the children that contain the word bar in the field email:

{
    "bool": {
        "must": [
            { 
                "term" : { "_parent": "1" } 
            }, { 
                "text" : { "email": "bar" } 
            }
        ]
    }
}
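
The fragments above are query clauses; in a search request they sit under a top-level "query" key, as in the request shown in the question. A minimal sketch of the children lookup as a full request body, assuming the test index and email type from the question. The helper name children_of_parent_body is made up for illustration, and the id is assumed to contain no quotes or backslashes since nothing is JSON-escaped here.

```shell
#!/bin/sh
# Hypothetical helper: print a full search body that finds the children
# of one parent. Assumes the id needs no JSON escaping.
children_of_parent_body() {
  parent_id=$1
  cat <<EOF
{
  "query": {
    "term": { "_parent": "$parent_id" }
  }
}
EOF
}

# Print the body for parent account 1
children_of_parent_body 1

# With a running node, it could be sent to the child type's endpoint:
# children_of_parent_body 1 | curl 'http://localhost:9200/test/email/_search' -d @-
```

Searching on /test/email/_search restricts the hits to one child type; searching on /test/_search instead should return children of any type that point at this parent.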

I don't think it's possible to specify the parent in the JSON unless you are using _bulk indexing.
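
For the _bulk case, a rough sketch of what such a request body might look like. As far as I know, the Elasticsearch versions that support _parent mappings accept a "_parent" key in the bulk action line, so the parent id travels in the body rather than in the URL. The ids 7 and 8, the example.com addresses and the file name are made up for illustration; check the bulk API documentation for your version before relying on this.

```shell
#!/bin/sh
# Sketch: build a _bulk body where each action line carries the parent id
# in "_parent" instead of a ?parent= query string parameter.
# Ids, addresses and the file location are illustrative assumptions.
BULK_FILE="${TMPDIR:-/tmp}/email_bulk.ndjson"
cat > "$BULK_FILE" <<'EOF'
{ "index": { "_index": "test", "_type": "email", "_id": "7", "_parent": "1" } }
{ "email": "john.smith@example.com" }
{ "index": { "_index": "test", "_type": "email", "_id": "8", "_parent": "2" } }
{ "email": "peter.smith@example.com" }
EOF

# Show what would be sent: one action line followed by one source line
cat "$BULK_FILE"

# With a running node (--data-binary keeps the newlines the bulk API needs):
# curl -XPOST 'http://localhost:9200/_bulk' --data-binary "@$BULK_FILE"
```

Note that the parent id still sits in the action metadata line, not inside the document source itself.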

This is how email lookup can be implemented using test data provided in the question:

#!/bin/sh
curl -XDELETE 'http://localhost:9200/test' && echo 
curl -XPOST 'http://localhost:9200/test' -d '{
    "settings" : {
        "number_of_shards" : 1,
        "number_of_replicas" : 0
    },
    "mappings" : {
      "account" : {
        "_source" : { "enabled" : true },
        "properties" : {
          "name": { "type": "string", "analyzer": "standard" },
          "statuses": { "type": "string",  "index": "not_analyzed" }
        }
      },
      "email" : {
        "_parent" : {
          "type" : "account"
        },
        "properties" : {
          "email": { "type": "string",  "analyzer": "standard" }
        }
      }
    }
}' && echo

curl -XPUT 'http://localhost:9200/test/account/1' -d '{
    "name": "John Smith",
    "statuses": "active"
}'

curl -XPUT 'http://localhost:9200/test/account/2' -d '{
    "name": "Peter Smith",
    "statuses": "active"
}'

curl -XPUT 'http://localhost:9200/test/account/3' -d '{
    "name": "Andy Smith",
    "statuses": "active"
}'

# Index the child documents; the parent id is passed in the ?parent= query string

curl -XPUT 'http://localhost:9200/test/email/1?parent=1' -d '{
    "email": "john@smith.com"
}'

curl -XPUT 'http://localhost:9200/test/email/2?parent=1' -d '{
    "email": "admin@mycompany.com"
}'

curl -XPUT 'http://localhost:9200/test/email/3?parent=1' -d '{
    "email": "abcd@efg.com"
}'

curl -XPUT 'http://localhost:9200/test/email/4?parent=2' -d '{
    "email": "peter@peter.com"
}'

curl -XPUT 'http://localhost:9200/test/email/5?parent=3' -d '{
    "email": "andy@yahoo.com"
}'

curl -XPUT 'http://localhost:9200/test/email/6?parent=3' -d '{
    "email": "support@mycompany.com"
}'

curl -XPOST 'http://localhost:9200/test/_refresh'
echo
curl 'http://localhost:9200/test/account/_search' -d '{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "statuses": "active"
          }
        }
      ],
      "should": [
        {
          "prefix": {
            "name": "a"
          }
        },
        {
          "has_child": {
            "type": "email",
            "query": {
              "prefix": {
                "email": "a"
              }
            }
          }
        }
      ],
      "minimum_number_should_match" : 1
    }
  }
}' && echo
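
The question also asks for an autocomplete box capped at x accounts per search. A minimal sketch of how the query above could be parameterised, assuming the same test index. The helper name account_search_body is hypothetical, "size" is the standard search parameter for capping the number of hits, and the input is assumed to contain no double quotes or backslashes because no JSON escaping is done. The input is lowercased because prefix queries are not analyzed, while the standard analyzer indexes lowercased terms.

```shell
#!/bin/sh
# Hypothetical helper: print the search body for the text typed so far.
# $1 = typed prefix, $2 = maximum number of accounts (defaults to 10).
account_search_body() {
  # prefix queries are not analyzed, so lowercase the input to match
  # the lowercased terms produced by the standard analyzer
  prefix=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  size=${2:-10}
  cat <<EOF
{
  "size": $size,
  "query": {
    "bool": {
      "must": [
        { "term": { "statuses": "active" } }
      ],
      "should": [
        { "prefix": { "name": "$prefix" } },
        {
          "has_child": {
            "type": "email",
            "query": { "prefix": { "email": "$prefix" } }
          }
        }
      ],
      "minimum_number_should_match": 1
    }
  }
}
EOF
}

# Print the body for the input "An", capped at 10 accounts
account_search_body "An" 10

# With a running node, the body could be sent like this:
# account_search_body "An" 10 | curl 'http://localhost:9200/test/account/_search' -d @-
```

Keeping the status check in must and the two prefix clauses in should, with at least one should clause required, is what gives the "name OR email" behaviour while still returning only active parent accounts.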
