How to address, delete or access child objects in ElasticSearch?


Problem description




How are child objects addressed in Elasticsearch?

Suppose we create two parents and three children. Note that there are two children with id c2 but with different parents:

curl -XPUT localhost:9200/test/parent/p1 -d'{
  "name": "Parent 1"
}'

curl -XPUT localhost:9200/test/parent/p2 -d'{
  "name": "Parent 2"
}'

curl -XPOST localhost:9200/test/child/_mapping -d '{
  "child":{
    "_parent": {"type": "parent"}
  }
}'

curl -XPOST localhost:9200/test/child/c1?parent=p1 -d '{
   "child": "Parent 1 - Child 1"
}'

curl -XPOST localhost:9200/test/child/c2?parent=p1 -d '{
   "child": "Parent 1 - Child 2"
}'

curl -XPOST localhost:9200/test/child/c2?parent=p2 -d '{
   "child": "Parent 2 - Child 2"
}'

If we search the children, we see that there are two children with an _id of c2:

curl -XGET localhost:9200/test/_search

{
  "_shards": {
    "failed": 0, 
    "successful": 5, 
    "total": 5
  }, 
  "hits": {
    "hits": [
      {
        "_id": "c1", 
        "_index": "test", 
        "_score": 1.0, 
        "_source": {
          "child": "Parent 1 - Child 1"
        }, 
        "_type": "child"
      }, 
      {
        "_id": "c2", 
        "_index": "test", 
        "_score": 1.0, 
        "_source": {
          "child": "Parent 1 - Child 2"
        }, 
        "_type": "child"
      }, 
      {
        "_id": "c2", 
        "_index": "test", 
        "_score": 1.0, 
        "_source": {
          "child": "Parent 2 - Child 2"
        }, 
        "_type": "child"
      }
    ], 
    "max_score": 1.0, 
    "total": 3
  }, 
  "timed_out": false, 
  "took": 1
}

How do I address p1/c2? Without the parent-child relationship, the _id can be used to access, change or delete a child object. In my case I let Elasticsearch create the ids of the objects.

To access the child objects, the _id is not enough:

curl -XGET localhost:9200/test/child/c2

I have to specify the parent as well:

curl -XGET localhost:9200/test/child/c2?parent=p1

In my system it is worse: some objects I can access directly without the parent, and others I cannot access at all. (Why?)

If I delete c2 (without parent!):

curl -XDELETE http://localhost:9200/test/child/c2

both children are deleted. To delete only one child, I have to use ?parent=p1:

curl -XDELETE http://localhost:9200/test/child/c2?parent=p1

Here are my questions.

  • What are the best practices for managing the identity of child objects?

  • Does that mean that I have to somehow put the parent id manually into the child object and then construct the address of the object as id?parent=parent_id?

  • Why does elasticsearch not return the parent id?

  • If I let Elasticsearch create the ids of the child objects, are they guaranteed to be unique, or can it happen that two children of different parents get the same id?

Solution

Child documents are just ordinary documents in Elasticsearch, with an additional _parent field that points to a document in the parent type.
When accessing child documents, whether indexing or getting them, you need to specify the parent id in the request. This is because the parent id is actually used to route the child document (see, for instance, the documentation on routing: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search.html#search-routing).
This means that the child document is sharded according to the parent id, so it resides on the same shard as the parent.
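
The routing idea can be illustrated with a toy model. The sketch below is not Elasticsearch's actual algorithm: toy_shard is a hypothetical helper, cksum only stands in for the real hash function, and the shard count of 5 is taken from the search response in the question. It only shows that the shard is derived from the routing value, which is the document's own id by default and the parent id when ?parent= is given.

```shell
# Toy model of shard selection: shard = hash(routing value) mod shard count.
# Assumption: cksum is only a stand-in; Elasticsearch uses its own hash function.
NUM_SHARDS=5  # the search response in the question reports 5 shards

toy_shard() {
  # $1 is the routing value: the document id, or the parent id if one was given
  crc=$(printf '%s' "$1" | cksum | cut -d' ' -f1)
  echo $(( crc % NUM_SHARDS ))
}

echo "parent p1 is on toy shard:             $(toy_shard p1)"
echo "c2 indexed with ?parent=p1 goes to:    $(toy_shard p1)"
echo "c2 requested without a parent goes to: $(toy_shard c2)"
```

In this model a request for c2 without a parent is hashed on c2 instead of p1, so it may be sent to a different shard than the one the document was indexed on.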

In your example above, what probably happened is that each of your c2 documents was created on a separate shard - one was sharded by its own id, and the other (where you specified the parent) according to the parent id.

This is important to understand so that you won't get inconsistencies between index, get and search. Remember to always pass the parent when you're working with child documents, so that they are routed to the right shard.

About the document id: you need to treat it like the id of any other document. This means that it needs to be unique; you can't have two documents with the same id, even if they have different parents.
You can either use the parent id as part of the child document id (as you suggested), or let ES generate a unique id if that makes sense in your use case. Document ids that ES generates are unique, no matter the parent.
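
One way to make child ids unique across parents, as suggested above, is to fold the parent id into the child id. The sketch below is only an illustration: child_url is a hypothetical helper and the _ separator is an arbitrary choice; the host, index and type names are the ones from the question.

```shell
# Hypothetical helper: build the URL of a child document whose id embeds the parent id.
# Arguments: $1 = parent id, $2 = child id that is only unique within that parent
child_url() {
  printf 'localhost:9200/test/child/%s_%s?parent=%s\n' "$1" "$2" "$1"
}

child_url p1 c2   # prints localhost:9200/test/child/p1_c2?parent=p1
child_url p2 c2   # prints localhost:9200/test/child/p2_c2?parent=p2
```

The result can then be passed to curl in quotes, for example curl -XPOST "$(child_url p1 c2)" -d '{"child": "Parent 1 - Child 2"}', so that the two c2 children no longer share an _id.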

About getting the parent field back: it is not returned by default, so you need to request it explicitly. Request it using the fields parameter (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-get.html#get-fields), or in search (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-fields.html).
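
As a rough sketch of what such a get request might look like: get_with_parent_url is a hypothetical helper, and it assumes that the fields parameter accepts the _parent metadata field in the Elasticsearch version in use, which is not verified here.

```shell
# Hypothetical helper: URL for getting a child document together with its _parent field.
# Arguments: $1 = child id, $2 = parent id (needed for routing)
# Assumption: fields=_parent is accepted by the Elasticsearch version in use.
get_with_parent_url() {
  printf 'localhost:9200/test/child/%s?parent=%s&fields=_parent,_source\n' "$1" "$2"
}

get_with_parent_url c2 p1
```

The URL contains ? and &, so it should be quoted when passed to curl, for example curl -XGET "$(get_with_parent_url c2 p1)".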
