What is the difference between _source and _all in Elasticsearch?


Question




The difference between the two, both of which seem to hold all of the fields, eludes me.

If my document has:

{"mydoc":
  {"properties":
      {"name":{"type":"string","store":"true"},
       "number":{"type":"long","store":"false"},
       "title":{"type":"string","include_in_all":"false","store":"true"}
      }
  }
}

I understand that _source is a field that has all the fields. But so does _all? Does this mean that "name" is saved several times (twice, in _source and in _all?), increasing the disk space the document takes?

Is "name" stored once for the field, once for _source, and once for _all? What about "number": is it stored in _all, even though not in _source?

When should I use _source in my query, and when _all?

What is the use case where I can disable "_all", and what functionality would I then lose?

Solution

It's pretty much the same as the difference between indexed fields and stored fields in Lucene.

You use indexed fields when you want to search on them, while you store fields that you want to return as search results.

The _source field is meant to store the whole source document that was originally sent to Elasticsearch. It is used as the search result, to be retrieved. You can't search on it. In fact, it is a stored field in Lucene and is not indexed.

The _all field is meant to index all the content that comes from all the fields your documents are composed of. You can search on it but never return it, since it is indexed but not stored in Lucene.

There's no redundancy: the two fields are meant for different use cases and are stored in different places within the Lucene index. The _all field becomes part of what we call the inverted index, which is used to index text and execute full-text searches against it, while the _source field is just stored as part of the Lucene documents.
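To make the indexed-versus-stored split concrete, here is a toy model in Python. It is only a sketch, not how Lucene is implemented: the ToyIndex class and its method names are made up for illustration, the tokenizer is a plain regex, and exclude_from_all stands in loosely for include_in_all:false.

```python
import re
from collections import defaultdict


class ToyIndex:
    """Toy model of one index: a catch-all inverted index plus a store of originals."""

    def __init__(self):
        self.all_terms = defaultdict(set)  # term -> doc ids; plays the role of _all
        self.sources = {}                  # doc id -> original doc; plays the role of _source

    def index(self, doc_id, doc, exclude_from_all=()):
        # Keep the original document untouched so it can be returned later.
        self.sources[doc_id] = doc
        # Feed the text of every included field into one catch-all inverted index.
        for field, value in doc.items():
            if field in exclude_from_all:
                continue
            for term in re.findall(r"\w+", str(value).lower()):
                self.all_terms[term].add(doc_id)

    def search_all(self, term):
        # Searching consults only the inverted index; it yields ids, not content.
        return sorted(self.all_terms.get(term.lower(), set()))

    def get_source(self, doc_id):
        # Retrieval consults only the stored document.
        return self.sources[doc_id]


idx = ToyIndex()
idx.index(1, {"name": "Alice Smith", "number": 42, "title": "Engineer"},
          exclude_from_all=("title",))
hits = idx.search_all("alice")
```

In this model the title is still returned with the document through get_source, yet a catch-all search for "engineer" finds nothing, because the title never entered the inverted index.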

You would never use the _source field in your queries; it only matters when you get results back, since it is what Elasticsearch returns by default. A few features depend on the _source field, and you lose them if you disable it. One of them is the update API. Also, if you disable it, you need to remember to set store:yes in your mapping for every field that you want returned in search results. I would say don't disable it unless it bothers you, since it is really helpful in a lot of cases. Another common use case is reindexing your data: you can retrieve all your documents from Elasticsearch itself and resend them to another index.
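The reindexing case can be sketched without a running cluster. The helper below is hypothetical (its name and the sample hits are invented for illustration); it assumes hits shaped like the entries of a search response and emits actions using the _index/_id/_source keys that bulk helpers such as the one in the official Python client accept.

```python
def hits_to_bulk_actions(hits, target_index):
    """Turn search hits into index actions for another index, reusing each hit's _source."""
    actions = []
    for hit in hits:
        if "_source" not in hit:
            # With _source disabled there is no original document to resend.
            raise ValueError(f"hit {hit.get('_id')!r} has no _source; cannot reindex it")
        actions.append({
            "_index": target_index,
            "_id": hit["_id"],
            "_source": hit["_source"],
        })
    return actions


# Hypothetical hits, shaped like the entries of a search response's hits.hits array.
sample_hits = [
    {"_index": "old", "_id": "1", "_source": {"name": "Alice", "number": 42}},
    {"_index": "old", "_id": "2", "_source": {"name": "Bob", "number": 7}},
]
actions = hits_to_bulk_actions(sample_hits, "new")
```

The ValueError branch shows what disabling _source costs here: without the stored original there is nothing to send to the new index.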

On the other hand, the _all field is just a default catch-all field that you can use when you want to search across all available fields without specifying them all in your queries. It's handy, but I wouldn't rely on it too much in production, where it's better to run more complex queries on different fields, each with its own weight. You might want to disable it if you don't use it; in my opinion this has a smaller impact than disabling _source.
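As a rough illustration of the two query styles, the sketch below builds both search bodies as plain Python dicts. The function names and boost values are assumptions made for the example, and the _all variant only applies to Elasticsearch versions that still have that field (it was deprecated in 6.0 and removed in 7.0).

```python
def all_field_query(text):
    """Search body that matches text against the catch-all _all field."""
    return {"query": {"match": {"_all": text}}}


def weighted_fields_query(text, boosts):
    """Search body that matches text against named fields, each with its own boost."""
    # "name^3" is the field^boost notation: matches on name are scored with a boost of 3.
    fields = [f"{field}^{boost}" for field, boost in boosts.items()]
    return {"query": {"multi_match": {"query": text, "fields": fields}}}


simple = all_field_query("alice")
weighted = weighted_fields_query("alice", {"name": 3, "title": 1})
```

The second form needs the field names up front, but it lets a match on name count for more than a match on title, which a single catch-all field cannot express.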
