Why do I need "store":"yes" in elasticsearch?


Question

I really don't understand why the core types documentation says the following in its attribute descriptions (for a number, for example):



  1. store - Set to yes to store actual field in the index, no to not store it. Defaults to no (note, the JSON document itself is stored, and it can be retrieved from it)
  2. index - Set to no if the value should not be indexed. In this case, store should be set to yes, since if it’s not indexed and not stored, there is nothing to do with it


The two bold parts seem to contradict. If "index":"no", "store":"no" I could still get the value from the source. This could be a good use if I have a field containing a URL for example. No?


I had a little experiment, where I had two mappings, in one a field was set to "store":"yes" and in the other to "store":"no".


In both cases I could still specify in my query:

{"query":{"match_all":{}}, "fields":["my_test_field"]}


and I got the same answer, returning the field.

I thought that if "store" is set to "no" it would mean I could not retrieve the specific field, but had to get the whole _source and parse it on the client side.


So, what benefit is there in setting "store" to "yes"? Is it only relevant if I exclude the field from the "_source" field explicitly?

Recommended answer



I thought that if "store" is set to "no" it would mean I could not retrieve the specific field, but had to get the whole _source and parse it on the client side.


That's exactly what elasticsearch does for you when a field is not stored (default) and the _source field is enabled (default too).
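To make that concrete, here is a minimal conceptual sketch in Python of pulling a requested field out of a stored _source. It is only an illustration of the idea, not Elasticsearch's actual code; the function name `extract_fields` and the sample document are made up for this example.

```python
import json

def extract_fields(source_json, fields):
    """Parse a stored _source string and keep only the requested top-level fields."""
    source = json.loads(source_json)
    return {name: source[name] for name in fields if name in source}

# Hypothetical document, kept as the raw JSON string that was sent at index time.
stored_source = '{"my_test_field": "http://example.com", "title": "hello"}'

# The client asks for one field; the server parses _source and returns just that field.
result = extract_fields(stored_source, ["my_test_field"])
print(result)
```

The client never sees the parsing step, which is why the query in the question returns the same result whether or not the field is stored.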

You usually send a field to elasticsearch because you want to either search on it or retrieve it. But it's true that if you don't store the field explicitly and you don't disable the source, you can still retrieve the field using the _source. This means that in some cases it might actually make sense to have a field that is neither indexed nor stored.

When you store a field, that happens in the underlying Lucene. Lucene is built around an inverted index, which allows fast full-text search and returns document ids for a given text query. Besides the inverted index, Lucene has a separate storage area where field values can be kept so that they can be retrieved given a document id. You usually store in Lucene the fields that you want to return as search results. Elasticsearch doesn't require you to store every field you want to return, because by default it stores every document you send to it (as the _source), so it is always able to return everything you sent as a search result.
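The split between "indexed" and "stored" can be illustrated with a toy model in Python. This is a deliberately simplified sketch, not how Lucene is implemented; the class `ToyIndex` and its methods are hypothetical names invented for the example.

```python
from collections import defaultdict

class ToyIndex:
    def __init__(self):
        self.postings = defaultdict(set)  # term -> doc ids (the inverted index)
        self.stored = {}                  # doc id -> {field: original value}

    def add(self, doc_id, doc, indexed=(), stored=()):
        # Indexed fields are split into terms and become searchable.
        for field in indexed:
            for term in str(doc[field]).lower().split():
                self.postings[term].add(doc_id)
        # Stored fields keep their original value for later retrieval by doc id.
        self.stored[doc_id] = {field: doc[field] for field in stored}

    def search(self, term):
        return sorted(self.postings.get(term.lower(), set()))

    def get_stored(self, doc_id, field):
        return self.stored[doc_id].get(field)

index = ToyIndex()
index.add(1, {"title": "Fast full text search", "url": "http://example.com/a"},
          indexed=["title"], stored=["url"])
index.add(2, {"title": "Slow search", "url": "http://example.com/b"},
          indexed=["title"])

hits = index.search("search")                         # both documents match
urls = [index.get_stored(doc_id, "url") for doc_id in hits]
print(hits, urls)
```

In this toy, document 2 can be found by searching but its url cannot be returned, because nothing was stored for it. Elasticsearch avoids that situation by keeping the whole document in _source by default.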

In just a few cases it might be useful to store fields explicitly in Lucene: when the _source field is disabled, or when you want to avoid parsing it, even though the parsing is done automatically by elasticsearch. Keep in mind, though, that retrieving many stored fields from Lucene might require one disk seek per field, while retrieving only the _source and parsing it to extract the needed fields is a single disk seek and is faster in most cases.
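As a sketch of the first case, a mapping with _source disabled and one explicitly stored field might look like the following. It is built as a Python dict and printed as JSON, and it uses the older "yes"/"no" string syntax quoted in the question; the type name `my_type` and the `title` field are assumptions made for illustration.

```python
import json

# Hypothetical type mapping: _source is disabled, so a field that should be
# returned in search results has to be stored explicitly.
mapping = {
    "my_type": {
        "_source": {"enabled": False},
        "properties": {
            # Not searchable, but retrievable because it is stored.
            "my_test_field": {"type": "string", "index": "no", "store": "yes"},
            # Searchable, but not retrievable: not stored and no _source.
            "title": {"type": "string"},
        },
    }
}

mapping_json = json.dumps(mapping, indent=2)
print(mapping_json)
```

With a mapping like this, requesting `my_test_field` in the query's "fields" list would rely on the stored value, since there is no _source to fall back on.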
