Unexplainable count results in ElasticSearch
Problem description
- curl -XGET 'http://localhost:9200/catawiki_development/_status?pretty' returns 622.861
- curl -XGET 'http://localhost:9200/elasticsearch_development/_count?pretty' returns 241.156
- The match_all query returns the correct number of documents, 241.047
We have an index running with 241.047 items in it. These items can have any number of subitems, which are indexed as nested documents. The total number of subitems is 381.705.
Neither include_in_parent nor include_in_root is set in the mapping, which means that each nested document is indexed as an additional document. There should therefore be a total of 241.047 + 381.705 = 622.752 documents in the index.
When I run the _status Curl command above to look up the number of documents in the index, I get 622.861. That is not far off, but I'm wondering why it differs from the number I'm expecting.
In addition, when I run the _count Curl command above to get the number of root documents, I get 241.156, which is not the number I get when I run a match_all query and ask for the number of documents returned (241.047).
How can these differences be explained?
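The figures quoted in the question can be checked with a few lines of shell arithmetic. This is only a sketch that re-computes the numbers reported above, written without the thousands separator; the variable names are made up for illustration. Both surpluses come out to the same value, which hints that the two discrepancies may share a cause, although the numbers alone do not say what it is.

```shell
#!/bin/sh
# Figures copied from the question (thousands separators removed).
ROOT_DOCS=241047      # items, as returned by the match_all query
NESTED_DOCS=381705    # subitems indexed as nested documents
STATUS_DOCS=622861    # number reported by the _status call
COUNT_DOCS=241156     # number reported by the _count call

# Expected total: one document per item plus one per nested subitem.
EXPECTED_TOTAL=$((ROOT_DOCS + NESTED_DOCS))

# How far each reported number is above the expected one.
STATUS_DIFF=$((STATUS_DOCS - EXPECTED_TOTAL))
COUNT_DIFF=$((COUNT_DOCS - ROOT_DOCS))

echo "expected total:  $EXPECTED_TOTAL"
echo "_status surplus: $STATUS_DIFF"
echo "_count surplus:  $COUNT_DIFF"
```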
Recommended answer
The path of a count API request is quite different from the path of a normal search request. It is a shortcut that returns only the count of the documents matching a query, and that's it. It also differs from a search with search_type=count, which is effectively only the first part of a search: the search request is broadcast to all shards, but there is no reduce/fetch, since only the total number of matching documents has to be returned. With search_type=count you can also add facets etc. to the search request, which is something you cannot do with the count API.
That said, I'm not that surprised you see a difference, for the above reason. It would be nice to understand exactly what the problem is, though. The best approach would be to reproduce the problem with a small number of documents and open an issue that includes a curl recreation, so that we can have a look at it.
In the meantime, I would suggest using a search request with search_type=count if you have problems with the count API. That one is guaranteed to return the same number of documents as a normal search, simply because it is exactly the same logic.
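A minimal sketch of that workaround, assuming the host and index name from the question's first command and the Elasticsearch version discussed here (one that still accepts search_type=count). The helper name count_via_search is hypothetical, and the script only prints the URL; the request itself is sent when the helper is called against a running node.

```shell
#!/bin/sh
# Assumed values: host and index name are taken from the question.
ES_HOST='http://localhost:9200'
INDEX='catawiki_development'

# search_type=count: the request is broadcast to all shards,
# but there is no reduce/fetch of the matching documents.
SEARCH_URL="$ES_HOST/$INDEX/_search?search_type=count&pretty"

# Same match_all query as a normal search would use.
QUERY='{"query": {"match_all": {}}}'

# Hypothetical helper: sends the query; the number of matching
# documents is reported as hits.total in the JSON response.
count_via_search() {
  curl -s -XGET "$SEARCH_URL" -d "$QUERY"
}

echo "$SEARCH_URL"
```

Calling count_via_search on a machine where the node is reachable should report the same total as the normal match_all search, since, as the answer notes, both follow the same logic.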