Unexplainable count results in ElasticSearch


Problem description


    We have an index running with 241.047 items in it. These items can have any number of subitems, which are indexed as nested documents. The total number of subitems is 381.705.

    Neither include_in_parent nor include_in_root is set in the mapping, which means that each nested document is indexed as an additional document. This should mean that there is a total of 241.047 + 381.705 = 622.752 documents in the index.

    When I run the following curl command to look up the number of documents in the index, I get a different number. It's not far off, but I'm wondering why it isn't returning the number I'm expecting.

    • curl -XGET 'http://localhost:9200/catawiki_development/_status?pretty' returns 622.861

    Besides that, when I run a curl command to get the number of root documents, I get a different number than when I run a match_all query and ask for the number of documents returned:

    • curl -XGET 'http://localhost:9200/elasticsearch_development/_count?pretty' returns 241.156
    • The match_all query returns the correct number of documents, 241.047
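For reference, the arithmetic behind these figures can be checked with a few lines of shell. This is only a sketch of the sums, not an explanation of the cause. The variable names are made up for illustration, and the numbers are the ones quoted above, written without the "." thousands separator.

```shell
# Figures quoted in the question (variable names are illustrative only).
root_docs=241047      # items (root documents); also the match_all total
nested_docs=381705    # subitems indexed as nested documents
status_docs=622861    # number reported by the _status call
count_docs=241156     # number reported by the _count call

# Expected total if every nested document is indexed as an extra document.
expected_total=$((root_docs + nested_docs))

# How far each reported number is above the expected one.
status_gap=$((status_docs - expected_total))
count_gap=$((count_docs - root_docs))

echo "expected total:  $expected_total"
echo "_status surplus: $status_gap"
echo "_count surplus:  $count_gap"
```

Both surpluses come out to 109 documents, which suggests, but does not prove, that the two discrepancies share a single cause.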

    How can these differences be explained?

Solution

    The path of a count API request is quite different from the path of a normal search request. In fact, it is a shortcut that lets you get only the count of the documents matching a query, and that's it. It also differs from a search with search_type=count, which is effectively only the first part of a search: the search request is broadcast to all shards, but there is no reduce/fetch phase, since we only want to return the total number of matching documents. You can also add facets etc. to a search request (with search_type=count as well), which is something you cannot do with the count API.

    That said, I'm not that surprised you see a difference, for the reason above. It would be nice to understand exactly what the problem is, though. The best approach would be to reproduce the problem with a small number of documents and open an issue that includes a curl recreation, so that we can have a look at it.

    In the meantime, I would suggest using a search request with search_type=count if you have problems with the count API. That one is guaranteed to return the same number of documents as a normal search, simply because it is exactly the same logic.
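As a rough sketch of that suggestion, the two requests could be set side by side like this. The host and index name are taken from the question's _count command. The variable names, the function name, and the match_all request body are assumptions for illustration, and the sketch assumes an Elasticsearch version old enough to still accept search_type=count.

```shell
# Assumed host and index, copied from the question's _count command.
ES_URL='http://localhost:9200'
INDEX='elasticsearch_development'

# Count API request, as used in the question.
COUNT_API_URL="$ES_URL/$INDEX/_count?pretty"

# Search request that only counts hits, as suggested in the answer.
COUNT_SEARCH_URL="$ES_URL/$INDEX/_search?search_type=count&pretty"

# match_all body: the query the asker reports as returning the correct total.
QUERY='{"query": {"match_all": {}}}'

# Hypothetical helper; it is only defined here, not called,
# so nothing is sent until you run it against a live node.
count_via_search() {
  curl -s -XGET "$COUNT_SEARCH_URL" -d "$QUERY"
}

# Print both commands for comparison.
echo "curl -XGET '$COUNT_API_URL'"
echo "curl -XGET '$COUNT_SEARCH_URL' -d '$QUERY'"
```

Calling count_via_search against a running node should return a response whose hits.total field holds the count, which can then be compared with the number reported by the count API.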
