Different Elasticsearch results for the same query


Question

I've set up Elasticsearch as one cluster with 4 nodes. Number of shards per index: 1; number of replicas per index: 3.

When I run a simple query like the following one multiple times, I get different results (different total hits and different top 10 documents):

http://localhost:9200/index_name/_search?q=term

Is the data different on each shard? I'd like all shards to be up to date. What can I do?

This is the result of /_cluster/health:

{
  "cluster_name" : "secret",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 4,
  "number_of_data_nodes" : 4,
  "active_primary_shards" : 24,
  "active_shards" : 96,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0
}

As a temporary solution I rebuild the index through the Ruby gem Tire: ModelName.rebuild_index

But I need a long-term solution.

Recommended answer

We ran into a similar problem, and it turned out to be because Elasticsearch round-robins between the copies of a shard (the primary and its replicas) when searching. Each copy returns a slightly different _score because its index is in a slightly different state, due to the way ES handles deleted documents in an index. In our case this meant similar results were often placed slightly lower or higher in the result order and, when combined with pagination (using from and size in the search query), the same result turned up on two separate "pages" or on no page at all.
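To illustrate the pagination effect, here is a minimal Python sketch that simulates it without a cluster. The document IDs, the scores, and the `search_page` helper are all made up for illustration; the only assumption carried over from the explanation above is that two copies of the same shard can score one document slightly differently.

```python
# Hypothetical scores for the same five documents as seen by two copies of
# one shard. Only doc2 differs (1.21 vs 1.19), standing in for the small
# drift caused by deleted documents that have not been merged away yet.
SCORES_ON_COPY_A = {"doc1": 1.30, "doc2": 1.21, "doc3": 1.20, "doc4": 1.10, "doc5": 1.00}
SCORES_ON_COPY_B = {"doc1": 1.30, "doc2": 1.19, "doc3": 1.20, "doc4": 1.10, "doc5": 1.00}


def search_page(scores, from_, size):
    """Return the document IDs on one page, ranked by descending score."""
    ranked = sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)
    return ranked[from_:from_ + size]


# Round-robin: page 1 is answered by copy A, page 2 by copy B.
page_1 = search_page(SCORES_ON_COPY_A, from_=0, size=2)
page_2 = search_page(SCORES_ON_COPY_B, from_=2, size=2)
print(page_1, page_2)  # doc2 shows up on both pages, doc3 on neither

# Pinned: both pages are answered by copy A, so the ranking is stable.
page_2_pinned = search_page(SCORES_ON_COPY_A, from_=2, size=2)
print(page_1, page_2_pinned)
```

In the round-robin case a score difference of 0.02 is enough to swap doc2 and doc3 between the two requests, which is the duplicate-and-missing pattern described above.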

We found an Elasticsearch article on consistent scoring that explains this quite neatly, and we implemented the preference parameter to ensure that we always get the same scores for a particular search by querying the same shard copies:

http://localhost:9200/index_name/_search?q=term&preference=blablabla
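The preference value is an arbitrary string, so one option is to derive it from something stable per user, such as a session ID, so that all pages of one user's search hit the same copies. The sketch below only builds the URL. The `build_search_url` helper and the session ID are hypothetical, and hashing the ID is just one way to get a plain value that does not start with an underscore (which, to my knowledge, Elasticsearch reserves for built-in preferences such as `_local`).

```python
import hashlib
from urllib.parse import urlencode


def build_search_url(base_url, index, query, session_id):
    """Build a URI search URL that pins one session to the same shard copies.

    Hypothetical helper: it only assembles the URL and sends no request.
    """
    # Hash the session ID so the preference is a short hex string that
    # never starts with "_" and does not expose the raw session ID.
    preference = hashlib.sha1(session_id.encode("utf-8")).hexdigest()[:16]
    params = urlencode({"q": query, "preference": preference})
    return f"{base_url}/{index}/_search?{params}"


url = build_search_url("http://localhost:9200", "index_name", "term", "user-42")
print(url)
```

The same session always produces the same preference string, so repeated searches and later pages are routed consistently, while different sessions still spread across the copies.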

We also thought about adding a sort, but Elasticsearch already sorts results with the same score by an internal Lucene document ID, so on the same shard copy, results with the same score are always returned in the same order.
