Elasticsearch read and write consistency
Question
Elasticsearch doesn't have a "read consistency" parameter (like Cassandra), but it does have "write consistency" and "read preference".
The documentation says the following about write consistency:
Write Consistency
To prevent writes from taking place on the "wrong" side of a network partition, by default, index operations only succeed if a quorum (>replicas/2+1) of active shards are available. This default can be overridden on a node-by-node basis using the action.write_consistency setting. To alter this behavior per-operation, the consistency request parameter can be used.
Valid write consistency values are one, quorum, and all.
Note, for the case where the number of replicas is 1 (total of 2 copies of the data), then the default behavior is to succeed if 1 copy (the primary) can perform the write.
The index operation only returns after all active shards within the replication group have indexed the document (sync replication).
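To make the quorum arithmetic in the quoted documentation concrete, here is a minimal sketch. It assumes the quorum is computed as int(total copies / 2) + 1, which is one reading of the ">replicas/2+1" expression, plus the special case from the note for a single replica. The function name required_active_copies is hypothetical and is not part of any Elasticsearch API.

```python
def required_active_copies(number_of_replicas: int, consistency: str = "quorum") -> int:
    """Return how many shard copies must be active before a write may start.

    Assumption: quorum = int(total_copies / 2) + 1, with the documented
    exception that 2 copies or fewer need only the primary.
    """
    total_copies = number_of_replicas + 1  # the primary plus its replicas
    if consistency == "one":
        return 1
    if consistency == "all":
        return total_copies
    if consistency == "quorum":
        if total_copies <= 2:
            # Special case from the quoted note: the primary alone is enough.
            return 1
        return total_copies // 2 + 1
    raise ValueError(f"unknown consistency level: {consistency!r}")


if __name__ == "__main__":
    for replicas in range(4):
        print(replicas, "replica(s) ->", required_active_copies(replicas), "copies required")
```

Under these assumptions, the default only becomes stricter than "one" once an index has at least two replicas.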
My question is about the last paragraph:
The index operation only returns after all active shards within the replication group have indexed the document (sync replication).
If write_consistency=quorum (the default) and all shards are live (no node failures, no network partition), then:
1) Does the index operation return as soon as a quorum of shards have finished indexing (even though all shards are live/active)?
2) Or does the index operation return only when all live/active shards have finished indexing (i.e. quorum is considered only in case of failures/timeouts)?
In the first case, reads may be eventually consistent (they may return stale data), but writes are quicker.
In the second case, reads are consistent (as long as there are no network partitions), but writes are slower (because they wait for the slowest shard/node).
Does anyone know how it works?
Another thing I wonder about is why the default value for the 'preference' param (in get/search requests) is randomized rather than _local (which I suppose would be more efficient).
Recommended answer
I think I can answer my own question now :)
Regarding the first question: after re-reading the documentation (this and this) a few times :) I realized that the following statement should be right:
The index operation returns when all live/active shards have finished indexing, regardless of the consistency param. The consistency param can only prevent the operation from starting if there are not enough available shards (nodes).
So for example, if there are 3 shards (one primary and two replicas) and all of them are live/available, the operation will wait for all 3, regardless of the consistency param (even when consistency=one).
This makes the system consistent (at least the document API part), unless there is a network partition.
But I haven't had a chance to test this yet.
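The behaviour described above can be sketched as a toy model. It reflects only my reading of the documentation, not the actual Elasticsearch implementation. ShardCopy, required_copies and index_document are hypothetical names, and the quorum formula (int(total copies / 2) + 1, with 2 copies or fewer needing only the primary) is an assumption based on the quoted docs.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ShardCopy:
    name: str
    active: bool = True
    indexed: bool = False


def required_copies(total_copies: int, consistency: str) -> int:
    """Number of active copies needed before the write is allowed to start."""
    if consistency == "one":
        return 1
    if consistency == "all":
        return total_copies
    if consistency == "quorum":
        # Assumed formula; 2 copies or fewer need only the primary.
        return 1 if total_copies <= 2 else total_copies // 2 + 1
    raise ValueError(f"unknown consistency level: {consistency!r}")


def index_document(copies: List[ShardCopy], consistency: str = "quorum") -> List[str]:
    """Gate on the consistency level, then write to every active copy."""
    active = [c for c in copies if c.active]
    needed = required_copies(len(copies), consistency)
    if len(active) < needed:
        # The consistency param only decides whether the operation starts.
        raise RuntimeError(f"need {needed} active copies, only {len(active)} available")
    for copy in active:
        # Sync replication: the call returns only after all active copies have indexed.
        copy.indexed = True
    return [c.name for c in active]


if __name__ == "__main__":
    group = [ShardCopy("primary"), ShardCopy("replica-1"), ShardCopy("replica-2")]
    print(index_document(group, consistency="one"))
```

In this model, consistency=one with three live copies still writes to all three before returning; the level matters only when some copies are unavailable.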
UPDATE: by consistency here I don't mean ACID consistency; it is just the guarantee that all replicas have been updated by the time the request returns.
Regarding the second question:
The obvious answer is that it is randomized to spread the load. On the other hand, a client can pick a random node to talk to, but that is probably not 100% efficient, as a single request may need multiple shards.
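The load-spreading argument can be illustrated with a toy simulation. This is only a sketch under strong assumptions: three hypothetical nodes, each holding one copy of a single shard, and a pick_copy function that stands in for the routing decision. It does not model real Elasticsearch routing.

```python
import random
from collections import Counter

# Hypothetical cluster: every node holds one copy of the same shard.
NODES = ["node-1", "node-2", "node-3"]


def pick_copy(preference: str, coordinating_node: str, rng: random.Random) -> str:
    """Choose which node serves the read."""
    if preference == "_local":
        # Every node has a local copy in this toy model, so stay local.
        return coordinating_node
    # Any other value is treated as the randomized default.
    return rng.choice(NODES)


def simulate(preference: str, client_picks_random_node: bool,
             requests: int = 3000, seed: int = 0) -> Counter:
    """Count how many reads each node serves."""
    rng = random.Random(seed)
    load = Counter()
    for _ in range(requests):
        coordinator = rng.choice(NODES) if client_picks_random_node else NODES[0]
        load[pick_copy(preference, coordinator, rng)] += 1
    return load


if __name__ == "__main__":
    print("randomized, fixed client node:", dict(simulate("randomized", False)))
    print("_local, fixed client node:    ", dict(simulate("_local", False)))
    print("_local, random client node:   ", dict(simulate("_local", True)))
```

With a client that always talks to the same node, _local sends every read to that node, while the randomized default spreads reads across all copies without relying on the client to rotate nodes.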