Cassandra配置为默认丢失10秒的数据? [英] Cassandra is configured to lose 10 seconds of data by default?

查看:908
本文介绍了Cassandra配置为默认丢失10秒的数据?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

请注意, commitlog_sync_period_in_ms (默认为10秒)。是否意味着Cassandra默认情况下配置为丢失10秒的数据?由于数据/分区首先到Memtable,ack立即发送回客户端,但Commitlog中的数据每10秒周期性地刷新到磁盘(由 commitlog_sync_period_in_ms 控制)。

Regards, the commitlog_sync_period_in_ms (which is 10 seconds by default). Does it mean that Cassandra by default is configured to lose 10 seconds of data? As the data/partition comes first to Memtable, ack is sent back to the client immediately, but the data in the Commitlog is flushed to the disk periodically every 10 seconds (controlled by commitlog_sync_period_in_ms). So if anything happens during this period, 10 seconds of the data will be lost?

推荐答案

如果一个节点在更新之前崩溃了提交日志在磁盘上,然后是,你可能会损失多达十秒的数据。

If a node crashed right before updating the commit log on disk, then yes, you could lose up to ten seconds of data.

如果保存多个副本,通过使用高于1的复制因子或多个数据中心,则大部分丢失的数据将在其他节点上,并且当被修复时将在崩溃的节点上恢复。

If you keep multiple replicas, by using a replication factor higher than 1 or have multiple data centers, then much of the lost data would be on other nodes, and would be recovered on the crashed node when it was repaired.

同样,提交日志可以被写入在不到十秒钟的时间内,写入量高到足以在十秒钟之前达到大小限制。

Also the commit log may be written in less than ten seconds it the write volume is high enough to hit size limits before the ten seconds.

如果你想要更高的耐用性(以更高的延迟),那么可以将commitlog_sync设置从periodic更改为batch。在批处理模式下,它使用commitlog_sync_batch_window_in_ms设置来控制写入磁盘的批次的频率。在批处理模式下,写入操作不会被写入磁盘。

If you want more durability than this (at the cost of higher latency), then you can change the commitlog_sync setting from periodic to batch. In batch mode it uses the commitlog_sync_batch_window_in_ms setting to control how often batches of writes are written to disk. In batch mode the writes are not acked until written to disk.

周期模式的第十个默认设置用于旋转磁盘,因为它们非常慢,命中如果你阻塞acks等待提交日志写入。因此,如果您使用批处理模式,他们建议为提交日志使用专用磁盘,以便写入磁头不需要执行任何寻求使添加的延迟尽可能低。

The ten second default for periodic mode is designed for spinning disks, since they are so slow there is a performance hit if you block acks waiting for commit log writes. For this reason if you use batch mode, they recommend a dedicated disk for the commit log so that the write head doesn't need to do any seeks to keep the added latency as low as possible.

如果您使用SSD,那么您可以使用更积极的时间,因为与旋转磁盘相比,延迟大大降低。

If you are using SSD's, then you can use more aggressive timing since the latency is greatly reduced compared to a spinning disk.

这篇关于Cassandra配置为默认丢失10秒的数据?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆