How to prevent Cassandra commit logs filling up disk space

Question

I'm running a two node Datastax AMI cluster on AWS. Yesterday, Cassandra started refusing connections from everything. The system logs showed nothing. After a lot of tinkering, I discovered that the commit logs had filled up all the disk space on the allotted mount and this seemed to be causing the connection refusal (deleted some of the commit logs, restarted and was able to connect).

I'm on DataStax AMI 2.5.1 and Cassandra 2.1.7

If I decide to wipe and restart everything from scratch, how do I ensure that this does not happen again?

Recommended answer

You could try lowering the commitlog_total_space_in_mb setting in your cassandra.yaml. The default is 8192MB for 64-bit systems (it should be commented-out in your .yaml file... you'll have to un-comment it when setting it). It's usually a good idea to plan for that when sizing your disk(s).

You can verify this by running a du on your commitlog directory:

$ du -d 1 -h ./commitlog
8.1G    ./commitlog

Bear in mind, though, that a smaller commit log space will cause more frequent flushes (increased disk I/O), so you'll want to keep an eye on that.
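As a rough sketch of the change described above, the un-commented setting in cassandra.yaml might look like the fragment below. The value 4096 is purely an illustrative assumption, not a recommendation; pick a figure that fits the size of your commit log mount and the flush frequency you can tolerate.

```yaml
# cassandra.yaml (fragment)
# Un-commented and lowered from the 8192 MB default on 64-bit systems.
# 4096 is an illustrative value only (assumption); size it for your own disk.
commitlog_total_space_in_mb: 4096
```

The node needs a restart to pick up changes to cassandra.yaml.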

Edit 20190318

Just had a related thought (on my 4-year-old answer). I saw that it received some attention recently, and wanted to make sure that the right information is out there.

It's important to note that sometimes the commit log can grow in an "out of control" fashion. Essentially, this can happen because the write load on the node exceeds Cassandra's ability to keep up with flushing the memtables (and thus, removing old commitlog files). If you find a node with dozens of commitlog files, and the number seems to keep growing, this might be your issue.
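One way to spot that kind of growth is to count the commit log segment files from time to time. The helper below is a minimal sketch: the function name is hypothetical, the default path /var/lib/cassandra/commitlog is an assumption (use whatever commitlog_directory is set to in your cassandra.yaml), and it assumes segments follow the usual CommitLog-*.log naming.

```shell
# Count commit log segment files in a directory.
# The default path is an assumption; pass your own commitlog_directory.
count_commitlog_segments() {
  dir="${1:-/var/lib/cassandra/commitlog}"
  # Only look at regular files directly inside the directory.
  find "$dir" -maxdepth 1 -type f -name 'CommitLog-*.log' | wc -l | tr -d ' '
}
```

Running count_commitlog_segments against your commit log directory a few minutes apart should show whether the number is holding steady or steadily climbing.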

Essentially, your memtable_cleanup_threshold may be too low. Although this property is deprecated, you can still control how it is calculated by lowering the number of memtable_flush_writers.

memtable_cleanup_threshold = 1 / (memtable_flush_writers + 1)
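To make the formula concrete, here is a small sketch that evaluates it for a given number of flush writers. The function name is hypothetical; it only does the arithmetic and does not read anything from a running node.

```shell
# Evaluate memtable_cleanup_threshold = 1 / (memtable_flush_writers + 1)
# for the writer count given as the first argument.
cleanup_threshold() {
  awk -v w="$1" 'BEGIN { printf "%.3f\n", 1 / (w + 1) }'
}

cleanup_threshold 8   # prints 0.111
cleanup_threshold 2   # prints 0.333
```

Fewer flush writers means a higher threshold.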

The documentation has been updated as of 3.x, but used to say this:

# memtable_flush_writers defaults to the smaller of (number of disks,
# number of cores), with a minimum of 2 and a maximum of 8.
# 
# If your data directories are backed by SSD, you should increase this
# to the number of cores.
#memtable_flush_writers: 8

...which (I feel) led to many folks setting this value WAY too high.

Assuming a value of 8, the memtable_cleanup_threshold is .111. When the footprint of all memtables exceeds this ratio of total memory available, flushing occurs. Too many flush (blocking) writers can prevent this from happening promptly. With a single /data dir, I recommend setting this value to 2.
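For a node with a single data directory, that recommendation would look something like the fragment below in cassandra.yaml. Treat it as a sketch of the suggestion above rather than a universal setting; nodes with several data disks may warrant a different value.

```yaml
# cassandra.yaml (fragment)
# Two flush writers for a single data directory, per the suggestion above.
# By the formula, this gives memtable_cleanup_threshold = 1 / (2 + 1), about 0.333.
memtable_flush_writers: 2
```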
