Cassandra: controlling SSTable size

Question

Is there a way to control the maximum size of an SSTable, for example 100 MB, so that once a column family (CF) holds more than 100 MB of data, Cassandra starts the next SSTable?

Recommended answer

Unfortunately, the answer is not that simple. The sizes of your SSTables are driven by your compaction strategy, and there is no direct way to cap the maximum SSTable size.

SSTables are first created when memtables are flushed to disk. Their initial size depends on your memtable settings and the size of your heap (memtable_total_space_in_mb is a major factor). These freshly flushed SSTables are typically quite small. They are later merged together by a process called compaction.

If you use Size-Tiered Compaction Strategy (STCS), you can end up with really large SSTables. STCS runs a minor compaction when there are at least min_threshold (default 4) SSTables of similar size: it combines them into one file, merging keys and dropping expired data. Because each merged file can later be merged again with others of its own size, this can produce very large SSTables after a while.
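
To make the growth pattern concrete, here is a toy Python model of the size-tiered idea. It is only a sketch under simplifying assumptions, not Cassandra's actual algorithm: the function names are hypothetical, "similar size" is approximated as "within 1.5x of the smallest file in the group", exactly min_threshold files are merged at a time, and the merged size is a plain sum (real compactions shrink the result by dropping overwritten and expired data).

```python
def find_bucket(sstables, min_threshold, size_ratio):
    """Return min_threshold SSTable sizes that are 'similar', or None.

    Assumption: two files are similar when the larger one is at most
    size_ratio times the smallest file in the group.
    """
    ordered = sorted(sstables)
    for i, smallest in enumerate(ordered):
        similar = [s for s in ordered[i:] if s <= smallest * size_ratio]
        if len(similar) >= min_threshold:
            return similar[:min_threshold]
    return None


def simulate_stcs(flush_sizes_mb, min_threshold=4, size_ratio=1.5):
    """Toy size-tiered compaction: returns the SSTable sizes left on disk."""
    sstables = []
    for size in flush_sizes_mb:
        sstables.append(float(size))  # each memtable flush adds one small SSTable
        while True:
            bucket = find_bucket(sstables, min_threshold, size_ratio)
            if bucket is None:
                break
            for s in bucket:
                sstables.remove(s)
            # Merged size is a plain sum; expired or overwritten data is ignored.
            sstables.append(sum(bucket))
    return sorted(sstables)


print(simulate_stcs([10] * 16))  # [160.0]
print(simulate_stcs([10] * 20))  # [40.0, 160.0]
```

In this model, sixteen 10 MB flushes collapse into a single 160 MB file, and nothing in the loop bounds how large the biggest tier can become.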

With Leveled Compaction Strategy (LCS), there is an sstable_size_in_mb option that sets a target size for SSTables. In general, SSTables will be less than or equal to this size unless a single partition key holds a lot of data ("wide rows").
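
As a rough illustration of how that option is set, the Python sketch below builds the CQL statement that switches a table to LCS with a chosen target size. The helper name, keyspace, and table are hypothetical placeholders, and the default of 160 reflects what I understand the LCS default to be; check it against the documentation for your version before relying on it.

```python
def lcs_alter_statement(keyspace, table, sstable_size_in_mb=160):
    """Build a CQL ALTER TABLE statement that enables LCS with a target SSTable size.

    keyspace and table are placeholders supplied by the caller; this helper only
    formats the statement and does not connect to a cluster.
    """
    if sstable_size_in_mb <= 0:
        raise ValueError("sstable_size_in_mb must be positive")
    return (
        f"ALTER TABLE {keyspace}.{table} WITH compaction = "
        f"{{'class': 'LeveledCompactionStrategy', "
        f"'sstable_size_in_mb': '{sstable_size_in_mb}'}};"
    )


# Hypothetical names; aims for roughly 100 MB SSTables as in the question.
print(lcs_alter_statement("my_keyspace", "my_cf", 100))
```

The printed statement can be run in cqlsh. Keep in mind that the value is a target rather than a hard limit, so a very large partition can still produce an SSTable above it.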

I haven't experimented much with Date-Tiered Compaction Strategy (DTCS) yet. It works similarly to STCS in that it merges files of similar size, but it keeps data together in time order, and it has a setting to stop compacting old data (max_sstable_age_days), which could be interesting here.

The key is to find the compaction strategy that works best for your data, and then tune its properties to suit your data model and environment.

You can read more about the configuration settings for compaction in the Cassandra documentation, and guides comparing the strategies can help you decide whether STCS or LCS is appropriate for you.
