Controlling SSTable size in Cassandra


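Question

Is there a way to control the maximum size of an SSTable, for example 100 MB, so that once a column family (CF) holds more than 100 MB of data, Cassandra creates the next SSTable?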

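Answer

Unfortunately, the answer is not that simple. The sizes of your SSTables are influenced by your compaction strategy, and there is no direct way to set a maximum SSTable size.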

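SSTables are first created when memtables are flushed to disk. Their initial size depends on your memtable settings and the size of your heap (memtable_total_space_in_mb has a large influence). These freshly flushed SSTables are typically quite small. They are later merged together by a process called compaction.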



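If you use Size-Tiered Compaction Strategy (STCS), SSTables can become very large. STCS runs a minor compaction when at least min_threshold (default 4) SSTables of similar size exist: it combines them into one file, expiring data and merging keys along the way. Repeated over time, this can produce very large SSTables.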
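To see why STCS tends toward large files, here is a toy simulation in Python. It is only a sketch under simplifying assumptions that are mine, not Cassandra's: every flush produces an SSTable of exactly flush_size_mb, SSTables are merged only when their sizes are exactly equal (real STCS groups SSTables of roughly similar size), and merging never shrinks the data (no overwrites or expired data). The function name simulate_stcs and its parameters are hypothetical.

```python
from collections import Counter


def simulate_stcs(num_flushes, flush_size_mb=10, min_threshold=4):
    """Toy model of size-tiered compaction.

    Returns the sizes (in MB) of the SSTables left on disk, largest first.
    """
    sstables = []  # sizes in MB of the SSTables currently on disk
    for _ in range(num_flushes):
        # A memtable flush adds one small SSTable.
        sstables.append(flush_size_mb)
        while True:
            counts = Counter(sstables)
            # Sizes for which enough equally sized SSTables have piled up.
            ready = [size for size, n in counts.items() if n >= min_threshold]
            if not ready:
                break
            size = min(ready)
            # Merge min_threshold SSTables of that size into one bigger file.
            for _ in range(min_threshold):
                sstables.remove(size)
            sstables.append(size * min_threshold)
    return sorted(sstables, reverse=True)


print(simulate_stcs(7))
print(simulate_stcs(64))
```

In this model, 64 flushes of 10 MB end up as a single 640 MB SSTable, and nothing in the strategy stops the next tier from being four times larger again.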



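With Leveled Compaction Strategy (LCS), the sstable_size_in_mb option sets a target size for SSTables. In general, SSTables will be less than or equal to this size unless you have a partition key with a lot of data ("wide rows").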
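For the 100 MB case in the question, the closest fit is LCS with sstable_size_in_mb set to 100. The small Python helper below just builds the corresponding CQL statement as a string. The helper name lcs_alter_statement and the keyspace and table names are placeholders I made up for illustration; the resulting statement is meant to be run in cqlsh or through whatever driver you use, and remember that the value is a target, not a hard cap.

```python
def lcs_alter_statement(keyspace, table, sstable_size_mb):
    """Build a CQL statement that switches a table to LCS with a target SSTable size."""
    if sstable_size_mb <= 0:
        raise ValueError("sstable_size_mb must be positive")
    return (
        f"ALTER TABLE {keyspace}.{table} WITH compaction = "
        "{'class': 'LeveledCompactionStrategy', "
        f"'sstable_size_in_mb': '{sstable_size_mb}'}};"
    )


# my_keyspace and my_table are hypothetical names.
print(lcs_alter_statement("my_keyspace", "my_table", 100))
```

Changing the compaction strategy on an existing table causes its data to be recompacted, so it is worth trying this on a test cluster first.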



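I haven't experimented much with Date-Tiered Compaction Strategy (DTCS) yet. It works similarly to STCS in that it merges files of similar size, but it keeps data together in time order and has an option to stop compacting old data (max_sstable_age_days), which could be interesting.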



The key is to find the compaction strategy that works best for your data, and then tune its properties for your data model and environment.



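You can read more about the configuration settings for compaction in the Cassandra documentation, and guides comparing STCS and LCS can help you understand which of the two is appropriate for you.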


