Cassandra high number of SSTables

Question

After launching some long-running write jobs (batch inserts from an Apache Spark job using the Spark Cassandra Connector), Cassandra (v. 2.1) created thousands of SSTables for the target table (more than 4500). The minor compaction thresholds are set to the default values (4 to 32). This means that, in theory, a lot of minor compaction tasks should be scheduled automatically.
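One way to confirm an SSTable count like this from the file system is to count the `-Data.db` component files in the table's data directory, since each SSTable has exactly one. The following is a minimal sketch, not part of the original question; the function name `count_sstables` and the directory path in the usage note are hypothetical.

```python
from pathlib import Path


def count_sstables(table_dir):
    """Count SSTables in one table directory.

    Each SSTable consists of several component files (Data, Index,
    Filter, ...), so only the '-Data.db' files are counted.
    The glob is not recursive, so 'snapshots' and 'backups'
    subdirectories are ignored.
    """
    return sum(1 for _ in Path(table_dir).glob("*-Data.db"))
```

For example, `count_sstables("/var/lib/cassandra/data/my_keyspace/my_table")` would return the count for one table on one node, assuming the data lives under that (hypothetical) path. The actual location depends on the `data_file_directories` setting in `cassandra.yaml`.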

I checked the status, and nodetool indicated that no tasks were being scheduled. I stopped all operations for a few hours. Then I restarted the cluster multiple times and waited some more. I disabled and re-enabled autocompaction and waited. I increased the compaction throughput to 999 MB/s and waited again.
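The checks described above map onto standard `nodetool` subcommands. The sketch below only builds the command lines. The helper names `nodetool_cmd` and `troubleshooting_steps`, and the keyspace and table names, are hypothetical, and it assumes `nodetool` is on the `PATH` of the machine that would run the commands.

```python
import subprocess


def nodetool_cmd(*args, host=None):
    """Build a nodetool command line as a list of arguments."""
    cmd = ["nodetool"]
    if host:
        cmd += ["-h", host]  # target a specific node
    cmd += list(args)
    return cmd


def troubleshooting_steps(keyspace, table, throughput_mb=999):
    """Command lines for the steps described in the question."""
    return [
        nodetool_cmd("compactionstats"),  # pending and active compactions
        nodetool_cmd("disableautocompaction", keyspace, table),
        nodetool_cmd("enableautocompaction", keyspace, table),
        nodetool_cmd("setcompactionthroughput", str(throughput_mb)),  # MB/s
    ]


def run(cmd):
    """Run one command and return its standard output (raises on failure)."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
```

Calling `run(step)` for each entry of `troubleshooting_steps("my_keyspace", "my_table")` on a node would repeat the sequence. Nothing is executed until `run` is called.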

During these tests, only a few minor compactions started, seemingly at random, on some nodes and for a limited period of time. Most of the nodes did nothing for an entire day.

I then decided to launch a major compaction manually (it is going to take days... Amazon EBS).

Why is Cassandra not running any automatic minor compactions, even though the number of SSTables is 100 times greater than the threshold (32)?

Recommended answer

The answer is in the documentation:

By default, a minor compaction can begin any time Cassandra creates four SSTables on disk for a column family. A minor compaction must begin before the total number of SSTables reaches 32.

The total number of my SSTables is far greater than 32...
