How to speed up the bootstrap of a single node

Question

I have a single-node Cassandra installation on my development machine (and very little experience with Cassandra). Until now the node held very little data and I had no problems. Today I inserted about 9,000 elements into a table to experiment with a real-world use case. Now, when I start the node, the boot time is extremely long. I get this in system.log:

Replaying /var/lib/cassandra/commitlog/CommitLog-3-1388134836280.log
...
Log replay complete, 9274 replayed mutations

That took 13 minutes, which is hardly bearable. I wonder whether there is a way to store the data so that it can be read at once, without replaying the log. After all, 9,000 elements are nothing, and there must be a quicker way to boot. I googled for hints and searched Cassandra's documentation, but I didn't find anything. Obviously I'm not looking for the right things; would anybody be so kind as to point me to the right documents? Thanks.

Recommended answer

There are a few things that might help. The most obvious one is to flush the commit log before you shut down Cassandra. This is a good idea in production too. Before I stop a Cassandra node in production, I run the following commands:

nodetool disablethrift
nodetool disablegossip
nodetool drain

The first two commands gracefully shut down this node's connections, first to its clients and then to the other nodes in the ring. The drain command flushes the memtables to disk as SSTables, which should minimize what needs to be replayed on startup.
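The shutdown sequence above can be wrapped in a small shell function so that the steps always run in the same order and stop at the first failure. This is only a sketch: the function name drain_node is made up for illustration, and it assumes nodetool is on the PATH and talks to the local node.

```shell
#!/bin/sh
# Hypothetical helper (the name is illustrative, not part of Cassandra).
# Runs the three nodetool commands from the answer in order and returns
# a non-zero status as soon as one of them fails.
drain_node() {
    # Stop serving Thrift clients.
    nodetool disablethrift || return 1
    # Stop gossiping with the other nodes in the ring.
    nodetool disablegossip || return 1
    # Flush memtables to SSTables so little is left to replay.
    nodetool drain || return 1
}
```

Call drain_node before stopping the Cassandra process; how the process itself is stopped depends on how it was installed.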

There are other factors that can make startup slow. Cassandra opens all the SSTables on disk at startup, so the more column families and SSTables you have on disk, the longer it takes before a node can start serving clients. Some work was done in the 1.2 release to speed this up, so if you are not on 1.2 yet you should consider upgrading. Reducing the number of SSTables would probably improve your start time.
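To judge whether the number of SSTables is a plausible factor, one rough check is to count the data files on disk. The sketch below assumes the default data directory /var/lib/cassandra/data and relies on each SSTable having one component file whose name ends in -Data.db; the function name count_sstables is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: count SSTables under a data directory by counting
# their *-Data.db component files.
# Assumption: data lives in /var/lib/cassandra/data unless another
# directory is passed as the first argument.
count_sstables() {
    data_dir="${1:-/var/lib/cassandra/data}"
    # Skip snapshot and backup copies, which are not live SSTables.
    find "$data_dir" -type f -name '*-Data.db' \
        ! -path '*/snapshots/*' ! -path '*/backups/*' | wc -l | tr -d ' '
}
```

Running count_sstables with no argument prints a single number for the whole node; passing a keyspace or column family directory narrows the count.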

Since you mentioned that this is a development machine, I'll also share observations from my own dev environment. On my development machine I create and drop column families and keyspaces a lot. This can cause some of the system CFs to grow significantly and eventually cause a noticeable slowdown. The easiest way to handle this is to have a script that can quickly bootstrap a new database and blow away all the old data in /var/lib/cassandra.
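A reset script along those lines might look like the sketch below. It assumes Cassandra is already stopped, that the default layout under /var/lib/cassandra (data, commitlog, saved_caches) is in use, and that losing everything stored there is acceptable. The function name wipe_cassandra_data is made up for illustration.

```shell
#!/bin/sh
# Hypothetical dev-only helper: delete all local Cassandra state so the
# next start begins with an empty database.
# Assumptions: Cassandra is NOT running, and the base directory
# (default /var/lib/cassandra) uses the standard data, commitlog and
# saved_caches subdirectories.
wipe_cassandra_data() {
    base="${1:-/var/lib/cassandra}"
    # Refuse to run against a directory that does not exist.
    [ -d "$base" ] || { echo "no such directory: $base" >&2; return 1; }
    for sub in data commitlog saved_caches; do
        # Delete the contents (the glob skips dotfiles) but keep the
        # directory itself, so its ownership and permissions survive.
        rm -rf "${base:?}/$sub"/*
    done
}
```

After wiping, start Cassandra again and recreate the schema and any fixture data the application needs.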
