How do I know if nodetool repair is finished?


Question

I have a 2-node Apache Cassandra (2.0.3) cluster with a replication factor of 1. I changed the replication factor to 2 using the following command in cqlsh:

ALTER KEYSPACE "mykeyspace" WITH REPLICATION =   { 'class' : 'SimpleStrategy', 'replication_factor' : 2 };

I then tried to run the recommended 'nodetool repair' after making this kind of change.

The problem is that this command sometimes finishes very quickly. When it does, it normally prints 'Lost notification...' and the exit code is nonzero.

So I just repeat 'nodetool repair' until it finishes without error. I also check that 'nodetool status' reports the expected disk usage for each node. (With a replication factor of 1, each node holds about 7 GB, and I expect each to hold about 14 GB after the repair, assuming no cluster usage in the meantime.)

Is there a more correct way to determine that 'nodetool repair' has finished in this case?

Recommended answer

Generally speaking, you can monitor a nodetool repair operation with two nodetool commands:


  • compactionstats

  • netstats

The repair operation has two distinct phases. First, it calculates the differences between the nodes (the repair work to be done); then it acts on those differences by streaming data to the appropriate nodes.

This checks on the active Merkle tree calculations:

$ nodetool compactionstats
pending tasks: 0
Active compaction remaining time :        n/a
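As a rough sketch of how the first phase could be checked from a script: the helper below (a hypothetical name, not part of nodetool) reads compactionstats output on stdin and succeeds if any task of type "Validation" is listed. It assumes Merkle tree builds are reported with that compaction type, which is worth confirming against the output of your own Cassandra version.

```shell
# validation_running: hypothetical helper, not a nodetool feature.
# Reads `nodetool compactionstats` output on stdin and returns success
# if a line mentions a Validation task.
# Assumption: Merkle tree builds show up with compaction type "Validation".
validation_running() {
  grep -q 'Validation'
}

# Usage (needs a reachable node):
#   if nodetool compactionstats | validation_running; then
#     echo "Merkle trees are still being built"
#   fi
```

An empty task list, as in the sample output above, makes the helper return a nonzero status.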

The repair streams can be monitored with:

$ nodetool netstats

In fact, Aaron Morton of TheLastPickle (http://thelastpickle.com/) suggests using the following Bash command to monitor any active repair streams:

while true; do date; diff <(nodetool -h localhost netstats) <(sleep 5 && nodetool -h localhost netstats); done
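The loop above runs until interrupted. A bounded variant, sketched here under stated assumptions, polls a command until two consecutive outputs are identical and then returns. The function name and its arguments are hypothetical. It also assumes an otherwise idle cluster, since netstats output can change for reasons unrelated to repair, and unchanged output may mean a hung stream rather than a finished one.

```shell
# wait_until_stable: hypothetical helper.
# Usage: wait_until_stable INTERVAL_SECONDS MAX_POLLS COMMAND [ARGS...]
# Returns 0 once two consecutive outputs of COMMAND match,
# or 1 if the output was still changing after MAX_POLLS polls.
wait_until_stable() {
  interval=$1
  max_polls=$2
  shift 2
  previous=$("$@")
  polls=1
  while [ "$polls" -lt "$max_polls" ]; do
    sleep "$interval"
    current=$("$@")
    if [ "$current" = "$previous" ]; then
      return 0   # nothing changed between two polls
    fi
    previous=$current
    polls=$((polls + 1))
  done
  return 1       # gave up while the output was still changing
}

# Usage (needs a reachable node):
#   wait_until_stable 5 120 nodetool -h localhost netstats \
#     && echo "netstats output has stopped changing"
```

Because stable output does not distinguish a completed stream from a stalled one, it is worth checking the final netstats output by hand before treating the repair as done.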

DataStax has a post in its support forums about troubleshooting hanging repairs. If you have any hung repair streams, you should be able to see them with netstats. This can happen if one of your nodes becomes unavailable during the repair process. To monitor specific repair operations, you can check your log file for entries like this:


DEBUG [WRITE-/172.30.77.197] 2013-05-03 12:43:09,107 OutboundTcpConnection.java (line 165) error writing to /172.30.77.197
java.net.SocketException: Connection reset
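The log can also be searched for the repair sessions themselves. The sketch below counts completed sessions in a log file. It assumes that repair messages are tagged with `[repair #<session id>]` and that a finished session logs the text "session completed successfully"; both the message text and the log path in the usage comment are assumptions to verify against your own installation, and the function name is hypothetical.

```shell
# count_completed_repair_sessions: hypothetical helper.
# Usage: count_completed_repair_sessions LOG_FILE
# Prints the number of log lines that look like a completed repair session.
# Assumptions: repair lines contain "[repair #" and a finished session
# logs "session completed successfully".
count_completed_repair_sessions() {
  grep -F '[repair #' "$1" | grep -c 'session completed successfully'
}

# Usage (the path is an assumption; adjust it to your installation):
#   count_completed_repair_sessions /var/log/cassandra/system.log
```

Comparing this count before and after running the repair gives a rough signal that sessions are finishing rather than being dropped.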
