Fully removing a decommissioned Cassandra node
Question
Running Cassandra 1.0, I am shrinking a ring from 5 nodes down to 4. In order to do that I ran nodetool decommission on the node I want to remove, then stopped Cassandra on that host and used nodetool move and nodetool cleanup to update the tokens on the remaining 4 nodes and rebalance the cluster.
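For reference, that shrink procedure would look roughly like the sketch below; the host names and the new token value are placeholders, not the actual values from this cluster:

nodetool -h node-c decommission       # stream node C's ranges to the remaining nodes, then stop Cassandra on node C
nodetool -h node-a move <new_token>   # repeat on each remaining node with its recalculated token
nodetool -h node-a cleanup            # repeat on each remaining node to drop data it no longer owns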
My seed nodes are A and B. The node I removed is C.
That seemed to work fine for 6-7 days, but now one of my four nodes thinks the decommissioned node is still part of the ring.
Why did this happen, and what's the proper way to fully remove the decommissioned node from the ring?
Here's the output of nodetool ring on the one node that still thinks the decommissioned node is part of the ring:
Address DC Rack Status State Load Owns Token
127605887595351923798765477786913079296
xx.x.xxx.xx datacenter1 rack1 Up Normal 616.17 MB 25.00% 0
xx.xxx.xxx.xxx datacenter1 rack1 Up Normal 1.17 GB 25.00% 42535295865117307932921825928971026432
xx.xxx.xx.xxx datacenter1 rack1 Down Normal ? 9.08% 57981914123659253974350789668785134662
xx.xx.xx.xxx datacenter1 rack1 Up Normal 531.99 MB 15.92% 85070591730234615865843651857942052864
xx.xxx.xxx.xx datacenter1 rack1 Up Normal 659.92 MB 25.00% 127605887595351923798765477786913079296
Here's the output of nodetool ring on the other 3 nodes:
Address DC Rack Status State Load Owns Token
127605887595351923798765477786913079296
xx.x.xxx.xx datacenter1 rack1 Up Normal 616.17 MB 25.00% 0
xx.xxx.xxx.xxx datacenter1 rack1 Up Normal 1.17 GB 25.00% 42535295865117307932921825928971026432
xx.xx.xx.xxx datacenter1 rack1 Up Normal 531.99 MB 25.00% 85070591730234615865843651857942052864
xx.xxx.xxx.xx datacenter1 rack1 Up Normal 659.92 MB 25.00% 127605887595351923798765477786913079296
UPDATE: I tried to remove the node using nodetool removetoken on node B, which is the one that still claims node C is in the ring. That command ran for 5 hours and didn't seem to do anything. The only change is that the state of node C is now "Leaving" when I run nodetool ring on node B.
Answer
I was able to remove the decommissioned node using nodetool removetoken, but I had to use the force option.
Here's the output of my commands:
iowalker:~$ nodetool -h `hostname` removetoken 57981914123659253974350789668785134662
<waited 5 hours, the node was still there>
iowalker:~$ nodetool -h `hostname` removetoken status
RemovalStatus: Removing token (57981914123659253974350789668785134662). Waiting for replication confirmation from [/xx.xxx.xxx.xx,/xx.x.xxx.xx,/xx.xx.xx.xxx].
iowalker:~$ nodetool -h `hostname` removetoken force
RemovalStatus: Removing token (57981914123659253974350789668785134662). Waiting for replication confirmation from [/xx.xxx.xxx.xx,/xx.x.xxx.xx,/xx.xx.xx.xxx].
iowalker:~$ nodetool -h `hostname` removetoken status
RemovalStatus: No token removals in process.
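As a follow-up check (not part of the original answer), re-running nodetool ring on each of the four remaining nodes should confirm that node C's token is gone:

iowalker:~$ nodetool -h `hostname` ring
<the Down entry for token 57981914123659253974350789668785134662 should no longer be listed>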