AWS and auto scaling Cassandra

Question

I've set up an AWS instance with Cassandra on it, and also set up an auto scaling group to spin up another 4-8 instances depending on alarms. But how does Cassandra know when auto scaling kicks in? How does it know what other nodes to connect to? Do I need to configure something in Cassandra in order for it to discover the nodes?

When I run nodetool, the auto scaling nodes don't show up:

[root@ip-10-205-119-104 bin]# sh nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens  Owns    Host ID                               Rack
UN  127.0.0.1  107.12 MB  256     ?       a50294ac-2150-4d9e-9dd2-0a56906e9531  rack1

Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless

Recommended answer

The best option for auto-discovery in Cassandra is seed nodes: 'anchor' nodes that are supposed to always be there when a new node shows up, and that can be queried for the cluster's node list whenever it is needed.

So you ship every node with a list of seed nodes in its config file (including the seeds themselves), and once a node comes up it gets the node list from a seed. This, of course, requires the seed nodes to be static and always running (and, for redundancy, you should have more than one seed node). Seeds should also be listed by IP address (to avoid problems with DNS).
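
As a rough illustration of the seed list described above, here is a minimal Python sketch that renders the relevant fragment of cassandra.yaml for a new node. The `seed_provider` and `listen_address` keys are standard cassandra.yaml settings; the function name `render_seed_config` and all IP addresses are hypothetical. The `listen_address` line is included on the assumption that the node in the question, which reports itself as 127.0.0.1, may be bound to loopback and therefore unreachable by its peers.

```python
import ipaddress


def render_seed_config(seed_ips, listen_address):
    """Return a cassandra.yaml fragment with a static seed list.

    seed_ips:       IPs of the long-lived seed nodes (hypothetical values).
    listen_address: the private IP of the node being configured.
    """
    # Follow the answer's advice: plain IPs only, no hostnames.
    # ipaddress.ip_address raises ValueError on anything that is not an IP.
    for ip in [*seed_ips, listen_address]:
        ipaddress.ip_address(ip)

    # More than one seed, so a single seed outage does not block new nodes.
    if len(seed_ips) < 2:
        raise ValueError("list more than one seed node for redundancy")

    seeds = ",".join(seed_ips)
    return (
        "seed_provider:\n"
        "    - class_name: org.apache.cassandra.locator.SimpleSeedProvider\n"
        "      parameters:\n"
        f'          - seeds: "{seeds}"\n'
        f"listen_address: {listen_address}\n"
    )


if __name__ == "__main__":
    # Hypothetical addresses: two fixed seeds and one auto-scaled node.
    print(render_seed_config(["10.0.0.11", "10.0.0.12"], "10.0.0.23"))
```

Every node, seeds included, would carry the same seed list; only `listen_address` differs per instance, so a boot script could fill it in from the instance's private IP.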

Nonetheless, I don't think auto-scaling Cassandra is a good idea. Cassandra partitions its data (rows) across nodes, and every time you add or remove a node it needs to repartition and redistribute rows, which, depending on how much data you have, can take quite long (and may require other administrative actions, such as repairs). Even if you have enough replicas to afford a sudden node loss (which is what WILL occur with auto-scaling), that's messy. First, because Cassandra won't automatically decommission nodes: the cluster will know the node is unavailable, but it just waits for it to come back and tries to keep the cluster as healthy as possible (including a mechanism that saves the writes destined for the unavailable node on other nodes for some period).
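
To make the repartitioning cost concrete, here is a toy Python model of a token ring. It is a sketch under loud assumptions: it uses MD5 rather than Cassandra's default Murmur3 partitioner, ignores replication entirely, and the node names, key names, and `vnodes` count are made up. It only estimates what fraction of keys change owner when one node joins.

```python
import bisect
import hashlib


def token(value):
    """Map a string to a ring position (toy hash, NOT Cassandra's Murmur3)."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


def build_ring(nodes, vnodes=16):
    """Return parallel lists (tokens, owners), sorted by token.

    Each node gets `vnodes` pseudo-random positions on the ring.
    """
    pairs = sorted((token(f"{node}#{i}"), node) for node in nodes for i in range(vnodes))
    return [t for t, _ in pairs], [n for _, n in pairs]


def owner(ring, key):
    """A key belongs to the node holding the first token at or after the key's token."""
    tokens, owners = ring
    idx = bisect.bisect_left(tokens, token(key)) % len(tokens)  # wrap around the ring
    return owners[idx]


def moved_fraction(old_nodes, new_nodes, keys):
    """Fraction of keys whose owner differs between the two cluster layouts."""
    old_ring, new_ring = build_ring(old_nodes), build_ring(new_nodes)
    moved = sum(owner(old_ring, k) != owner(new_ring, k) for k in keys)
    return moved / len(keys)


if __name__ == "__main__":
    nodes = ["node1", "node2", "node3", "node4"]  # hypothetical cluster
    keys = [f"row-{i}" for i in range(10_000)]    # hypothetical row keys
    print(moved_fraction(nodes, nodes + ["node5"], keys))
```

In this model every key that moves ends up on the new node, so each scale-out event implies streaming that share of the data over the network before the node is useful. With real data volumes and replication the effect is larger, which is the answer's point.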

So you would need to watch your nodes and manage those ups and downs from outside. And you may not even have time to decommission one node and get everything (your data) back in place before another one comes up and goes down again, and all of that could really mess your cluster up.
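
One way to watch the nodes from outside is to parse the `nodetool status` output shown in the question. The Python sketch below assumes the column layout from that output (status/state code, address, load, tokens, owns, host ID, rack); other Cassandra versions may print different columns. The names `parse_status` and `down_nodes` are hypothetical helpers, not part of any Cassandra tooling.

```python
import re

# Matches node rows such as:
#   UN  127.0.0.1  107.12 MB  256  ?  a50294ac-2150-4d9e-9dd2-0a56906e9531  rack1
# The first letter is Up/Down, the second Normal/Leaving/Joining/Moving.
NODE_ROW = re.compile(
    r"^(?P<status>[UD])(?P<state>[NLJM])\s+"
    r"(?P<address>\S+)\s+"
    r"(?P<load>.+?)\s+"          # e.g. "107.12 MB", or "?" when unknown
    r"(?P<tokens>\d+)\s+"
    r"(?P<owns>\S+)\s+"
    r"(?P<host_id>[0-9a-f-]{36})\s+"
    r"(?P<rack>\S+)\s*$"
)


def parse_status(output):
    """Return one dict per node row found in `nodetool status` output."""
    nodes = []
    for line in output.splitlines():
        match = NODE_ROW.match(line.strip())
        if match:  # headers, separators and notes simply do not match
            nodes.append(match.groupdict())
    return nodes


def down_nodes(output):
    """Return the nodes the cluster currently reports as Down."""
    return [node for node in parse_status(output) if node["status"] == "D"]
```

The text could come from `subprocess.run(["nodetool", "status"], capture_output=True, text=True).stdout`. What to do with a down node remains a human or orchestration decision: `nodetool decommission` is run on a live node that is leaving, while `nodetool removenode <host-id>` removes a dead one, and either triggers the data movement the answer warns about.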

Well, maybe there are people out there doing this, but according to my knowledge of and experience with Cassandra, auto-scaling it is not as simple and magical as it is for a web application, and you would probably end up losing data and having a very inconsistent and unstable system.
