How to copy Cassandra data from one cluster to another

Problem description

I have 2 Cassandra clusters in different datacenters (note that these are 2 different clusters, NOT a single cluster with multi-DC), and both clusters have the same keyspace and column family models. I wish to copy the data of column family C from cluster A to cluster B in the most efficient way. Some other column families I was able to copy with get and put operations, since they were time series and the keys were sequential. But this other column family, C, I couldn't copy that way. I'm using Thrift and pycassa. I've tried the CQL COPY command, but unfortunately the CF is too large and I get an rpc_timeout. How can I accomplish this?
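
(For reference, the COPY attempt that timed out would have looked roughly like the following; the keyspace name, hostnames, and file path are illustrative, not from the original question:)

    # Export from a cluster A node, then import on a cluster B node.
    # On a large CF this export is what fails with rpc_timeout.
    cqlsh clusterA-host -e "COPY my_keyspace.C TO '/tmp/C.csv';"
    cqlsh clusterB-host -e "COPY my_keyspace.C FROM '/tmp/C.csv';"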

Solution

If you just want to do this as a one-time thing, take a snapshot and use sstableloader to load it into the other cluster. If you want to keep loading new data over time, you will want to turn on incremental backups, take a snapshot to load the initial data, and then periodically feed the sstables from the incremental backups to sstableloader to keep things up to date.
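
A minimal sketch of the one-time path with nodetool and sstableloader follows. The keyspace name (my_keyspace), data directory (/var/lib/cassandra/data), snapshot tag, staging path, and destination IPs are illustrative assumptions; only the column family name C comes from the question:

    # On a node in cluster A: flush memtables, then snapshot the keyspace.
    nodetool flush my_keyspace
    nodetool snapshot -t migrate my_keyspace

    # The snapshot SSTables land under the data directory, e.g.
    # /var/lib/cassandra/data/my_keyspace/C/snapshots/migrate/
    # sstableloader expects a path ending in <keyspace>/<column family>,
    # so stage the files accordingly:
    mkdir -p /tmp/load/my_keyspace/C
    cp /var/lib/cassandra/data/my_keyspace/C/snapshots/migrate/* /tmp/load/my_keyspace/C/

    # Stream into cluster B, pointing -d at one or more of its nodes:
    sstableloader -d 10.0.1.10,10.0.1.11 /tmp/load/my_keyspace/C

Note that snapshots are per-node, so this has to be repeated on each node of cluster A; sstableloader takes care of routing the rows to the right replicas in cluster B. For the ongoing case, set incremental_backups: true in cassandra.yaml on cluster A; after each flush, new SSTables accumulate under a backups/ directory next to the snapshots and can be staged and streamed to cluster B with the same sstableloader invocation on a schedule.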
