Transferring data from one cluster to another in Cassandra


Question

I have an old Cassandra cluster that I want to retire, and I want to transfer data from only a few selected tables in the old cluster to a new cluster that I have created. I tried Cassandra's COPY command on a table with about 15 million rows (roughly 20 columns per row). When I try to import the data from the CSV file into the same table in our new cluster, I keep getting this response:


Failed to import 20 rows: WriteTimeout - Error from server: code=1100 [Coordinator node timed out waiting for replica nodes' responses] message="Operation timed out - received only 0 responses." info={'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'}, will retry later, attempt 1 of 5

Apparently, this approach is not working. Is there a way to stream only some tables from one cluster to another? Note that although we have millions of rows, the data is not that large: the biggest table is about 2.5 GB.

The keyspace is currently configured to use SimpleStrategy. Would using NetworkTopologyStrategy help? I should point out that I only want to stream data from a few tables, leaving the other tables out.

Recommended answer

I have successfully used the same strategy you are using to copy data from one cluster to another.

In general, restoring from a snapshot is recommended. But when the goal is not to restore the whole dataset into a new cluster, only to transfer a few tables that are not very big, exporting with COPY TO and then importing with COPY FROM is a simple and effective strategy.

Stick to your strategy and focus only on the error you are getting.

I would suggest trying a smaller batch size.


  • cqlsh $host -e "use $keyspace; COPY $keyspace.$table FROM '${file}' WITH MAXBATCHSIZE = '1';"
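To tie the export and import steps together, here is a minimal sketch of the whole round trip for one table. It assumes the target table already exists with the same schema on the new cluster, and that cqlsh can reach both clusters from the machine running the script. The function names (build_export_cql, build_import_cql, transfer_table) and the host, keyspace, table, and file values in the example call are placeholders for illustration, not part of the original answer. HEADER and MAXBATCHSIZE are standard cqlsh COPY options; the batch size of 1 simply mirrors the suggestion above and may be slower than necessary.

```shell
#!/bin/sh
# Sketch only: move one table between clusters through a CSV file
# using cqlsh COPY TO (export) and COPY FROM (import).

# Build the export statement, to be run against the OLD cluster.
# Arguments: keyspace, table, CSV file path.
build_export_cql() {
  printf "COPY %s.%s TO '%s' WITH HEADER = true;" "$1" "$2" "$3"
}

# Build the import statement, to be run against the NEW cluster.
# Arguments: keyspace, table, CSV file path, optional max batch size (default 1).
build_import_cql() {
  printf "COPY %s.%s FROM '%s' WITH HEADER = true AND MAXBATCHSIZE = %s;" \
    "$1" "$2" "$3" "${4:-1}"
}

# Export from the old cluster, then import into the new one.
# Arguments: old host, new host, keyspace, table, CSV file path.
transfer_table() {
  cqlsh "$1" -e "$(build_export_cql "$3" "$4" "$5")" &&
    cqlsh "$2" -e "$(build_import_cql "$3" "$4" "$5" 1)"
}

# Example call with hypothetical values (commented out so that sourcing
# this file has no side effects):
# transfer_table old-node.example.com new-node.example.com my_keyspace my_table /tmp/my_table.csv
```

Calling transfer_table once per selected table leaves the other tables untouched. If the import still times out with a batch size of 1, the remaining cause is likely on the cluster side rather than in the COPY options.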
