Cassandra bulk insert operation, internally

Question

I am looking for Cassandra/CQL's cousin of the common SQL idiom of INSERT INTO ... SELECT ... FROM ... and have been unable to find anything to do such an operation programmatically or in CQL. Is it just not supported?

My use case is to do a reasonably bulky copy from one table to another. I don't need any particular concurrency guarantees, but it's a lot of data, so I'd like to avoid the additional network overhead of writing a client that retrieves data from one table and then issues batches of inserts into the other table. I understand that the changes will still need to be transported between nodes of the Cassandra cluster according to the replication setup, but it seems reasonable for there to be an "internal" option to do a bulk operation from one table to another. Is there such a thing in CQL or elsewhere? I'm currently using Hector to talk to Cassandra.

Edit: It looks like sstableloader might be relevant, but it is awfully low-level for something that I'd expect to be a fairly common use case. Taking just a subset of rows from one table to another also seems less than trivial in that framework.

Recommended answer

Correct, this is not supported natively. (Another alternative would be a map/reduce job.) Cassandra's API focuses on short requests for applications at scale, not batch or analytical queries.
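
Since there is no server-side copy, the fallback the question describes is a client that reads rows from one table and writes them to the other in batches. The following is only a minimal sketch of that loop, not a Cassandra or Hector API: `read_rows` and `write_batch` are hypothetical stand-ins for whatever paged read and batched insert calls your client library provides, and `keep` is an assumed optional filter for copying just a subset of rows.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Optional, Tuple

Row = Tuple  # one row, represented as a tuple of column values


def chunked(rows: Iterable[Row], size: int) -> Iterator[List[Row]]:
    """Yield successive lists of at most `size` rows."""
    if size < 1:
        raise ValueError("size must be >= 1")
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def copy_table(
    read_rows: Iterable[Row],
    write_batch: Callable[[List[Row]], None],
    batch_size: int = 100,
    keep: Optional[Callable[[Row], bool]] = None,
) -> int:
    """Stream rows from a source to a sink in bounded batches.

    read_rows   -- hypothetical: any iterable over source rows, ideally lazy
                   (for example, a paged SELECT) so memory use stays bounded.
    write_batch -- hypothetical: a callable that inserts one batch of rows
                   into the target table.
    keep        -- optional predicate selecting the subset of rows to copy.
    Returns the number of rows written.
    """
    selected = (row for row in read_rows if keep is None or keep(row))
    copied = 0
    for chunk in chunked(selected, batch_size):
        write_batch(chunk)
        copied += len(chunk)
    return copied


if __name__ == "__main__":
    # In-memory stand-ins for the two tables, for illustration only.
    source = [(i, f"user{i}") for i in range(250)]
    target: List[Row] = []
    n = copy_table(source, target.extend, batch_size=100,
                   keep=lambda row: row[0] % 2 == 0)
    print(n, len(target))
```

Every row still crosses the network twice in this approach (once on the read, once on the write), which is exactly the overhead the question hoped to avoid; keeping batches small and the read lazy only bounds the client's memory use.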
