What is the best way to read data from Cassandra in parallel?

Question

I'm new to Cassandra and I'm trying to figure out how I should store data so that I can perform fast reads in parallel. I have read that partitioning data can cause performance issues. Is it possible to read data from Cassandra tables in the same partition in parallel?

Recommended answer

DataStax's Oliver Michallat has a good blog post which discusses this:

Asynchronous queries with the Java driver

In that article, he describes how to code in-parallel queries to solve the issues associated with multi-partition-key queries.

In his example, instead of running a single multi-key query (from Java) like this:

SELECT * FROM users WHERE id IN (
    e6af74a8-4711-4609-a94f-2cbfab9695e5,
    281336f4-2a52-4535-847c-11a4d3682ec1);

A better way is to issue one query per key and collect the results through an asynchronous "future", like this:

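import java.util.List;
import java.util.UUID;
import java.util.concurrent.Future;
import com.datastax.driver.core.ResultSet;

// ResultSets.queryAllAsList is taken as-is from the answer; it appears to be a helper
// defined in the blog post (one query per key), not a method of the Java driver itself.
Future<List<ResultSet>> future = ResultSets.queryAllAsList(session,
    "SELECT * FROM users WHERE id = ?",
    UUID.fromString("e6af74a8-4711-4609-a94f-2cbfab9695e5"),
    UUID.fromString("281336f4-2a52-4535-847c-11a4d3682ec1")
);

// get() blocks until every query has finished (and throws checked exceptions).
for (ResultSet rs : future.get()) {
    // process each result set here
}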
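
To make the fan-out idea concrete, here is a minimal sketch of what such a helper might look like. It is not the blog post's implementation: it uses only the JDK's CompletableFuture, and `fetchUser` is a hypothetical stand-in for a real asynchronous driver call (something like `session.executeAsync(...)`), so treat every name here as an assumption.

```java
import java.util.Arrays;
import java.util.List;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;
import java.util.stream.Collectors;

public class ParallelReads {

    // Fans out one asynchronous query per key and gathers the results in key order.
    // `asyncQuery` is a hypothetical stand-in for an async driver call.
    public static <K, R> CompletableFuture<List<R>> queryAllAsList(
            List<K> keys, Function<K, CompletableFuture<R>> asyncQuery) {
        // Start every query first, so that all of them are in flight at once.
        List<CompletableFuture<R>> inFlight = keys.stream()
                .map(asyncQuery)
                .collect(Collectors.toList());
        // Complete once every query has completed, then collect the results.
        return CompletableFuture.allOf(inFlight.toArray(new CompletableFuture[0]))
                .thenApply(done -> inFlight.stream()
                        .map(CompletableFuture::join)
                        .collect(Collectors.toList()));
    }

    // Simulated single-partition read; a real version would run
    // "SELECT * FROM users WHERE id = ?" against the cluster.
    public static CompletableFuture<String> fetchUser(UUID id) {
        return CompletableFuture.supplyAsync(() -> "user-" + id);
    }

    public static void main(String[] args) {
        List<UUID> ids = Arrays.asList(
                UUID.fromString("e6af74a8-4711-4609-a94f-2cbfab9695e5"),
                UUID.fromString("281336f4-2a52-4535-847c-11a4d3682ec1"));
        // join() blocks until both simulated queries have finished.
        List<String> rows = queryAllAsList(ids, ParallelReads::fetchUser).join();
        rows.forEach(System.out::println);
    }
}
```

The point of starting all the queries before waiting on any of them is that each single-key query can be routed to a replica that owns that key, rather than having one coordinator gather every key of an IN clause.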

As for querying data from within the same partition, of course you can. I assume that you mean with differing clustering keys (otherwise there would be no point), and that should work in a similar way to what is listed above.
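
For the same-partition case, a rough sketch might look like the following. It assumes a hypothetical table whose partition key is a sensor id and whose clustering key is a day; the in-memory map stands in for a single partition and `fetchRow` for an asynchronous driver call, so none of these names come from the original answer.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

public class SamePartitionReads {

    // In-memory stand-in for ONE partition: rows sorted by clustering key (a day).
    static final Map<String, String> PARTITION = new TreeMap<>();
    static {
        PARTITION.put("2020-01-01", "reading-a");
        PARTITION.put("2020-01-02", "reading-b");
        PARTITION.put("2020-01-03", "reading-c");
    }

    // Simulates a query such as
    // "SELECT * FROM readings WHERE sensor_id = ? AND day = ?" (hypothetical schema).
    public static CompletableFuture<String> fetchRow(String day) {
        return CompletableFuture.supplyAsync(() -> PARTITION.get(day));
    }

    // Starts one read per clustering key, then waits for all of them in request order.
    public static List<String> fetchRows(List<String> days) {
        List<CompletableFuture<String>> inFlight = days.stream()
                .map(SamePartitionReads::fetchRow)
                .collect(Collectors.toList());
        return inFlight.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(fetchRows(Arrays.asList("2020-01-03", "2020-01-01")));
    }
}
```

Note that all of these reads target the same partition and therefore the same set of replicas, so the parallelism here overlaps requests rather than spreading load across the cluster.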
