Cassandra lookup by list of primary keys in Java

Views: 27 | Published: 2021/12/31 17:50:15 | Tags: java cassandra cql cassandra-3.0 datastax-java-driver


Question

I am implementing a feature which requires looking up Cassandra by a list of primary keys.

Below is some example data, where id is the primary key:

mytable
id          column1
1           423
2           542
3           678
4           45534
5           435634
6           2435
7           678
8           4564
9           546

Most of my queries are lookups by id, but for some special cases I would like to get the data for a list of ids. The way I am currently doing it is as follows:


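// Existing single-row lookup (implementation omitted).
public Object fetchFromCassandraForId(int id);

int[] ids = {1, 3, 5, 7, 9};
List<Object> results = new ArrayList<>();  // must be initialized before use
for (int id : ids) {
  results.add(fetchFromCassandraForId(id));  // one network round trip per id
}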

This results in multiple network calls to Cassandra. Is it possible to batch this somehow? In other words, does Cassandra support a fast lookup by a list of ids, such as:

select column1 from mytable where id in (1, 3, 5, 7, 9);

Any help or pointers would be appreciated.

Solution

If id is the full primary key, then Cassandra supports this, although it's not recommended from a performance point of view:

the request is sent to a coordinator node
the coordinator node finds a replica for each id and sends an individual request to each of them
the coordinator waits for the results from every node, collects them into one result set, and sends it back

As a result:

all of your sub-queries have to wait for the slowest replica
you have an additional network hop, from coordinator to replica
you put more pressure on the coordinator node, as it needs to keep the results in memory
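
For reference, the IN form discussed above needs one bind marker per id when it is used as a prepared statement with a variable number of ids. The following is only a minimal sketch of building that statement text, using the table and column names from the question; the class and method names are hypothetical, and it does not talk to Cassandra at all.

```java
import java.util.Collections;

final class InQueryBuilder {

    // Builds the CQL text with one "?" bind marker per id.
    // Table and column names are taken from the question's example.
    static String selectByIds(int idCount) {
        if (idCount < 1) {
            throw new IllegalArgumentException("need at least one id");
        }
        String markers = String.join(", ", Collections.nCopies(idCount, "?"));
        return "SELECT column1 FROM mytable WHERE id IN (" + markers + ")";
    }

    public static void main(String[] args) {
        // prints: SELECT column1 FROM mytable WHERE id IN (?, ?, ?, ?, ?)
        System.out.println(selectByIds(5));
    }
}
```

The drawbacks listed above still apply to the resulting query: the coordinator has to fan it out and hold all the results.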

If you instead issue many parallel, asynchronous requests from the application, one per id value, then you:

avoid the additional hop: if you're using prepared statements with token-aware load balancing, each query is sent directly to a replica
can start processing results as they arrive, instead of waiting for everything

So sending parallel asynchronous requests could be faster than sending one request with IN.
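
The fan-out pattern described above could be sketched roughly as follows. This is not driver code: the lookup parameter is a hypothetical stand-in for whatever asynchronous per-id call your driver offers, and main simulates it with an in-memory map holding the rows from the question. With the DataStax Java driver 4.x, an executeAsync call on a prepared, bound statement returns a CompletionStage that could be adapted to this shape, but that wiring is an assumption left out here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.function.IntFunction;

final class ParallelLookup {

    // Starts one asynchronous lookup per id, then waits for all of them.
    // "lookup" is a hypothetical placeholder for the real per-id query.
    static <T> List<T> fetchAll(List<Integer> ids,
                                IntFunction<CompletableFuture<T>> lookup) {
        // Fire every request before waiting on any of them.
        List<CompletableFuture<T>> pending = new ArrayList<>();
        for (int id : ids) {
            pending.add(lookup.apply(id));
        }
        // join() blocks until that future completes; results keep the order of ids.
        List<T> results = new ArrayList<>();
        for (CompletableFuture<T> future : pending) {
            results.add(future.join());
        }
        return results;
    }

    public static void main(String[] args) {
        // Simulated table using rows from the question's example data.
        Map<Integer, Integer> table = Map.of(1, 423, 3, 678, 5, 435634, 7, 678, 9, 546);
        List<Integer> rows = fetchAll(
                List.of(1, 3, 5, 7, 9),
                id -> CompletableFuture.supplyAsync(() -> table.get(id)));
        System.out.println(rows); // prints [423, 678, 435634, 678, 546]
    }
}
```

To process rows as they arrive rather than after all of them complete, a callback such as thenAccept could be attached to each future instead of collecting them with join().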
