Cassandra SELECT DISTINCT and timeout issue

Question

When running the following CQL query:

SELECT DISTINCT partition_key FROM table_name;

This is supposed to return the list of partition keys in use for the given table. However, with the default timeout setting of 10s, it always times out:

ReadTimeout: Error from server: code=1200 [Coordinator node timed out waiting for replica nodes' responses] message="Operation timed out - received only 0 responses." info={'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'}

I then changed the timeout settings to:

read_request_timeout_in_ms: 60000
range_request_timeout_in_ms: 60000
request_timeout_in_ms: 60000

Running the query with these settings causes several Cassandra nodes to crash, including the coordinator node. The table has more than 100M rows and about 5000 unique partition keys.

Is there a workaround to find the unique list of partition keys?

Recommended answer

This query should work fine on modern versions of Cassandra (2.1 and newer), assuming you're using a client that supports paging/fetch size and you use a sufficiently low fetch size (the actual limit depends on your server load).

With a third-party driver, look for an option to lower the page/fetch size. Set it to 100 and see if it behaves better.
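As an illustration of lowering the fetch size from a driver, here is a minimal sketch assuming the DataStax Python driver (cassandra-driver). It relies on the driver's Session.default_fetch_size attribute and on result sets fetching further pages transparently during iteration. The helper name distinct_partition_keys, the contact point 127.0.0.1, and the keyspace my_keyspace are placeholders rather than anything from the question, and the sketch assumes a single-column partition key.

```python
def distinct_partition_keys(session, table, key_column, fetch_size=100):
    """Return the distinct partition key values of `table`, read in small pages.

    `session` is expected to behave like a cassandra.cluster.Session.
    Assumes a single-column partition key (hypothetical helper, not a driver API).
    """
    # default_fetch_size caps how many rows the server sends back per page.
    session.default_fetch_size = fetch_size
    rows = session.execute(f"SELECT DISTINCT {key_column} FROM {table}")
    # Iterating the result requests the next page only when the current
    # one is exhausted, so each request to the cluster stays small.
    return [row[0] for row in rows]


def main():
    # Imported here so the helper above can be exercised without the driver.
    from cassandra.cluster import Cluster  # pip install cassandra-driver

    cluster = Cluster(["127.0.0.1"])          # assumed contact point
    session = cluster.connect("my_keyspace")  # hypothetical keyspace name
    try:
        keys = distinct_partition_keys(session, "table_name", "partition_key")
        print(len(keys), "distinct partition keys")
    finally:
        cluster.shutdown()
```

Calling main() against a reachable cluster runs the query with 100 rows per page. If 100 behaves well, the fetch size can be raised gradually until the cluster starts to struggle.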

With cqlsh, if you have Cassandra 3.0 or newer, try PAGING 100;
