Operation Timed Out Error in the cqlsh Console of Cassandra


Problem description

I have a three-node Cassandra cluster, and I have created a table with more than 2,000,000 rows.

When I run this query in cqlsh (select count(*) from userdetails), I get this error:

OperationTimedOut: errors={}, last_host=192.168.1.2


When I run the count on fewer rows, or with a limit of 50,000, it works fine.

Recommended answer

count(*) actually pages through all the data, so a select count(*) from userdetails without a limit can be expected to time out with that many rows. Some details here: http://planetcassandra.org/blog/counting-key-in-cassandra/

You may want to consider maintaining the count yourself or using Spark. If you just want a ballpark number, you can grab it from JMX.

Grabbing it from JMX can be a little tricky, depending on your data model. To get the number of partitions, grab the org.apache.cassandra.metrics:type=ColumnFamily,keyspace={{Keyspace}},scope={{Table}},name=EstimatedColumnCountHistogram mbean and sum all 90 values (this is what nodetool cfstats outputs). It only gives you the number that exist in sstables, so to make it more accurate you can do a flush first, or try to estimate the number in memtables from the MemtableColumnsCount mbean.
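The summing step can be sketched in Python. This is a minimal illustration, assuming the 90 histogram bucket values have already been read from the mbean by some JMX client into a plain list. The function name, the sample bucket values, and the optional memtable estimate are hypothetical and not part of any Cassandra API.

```python
def estimate_partitions_from_histogram(bucket_counts, memtable_estimate=0):
    """Sum the histogram buckets to approximate the partitions held in sstables.

    bucket_counts: values of the EstimatedColumnCountHistogram mbean,
                   assumed to have been fetched beforehand via JMX.
    memtable_estimate: optional rough guess for partitions still in memtables
                       (leave at 0 if the table was flushed first).
    """
    if any(count < 0 for count in bucket_counts):
        raise ValueError("bucket counts must be non-negative")
    return sum(bucket_counts) + memtable_estimate


# Made-up values for illustration: 90 buckets, most of them empty.
buckets = [0] * 90
buckets[10] = 1_200_000
buckets[11] = 750_000

print(estimate_partitions_from_histogram(buckets, memtable_estimate=50_000))
```

The result covers only the node the values were read from, so it is an estimate for that node rather than for the whole cluster.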

For a very basic ballpark number, you can grab the estimated partition counts from system.size_estimates across all the ranges listed (note that this is only the number on one node). Multiply that by the number of nodes, then divide by RF (the replication factor).
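The arithmetic above can be sketched in Python. This is a rough illustration under stated assumptions: the list of counts stands in for the partitions_count column of system.size_estimates read on one node for the table in question, and the function name, the sample numbers, and the replication factor of 2 are made up for the example.

```python
def estimate_total_partitions(partition_counts, num_nodes, replication_factor):
    """Scale one node's size_estimates rows up to a cluster-wide ballpark.

    partition_counts: partitions_count values for every range listed
                      on a single node (assumed to be fetched beforehand).
    """
    if num_nodes <= 0 or replication_factor <= 0:
        raise ValueError("num_nodes and replication_factor must be positive")
    local_total = sum(partition_counts)  # partitions estimated on this node
    # Multiply by the number of nodes, then divide by RF to undo replication.
    return local_total * num_nodes // replication_factor


# Hypothetical per-range estimates from one node of a three-node cluster.
sample_counts = [210_000, 195_000, 205_000]
print(estimate_total_partitions(sample_counts, num_nodes=3, replication_factor=2))
```

The result is only as good as the assumption that data is spread evenly across nodes, so treat it as a ballpark figure rather than a count.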
