Paging large resultsets in Cassandra with CQL3 with varchar keys


Question

I'm updating some old Thrift-based code to CQL3.

One part of the code walks through the entire dataset of a table with 20M+ rows. This part originally crashed the program due to memory usage, so I created a RowIterator class that iterates through the column family using TokenRanges (and Hector).

While rewriting this with CQL3, I'm having trouble paging through the data. I found some information at http://www.datastax.com/documentation/cql/3.0/cql/cql_using/paging_c.html, but when I try this code for the first "page":

resultSet = session.execute("select * from " + TABLE + " where token(key) <= token(" + offset + ")");

I get this error:

com.datastax.driver.core.exceptions.InvalidTypeException: Invalid type for value 0 of CQL type varchar, expecting class java.lang.String but class java.lang.Integer provided

Granted, the example at the link uses numerical keys. Is there a way to do this with varchar (UTF8Type) keys?

It seems there is now built-in functionality for this (https://issues.apache.org/jira/browse/CASSANDRA-4415), but I can't find examples that get me going. Besides, I have to solve it for Cassandra 1.2.9 for now.

Recommended answer

The easy answer is to upgrade to Cassandra 2.0.x and use the new built-in paging functionality. To get it done on Cassandra 1.2, though, you are on the right path. Your syntax should work; do you get the same error if you run the query in cqlsh? When paging like this it is best to use ">" as in the linked example, so that might be the issue. Start with

    select * from table limit 100

and then continue with

    select * from table where token(key) > token('last key') limit 100

where 'last key' is the key of the last row returned by the previous query.

I would also try it with a prepared statement. The string manipulation may be doing something funny to the offset.
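The paging loop described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the answerer's code: the class name TokenPagingSketch, the table name my_table, and the in-memory stand-in for the table are all hypothetical, and the query is hidden behind a plain function so the loop can run without a cluster. The comments show how the DataStax Java driver might be plugged in, with the key bound as a String through a prepared statement rather than concatenated into the query.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.BiFunction;

class TokenPagingSketch {

    // Hypothetical CQL for the two kinds of page; "my_table" and "key" are placeholders.
    // The page size is an int appended by the caller; only the key is a bind marker.
    static final String FIRST_PAGE = "SELECT * FROM my_table LIMIT ";
    static final String NEXT_PAGE = "SELECT * FROM my_table WHERE token(key) > token(?) LIMIT ";

    // fetchPage takes (lastKey or null, pageSize) and returns the keys of one page.
    // With the DataStax driver it might look like this (assumption, not run here):
    //   PreparedStatement next = session.prepare(NEXT_PAGE + pageSize);
    //   ResultSet rs = (lastKey == null)
    //       ? session.execute(FIRST_PAGE + pageSize)
    //       : session.execute(next.bind(lastKey));   // lastKey is a String
    //   for (Row row : rs) { keys.add(row.getString("key")); }
    static List<String> readAllKeys(BiFunction<String, Integer, List<String>> fetchPage,
                                    int pageSize) {
        List<String> seen = new ArrayList<>();
        String lastKey = null; // null means "no offset yet", i.e. the first page
        while (true) {
            List<String> page = fetchPage.apply(lastKey, pageSize);
            seen.addAll(page);
            if (page.size() < pageSize) {
                break; // a short (or empty) page means the table is exhausted
            }
            lastKey = page.get(page.size() - 1); // offset for the next query
        }
        return seen;
    }

    // In-memory stand-in for the table: rows are ordered by a fake "token"
    // (the key's hashCode, ties broken by the key itself), not by the key's text.
    static BiFunction<String, Integer, List<String>> inMemoryTable(List<String> keys) {
        Comparator<String> tokenOrder = (a, b) -> {
            int c = Integer.compare(a.hashCode(), b.hashCode());
            return c != 0 ? c : a.compareTo(b);
        };
        List<String> sorted = new ArrayList<>(keys);
        sorted.sort(tokenOrder);
        return (lastKey, limit) -> {
            List<String> page = new ArrayList<>();
            for (String k : sorted) {
                if (page.size() == limit) {
                    break;
                }
                // Strictly greater than the last key's token, as in token(key) > token(?)
                if (lastKey == null || tokenOrder.compare(k, lastKey) > 0) {
                    page.add(k);
                }
            }
            return page;
        };
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("alice", "bob", "carol", "dave", "erin");
        List<String> visited = readAllKeys(inMemoryTable(keys), 2);
        System.out.println("visited " + visited.size() + " keys: " + visited);
    }
}
```

Using a strict ">" on the token of the last key seen is what keeps the boundary row from being returned twice. Rows come back in token order rather than key order, so the loop only guarantees that each key is visited, not that keys arrive sorted.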
