Composite key in Cassandra with Pig
Question
We have a CQL table that looks something like this:
CREATE table data (
occurday text,
seqnumber int,
occurtimems bigint,
unique bigint,
fields map<text, text>,
primary key ((occurday, seqnumber), occurtimems, unique)
)
I can query this table from cqlsh like this:
select * from data where seqnumber = 10 AND occurday = '2013-10-01';
This query works and returns the expected data.
If I execute the same query as part of a LOAD from within Pig, however, it fails:
-- Need to URL encode the query
data = LOAD 'cql://ks/data?where_clause=seqnumber%3D10%20AND%20occurday%3D%272013-10-01%27' USING CqlStorage();
This gives:
InvalidRequestException(why:seqnumber cannot be restricted by more than one relation if it includes an Equal)
at org.apache.cassandra.thrift.Cassandra$prepare_cql3_query_result.read(Cassandra.java:39567)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
at org.apache.cassandra.thrift.Cassandra$Client.recv_prepare_cql3_query(Cassandra.java:1625)
at org.apache.cassandra.thrift.Cassandra$Client.prepare_cql3_query(Cassandra.java:1611)
at org.apache.cassandra.hadoop.cql3.CqlPagingRecordReader$RowIterator.prepareQuery(CqlPagingRecordReader.java:591)
at org.apache.cassandra.hadoop.cql3.CqlPagingRecordReader$RowIterator.executeQuery(CqlPagingRecordReader.java:621)
Shouldn't these behave the same? Why does the version through Pig fail when the plain cqlsh command works?
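For reference, the URL-encoded where_clause in the LOAD statement above does not have to be written by hand. The following is a minimal Python sketch, assuming the standard library's urllib.parse.quote is an acceptable encoder; the helper name encode_where_clause is made up for illustration and is not part of Pig or Cassandra.

```python
from urllib.parse import quote


def encode_where_clause(where: str) -> str:
    """Percent-encode a CQL where clause for use in a cql:// location string.

    Hypothetical helper for illustration only.
    """
    # safe="" makes quote() encode every reserved character,
    # including '=', the single quote, and spaces.
    return quote(where, safe="")


where = "seqnumber=10 AND occurday='2013-10-01'"
location = "cql://ks/data?where_clause=" + encode_where_clause(where)
print(location)
```

The printed location should match the string passed to LOAD above, which suggests the encoding itself is not the cause of the failure.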
Recommended answer
Hadoop is using CqlPagingRecordReader to try to load your data. This is leading to queries that are not identical to what you have entered. The paging record reader is trying to obtain small slices of Cassandra data at a time to avoid timeouts.
This means that your query is actually executed as:
SELECT * FROM "data" WHERE token("occurday","seqnumber") > ? AND
token("occurday","seqnumber") <= ? AND occurday='A Great Day'
AND seqnumber=1 LIMIT 1000 ALLOW FILTERING
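This is why you are seeing the "cannot be restricted by more than one relation" error: the partition key columns are restricted both by the token() range that the reader adds and by the equality relations from your where_clause. I'll submit a bug to the Cassandra project.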
Jira:
https://issues.apache.org/jira/browse/CASSANDRA-6151
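To make the conflict concrete, here is a rough Python sketch of how a paging reader might wrap a user-supplied where clause in a token range. It is a simplified illustration, not the actual CqlPagingRecordReader code: the function build_paged_query and its parameters are hypothetical, and the real reader's logic may differ in detail.

```python
def build_paged_query(table, partition_keys, user_where, page_size=1000):
    """Assemble a paged CQL query string (hypothetical, simplified).

    The token() range is added so each page covers one slice of the ring;
    the user's where clause is then appended unchanged.
    """
    quoted_keys = ",".join('"%s"' % key for key in partition_keys)
    token = "token(%s)" % quoted_keys
    clauses = [token + " > ?", token + " <= ?"]
    if user_where:
        clauses.append(user_where)
    return 'SELECT * FROM "%s" WHERE %s LIMIT %d ALLOW FILTERING' % (
        table, " AND ".join(clauses), page_size)


query = build_paged_query(
    "data",
    ["occurday", "seqnumber"],
    "seqnumber=10 AND occurday='2013-10-01'",
)
print(query)
```

In the resulting string, occurday and seqnumber appear both inside token(...) and in the equality relations, which is the combination the server rejects.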