How to model Cassandra in this particular situation?
Problem description
If I have the table structure below, how can I query by
"source = 'abc' and created_at >= '2016-01-01 00:00:00'"?
CREATE TABLE articles (
id text,
source text,
created_at timestamp,
category text,
channel text,
last_crawled timestamp,
text text,
thumbnail text,
title text,
url text,
PRIMARY KEY (id)
);
I would like to model my system according to this: http://www.ebaytechblog.com/2012/07/16/cassandra-data-modeling-best-practices-part-1/
Edit:
What we are doing is very similar to what you are proposing. The difference is that our primary key doesn't have brackets around source:
PRIMARY KEY (source, created_at, id)
We also have two other indexes:
CREATE INDEX articles_id_idx ON crawler.articles (id);
CREATE INDEX articles_url_idx ON crawler.articles (url);
Our system is really slow like this. What do you suggest?
Recommended answer
Given the table structure
CREATE TABLE articles (
id text,
source text,
created_at timestamp,
category text,
channel text,
last_crawled timestamp,
text text,
thumbnail text,
title text,
url text,
PRIMARY KEY ((source), created_at, id)
);
you can issue the following queries:
SELECT * FROM articles WHERE source = 'xxx'; // Give me all articles for the source xxx
SELECT * FROM articles WHERE source = 'xxx' AND created_at > '2016-01-01 00:00:00'; // Give me all articles whose source is xxx and that were created after 2016-01-01 00:00:00
The pair (created_at, id) in the primary key is there to guarantee article uniqueness. Indeed, two different articles can have the same created_at time.
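To make the role of each key component more concrete, here is a rough in-memory sketch in Python. It is only a toy model, not the Cassandra driver or its real storage engine: `partitions`, `upsert` and `created_after` are hypothetical names invented for this illustration. It assumes rows are grouped by the partition key (source) and ordered within a partition by the clustering columns (created_at, id).

```python
from bisect import bisect_right
from collections import defaultdict
from datetime import datetime

# Toy model (an assumption for illustration, not Cassandra itself):
# partition key -> {(created_at, id): title}
partitions = defaultdict(dict)


def upsert(source, created_at, article_id, title):
    """Write a row; an identical (source, created_at, id) overwrites the old row."""
    partitions[source][(created_at, article_id)] = title


def created_after(source, ts):
    """Mimic: WHERE source = ? AND created_at > ? (only one partition is read)."""
    rows = partitions.get(source, {})
    keys = sorted(rows)                  # clustering order: created_at, then id
    times = [k[0] for k in keys]         # already sorted because keys are sorted
    start = bisect_right(times, ts)      # first position with created_at > ts
    return [(k[0], k[1], rows[k]) for k in keys[start:]]
```

In this model, two articles written with the same source and the same created_at but different ids are stored as two separate rows, whereas with only (source, created_at) as the key the second write would replace the first.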