Cassandra indexes vs materialized view
Question
I have the following Cassandra table structure:
CREATE TABLE ringostat.hits (
hitId uuid,
clientId VARCHAR,
session MAP<VARCHAR, TEXT>,
traffic MAP<VARCHAR, TEXT>,
PRIMARY KEY (hitId, clientId)
);
INSERT INTO ringostat.hits (hitId, clientId, session, traffic)
VALUES (550e8400-e29b-41d4-a716-446655440000, 'clientId', {'id': '1', 'number': '1', 'startTime': '1460023732', 'endTime': '1460023762'}, {'referralPath': '/example_path_for_example', 'campaign': '(not set)', 'source': 'www.google.com', 'medium': 'referal', 'keyword': '(not set)', 'adContent': '(not set)', 'campaignId': '', 'gclid': '', 'yclid': ''});
INSERT INTO ringostat.hits (hitId, clientId, session, traffic)
VALUES (650e8400-e29b-41d4-a716-446655440000, 'clientId', {'id': '1', 'number': '1', 'startTime': '1460023732', 'endTime': '1460023762'}, {'referralPath': '/example_path_for_example', 'campaign': '(not set)', 'source': 'www.google.com', 'medium': 'cpc', 'keyword': '(not set)', 'adContent': '(not set)', 'campaignId': '', 'gclid': '', 'yclid': ''});
INSERT INTO ringostat.hits (hitId, clientId, session, traffic)
VALUES (750e8400-e29b-41d4-a716-446655440000, 'clientId', {'id': '1', 'number': '1', 'startTime': '1460023732', 'endTime': '1460023762'}, {'referralPath': '/example_path_for_example', 'campaign': '(not set)', 'source': 'www.google.com', 'medium': 'referal', 'keyword': '(not set)', 'adContent': '(not set)', 'campaignId': '', 'gclid': '', 'yclid': ''});
I want to select all rows where source = 'www.google.com' and medium = 'referal' (spelled as it is stored in the data above).
SELECT * FROM hits WHERE traffic['source'] = 'www.google.com' AND traffic['medium'] = 'referal' ALLOW FILTERING;
Without ALLOW FILTERING, I get the error: No supported secondary index found for the non primary key columns restrictions.
That's why I see the following options:
- Create an index on the traffic column.
- Create a materialized view.
- Create another table and set an INDEX for the traffic column.
Which is the best option? Also, I have many fields of MAP type that I will need to filter on. What issues could arise if I add an INDEX on every one of them?
Thanks.
Recommended answer
From "When to use an index":
Do not use an index in these situations:
- On high-cardinality columns because you then query a huge volume of records for a small number of results. [...] Conversely, creating an index on an extremely low-cardinality column, such as a boolean column, does not make sense.
- In tables that use a counter column
- On a frequently updated or deleted column.
- To look for a row in a large partition unless narrowly queried.
If your planned usage meets one or more of these criteria, it is probably better to use a materialized view.
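As a rough illustration for the table in the question, the two alternatives might look like the sketch below. It is not a tested schema. The index name hits_traffic_entries_idx, the table hits_flat, and the view hits_by_source are made up for this example, and it assumes a Cassandra version that supports ENTRIES indexes on maps and materialized views (3.0 or later). Because a collection column such as traffic cannot be part of a primary key, the view is built on a hypothetical table in which source and medium are stored as regular columns.

```cql
-- Option 1: secondary index on the map entries of traffic.
-- The index name is an assumption for this sketch.
CREATE INDEX hits_traffic_entries_idx ON ringostat.hits (ENTRIES(traffic));

-- A single map-entry restriction can then be served by the index.
SELECT * FROM ringostat.hits WHERE traffic['source'] = 'www.google.com';

-- Option 2: materialized view.
-- Hypothetical base table with source and medium copied out of the map.
CREATE TABLE ringostat.hits_flat (
    hitId uuid,
    clientId varchar,
    source text,
    medium text,
    PRIMARY KEY (hitId, clientId)
);

-- The view key must contain every base primary key column and may add
-- one further column (here: source). Each key column needs IS NOT NULL.
CREATE MATERIALIZED VIEW ringostat.hits_by_source AS
    SELECT hitId, clientId, source, medium
    FROM ringostat.hits_flat
    WHERE source IS NOT NULL AND hitId IS NOT NULL AND clientId IS NOT NULL
    PRIMARY KEY (source, hitId, clientId);

-- Reads all hits for one source from a single view partition.
SELECT * FROM ringostat.hits_by_source WHERE source = 'www.google.com';
```

In both variants, an additional restriction on medium may still need ALLOW FILTERING. If source and medium are always queried together, a separate table written by the application with PRIMARY KEY ((source, medium), hitId) is another possibility.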