Cassandra - alternate way for clustering key with ORDER BY and UPDATE
Question
My schema is:
CREATE TABLE friends (
userId timeuuid,
friendId timeuuid,
status varchar,
ts timeuuid,
PRIMARY KEY (userId,friendId)
);
CREATE TABLE friends_by_status (
userId timeuuid,
friendId timeuuid,
status varchar,
ts timeuuid,
PRIMARY KEY ((userId,status), ts)
)with clustering order by (ts desc);
Whenever a friend request is made, I insert a record into both tables. When I want to check the one-to-one status between two users, I use this query:
SELECT status FROM friends WHERE userId=xxx AND friendId=xxx;
When I need to query all the records with pending status, I use:
SELECT * FROM friends_by_status WHERE userId=xxx AND status='pending';
But when the status changes, I can update 'status' and 'ts' in the 'friends' table, but not in the 'friends_by_status' table, because there both columns are part of the PRIMARY KEY.
As you can see, even though the data is denormalised, I still need to update 'status' and 'ts' in the 'friends_by_status' table to maintain consistency.
The only way I can maintain consistency is to delete the record and insert it again.
But frequent deletes are also not recommended in the Cassandra model, as was said at the Cassandra Spotify summit.
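For illustration, the delete-and-reinsert workaround described above might look roughly like the following sketch against the schema shown earlier. All UUID literals are made-up placeholders, and the new ts is assumed to be generated by the client so that both tables receive the same value (calling now() in each statement would produce different values).

```cql
-- Sketch only: every UUID below is a hypothetical placeholder.
--   user   = 1e4a2c10-49f9-11e5-9e3c-ab179e6a6326
--   friend = 2f5b3d20-49f9-11e5-9e3c-ab179e6a6326
--   old ts = a02f7fe0-49f9-11e5-9e3c-ab179e6a6326
--   new ts = add10830-49f9-11e5-9e3c-ab179e6a6326
BEGIN BATCH
  -- In 'friends', status and ts are regular columns, so a plain UPDATE works.
  UPDATE friends
     SET status = 'accepted', ts = add10830-49f9-11e5-9e3c-ab179e6a6326
   WHERE userId = 1e4a2c10-49f9-11e5-9e3c-ab179e6a6326
     AND friendId = 2f5b3d20-49f9-11e5-9e3c-ab179e6a6326;

  -- In 'friends_by_status', status and ts are key columns: remove the old row...
  DELETE FROM friends_by_status
   WHERE userId = 1e4a2c10-49f9-11e5-9e3c-ab179e6a6326
     AND status = 'pending'
     AND ts = a02f7fe0-49f9-11e5-9e3c-ab179e6a6326;

  -- ...and write a new row under the new (userId, status) partition and ts.
  INSERT INTO friends_by_status (userId, friendId, status, ts)
  VALUES (1e4a2c10-49f9-11e5-9e3c-ab179e6a6326,
          2f5b3d20-49f9-11e5-9e3c-ab179e6a6326,
          'accepted',
          add10830-49f9-11e5-9e3c-ab179e6a6326);
APPLY BATCH;
```

Note that the DELETE needs the old ts, so the current row in 'friends' has to be read first, and every status change leaves a tombstone in 'friends_by_status'.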
I find this to be the biggest limitation in Cassandra.
Is there any other way to solve this issue?
Any solution is appreciated.
Recommended answer
I don't know how soon you need to deploy this, but in Cassandra 3.0 you could handle this with a materialized view. Your friends table would be the base table, and friends_by_status would be a view of the base table. Cassandra would take care of updating the view whenever you change the base table.
For example:
CREATE TABLE friends ( userid int, friendid int, status varchar, ts timeuuid, PRIMARY KEY (userId,friendId) );
CREATE MATERIALIZED VIEW friends_by_status AS
SELECT userId from friends WHERE userID IS NOT NULL AND friendId IS NOT NULL AND status IS NOT NULL AND ts IS NOT NULL
PRIMARY KEY ((userId,status), friendID);
INSERT INTO friends (userid, friendid, status, ts) VALUES (1, 500, 'pending', now());
INSERT INTO friends (userid, friendid, status, ts) VALUES (1, 501, 'accepted', now());
INSERT INTO friends (userid, friendid, status, ts) VALUES (1, 502, 'pending', now());
SELECT * FROM friends;
userid | friendid | status | ts
--------+----------+----------+--------------------------------------
1 | 500 | pending | a02f7fe0-49f9-11e5-9e3c-ab179e6a6326
1 | 501 | accepted | a6c80980-49f9-11e5-9e3c-ab179e6a6326
1 | 502 | pending | add10830-49f9-11e5-9e3c-ab179e6a6326
Now, in the view:
SELECT * FROM friends_by_status WHERE userid=1 AND status='pending';
userid | status | friendid
--------+---------+----------
1 | pending | 500
1 | pending | 502
(2 rows)
And then when you update the status in the base table, it is automatically updated in the view:
UPDATE friends SET status='pending' WHERE userid=1 AND friendid=501;
SELECT * FROM friends_by_status WHERE userid=1 AND status='pending';
userid | status | friendid
--------+---------+----------
1 | pending | 500
1 | pending | 501
1 | pending | 502
(3 rows)
But note that the view can't have ts as part of the key, since you can only add one non-key column from the base table to the view's primary key, and in your case that column is 'status'.
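If ts is still needed when reading pending requests, one option is to carry it in the view as a regular column. The following is a sketch of a variant of the view definition above, assuming it replaces the earlier view rather than being created alongside it:

```cql
-- Sketch: variant of the view above that also selects ts.
-- ts stays a regular (non-key) column, so the one-extra-key-column rule still holds.
CREATE MATERIALIZED VIEW friends_by_status AS
  SELECT userid, friendid, status, ts FROM friends
  WHERE userid IS NOT NULL AND friendid IS NOT NULL AND status IS NOT NULL
  PRIMARY KEY ((userid, status), friendid);

-- ts is returned with each row, but rows come back ordered by friendid.
SELECT friendid, ts FROM friends_by_status WHERE userid = 1 AND status = 'pending';
```

Because friendid is the clustering column, any ordering by ts would have to be done on the client after the rows are fetched.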
I think the first beta release for 3.0 is coming out tomorrow, if you want to try this out.