Data reference and updates in Cassandra tables
Question
I have a table called 'usertab' that stores user details such as:
userid uuid,
firstname text,
lastname text,
email text,
gender int,
image text
Most of the other tables contain userid as a field that references 'usertab', but when I retrieve data from another table, I need to execute another select query to get the user details.
So if 10,000 or more rows are retrieved, the same number of select queries is executed to get the user details. This makes our system slow.
So we added usertab fields such as firstname, lastname, gender and image to the other tables, in addition to the userid field.
Data retrieval became fast, but we now face another problem. If anything changes in the usertab table, such as firstname, lastname, gender or image, we need to update every other table that contains user details. Given the huge amount of data in those tables, how can I handle this?
We are using a Lucene index and C#.
Recommended answer
Cassandra writes are significantly faster and more efficient than reads.
That is why Cassandra data models favor denormalization over normalization.
Denormalization means designing the data model so that a given query can be served from a single row with a single query. Instead of doing multiple reads from multiple tables and rows to gather all the data required for a response, modify your application logic to insert the required data into every row that might need it in the future. This way, all required data is available in just one read, which avoids multiple lookups.
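As a rough illustration of the idea above, the sketch below copies the user's name into a row of another table at write time, so that a later read of that table needs no second lookup in usertab. The table comments_by_post, its columns, and the class and method names are hypothetical and not taken from the question; the code assumes the DataStax C# driver (the Cassandra namespace) and an already connected ISession.

```csharp
using System;
using System.Threading.Tasks;
using Cassandra;

// Hypothetical table, for illustration only:
//   comments_by_post(postid uuid, commentid uuid, userid uuid,
//                    firstname text, lastname text, body text,
//                    PRIMARY KEY (postid, commentid))
public sealed class CommentWriter
{
    private readonly ISession _session;
    private readonly PreparedStatement _insert;

    public CommentWriter(ISession session)
    {
        _session = session;
        // Prepare once and reuse for every insert.
        _insert = session.Prepare(
            "INSERT INTO comments_by_post " +
            "(postid, commentid, userid, firstname, lastname, body) " +
            "VALUES (?, ?, ?, ?, ?, ?)");
    }

    // Stores the comment together with a copy of the author's name,
    // so reading the comments of a post never touches usertab.
    public Task AddCommentAsync(Guid postId, Guid userId,
                                string firstName, string lastName, string body)
    {
        var statement = _insert.Bind(
            postId, Guid.NewGuid(), userId, firstName, lastName, body);
        return _session.ExecuteAsync(statement);
    }
}
```

The cost of this layout is exactly the one raised in the question: every copy of firstname and lastname has to be rewritten when the user changes them.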
When executing multiple updates, you can use ExecuteAsync.
Session allows asynchronous execution of statements (for any type of statement: simple, bound or batch) by exposing the ExecuteAsync method.
//Execute a statement asynchronously using await
var rs = await session.ExecuteAsync(statement);
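For a large number of denormalized rows, one possible way to apply ExecuteAsync is to issue the updates concurrently while capping how many are in flight at once. The sketch below is only an assumption about how that might look: the comments_by_post table, its key columns, the rowKeys input (the primary keys of the rows that hold a copy of the user's name, which the application has to track or look up itself) and the maxInFlight default of 64 are all hypothetical. Cassandra updates need the full primary key of each row, which is why the keys are passed in.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Cassandra;

public static class UserNameFanOut
{
    // Rewrites the copied name in every listed row of the hypothetical
    // comments_by_post table, with at most maxInFlight requests pending.
    public static async Task UpdateNameAsync(
        ISession session,
        IEnumerable<(Guid PostId, Guid CommentId)> rowKeys,
        string firstName,
        string lastName,
        int maxInFlight = 64)
    {
        var update = await session.PrepareAsync(
            "UPDATE comments_by_post SET firstname = ?, lastname = ? " +
            "WHERE postid = ? AND commentid = ?");

        using var gate = new SemaphoreSlim(maxInFlight);
        var pending = new List<Task>();

        foreach (var key in rowKeys)
        {
            // Wait for a free slot before starting the next update.
            await gate.WaitAsync();
            var statement = update.Bind(
                firstName, lastName, key.PostId, key.CommentId);
            pending.Add(RunOneAsync(session, statement, gate));
        }

        // Surfaces the first failure, if any, after all updates finish.
        await Task.WhenAll(pending);
    }

    private static async Task RunOneAsync(
        ISession session, IStatement statement, SemaphoreSlim gate)
    {
        try
        {
            await session.ExecuteAsync(statement);
        }
        finally
        {
            gate.Release();
        }
    }
}
```

The cap keeps a single profile change from flooding the cluster with tens of thousands of simultaneous requests; a suitable value depends on the cluster and would need to be measured.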
Source: https://www.hakkalabs.co/articles/cassandra-data-modeling-guide