How can I process my payload to insert bulk data into multiple tables with atomicity/consistency in Cassandra?

Question

I have to design a database for customers who hold prices for millions of materials that they acquire through multiple suppliers over the next 24 months. The database will store a daily price for every material supplied by a specific supplier for that period. I have several use cases to cover, so I created multiple tables, each modeled for one use case.

Data will be inserted into these tables regularly in large chunks (say, 1k items at a time). The inserts must keep the tables consistent: the data should be written to all of the tables or to none of them. If that cannot be done, the load should be flagged as a "failure", with nothing inserted, so that further action can be taken. How can I solve this effectively in Cassandra?

One option I can think of is to use small BATCH statements (for example, 1k batches for 1k items). Because the tables have different primary keys, each batch might hit multiple partitions.
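To make the one-batch-per-item idea concrete, here is a minimal sketch in Python that only builds the CQL text. The table names (`prices_by_material`, `prices_by_supplier`), the column names, and the `Price` type are assumptions made up for illustration; they are not from the question. A plain `BEGIN BATCH ... APPLY BATCH` is a logged batch in CQL, which is the form that gives all-or-nothing behaviour across tables.

```python
from datetime import date
from decimal import Decimal
from typing import List, NamedTuple


class Price(NamedTuple):
    """One daily price for one material from one supplier (hypothetical shape)."""
    material_id: str
    supplier_id: str
    day: date
    price: Decimal


# Hypothetical denormalized tables, one per query pattern.
TABLES = ("prices_by_material", "prices_by_supplier")


def cql_literal(value: str) -> str:
    """Quote a text value for CQL, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def logged_batch_for(item: Price) -> str:
    """Build one logged batch that writes the same item to every table."""
    inserts = [
        f"  INSERT INTO {table} (material_id, supplier_id, day, price) "
        f"VALUES ({cql_literal(item.material_id)}, "
        f"{cql_literal(item.supplier_id)}, "
        f"'{item.day.isoformat()}', {item.price});"
        for table in TABLES
    ]
    return "BEGIN BATCH\n" + "\n".join(inserts) + "\nAPPLY BATCH;"


def batches_for(items: List[Price]) -> List[str]:
    """One small batch per item, so 1k items produce 1k batches."""
    return [logged_batch_for(item) for item in items]
```

In a real application the values would normally be bound through the driver's prepared statements rather than formatted into strings; the string form is used here only so the sketch runs without a cluster. Keeping each batch to a single item keeps it small, at the cost of one multi-partition batch per item.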

Any thoughts? Thanks.

Recommended answer

If you are asking with respect to the database (Cassandra), there are many data modeling points to consider. Go through the data modeling documentation and the BATCH reference at the links below:
https://docs.datastax.com/en/dse/6.0/cql/cql/ddl/dataModelingCQLTOC.html
https://docs.datastax.com/en/dse/6.0/cql/cql/cql_reference/cql_commands/cqlBatch.html
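The question also asks for failed loads to be flagged for further action. A rough sketch of that bookkeeping, assuming each item has already been turned into a BATCH statement string and that `execute` is whatever function sends CQL to the cluster (both are placeholders, not a specific driver API):

```python
from typing import Callable, Iterable, List, Tuple


def apply_batches(
    batches: Iterable[Tuple[str, str]],
    execute: Callable[[str], None],
) -> List[Tuple[str, str]]:
    """Run each (item_key, batch_cql) pair and collect the ones that fail.

    Returns a list of (item_key, error_text) so the caller can flag those
    items as "failure" and decide whether to retry them.
    """
    failures: List[Tuple[str, str]] = []
    for item_key, batch_cql in batches:
        try:
            execute(batch_cql)
        except Exception as exc:  # a real driver raises more specific errors
            failures.append((item_key, str(exc)))
    return failures
```

Because a plain INSERT in Cassandra is an upsert, re-running a flagged batch with the same values is safe, which makes "flag and retry" a workable follow-up action.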

Also, depending on the nature of the application, think about which compaction strategy suits a write-heavy or read-heavy workload.
