How to execute a batch statement and an LWT as a transaction in Cassandra


Problem description

I have two tables with the following model:

CREATE TABLE IF NOT EXISTS INV (
  CODE TEXT,
  PRODUCT_CODE TEXT,
  LOCATION_NUMBER TEXT,
  QUANTITY DECIMAL,
  CHECK_INDICATOR BOOLEAN,
  VERSION BIGINT,
PRIMARY KEY ((LOCATION_NUMBER, PRODUCT_CODE)));

CREATE TABLE IF NOT EXISTS LOOK_INV (
  LOCATION_NUMBER TEXT,
  CHECK_INDICATOR BOOLEAN,
  PRODUCT_CODE TEXT,
  CHECK_INDICATOR_DDTM TIMESTAMP,
PRIMARY KEY ((LOCATION_NUMBER), CHECK_INDICATOR, PRODUCT_CODE))
WITH CLUSTERING ORDER BY (CHECK_INDICATOR ASC, PRODUCT_CODE ASC);

I have a business operation in which I need to update CHECK_INDICATOR in both tables and QUANTITY in the INV table. Because CHECK_INDICATOR is part of the key in the LOOK_INV table, I need to delete the row first and insert a new one. Below are the three operations I need to perform as a batch (either all of them succeed or none of them is applied):

  1. Delete row from LOOK_INV table.
  2. Insert row in LOOK_INV table.
  3. Update QUANTITY and CHECK_INDICATOR in INV table.

Because the INV table is accessed by multiple threads, I need to make sure, before updating an INV row, that it has not changed since I last read it. I am using an LWT to update the INV table based on the VERSION column, and a batch operation for the deletion and insertion in the LOOK_INV table. I would like to put all three operations in one batch, but since the LWT is not accepted in the batch, I have to execute them in the manner described above.
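For reference, the two-step approach described above might look roughly like this in CQL. This is only a sketch: the literal values (location 'L1', product 'P1', quantity 42.0, version going from 7 to 8) are made up for illustration and do not come from the original post.

```cql
-- Step 1: batch on LOOK_INV. Both statements target the same partition
-- (LOCATION_NUMBER = 'L1'); the row moves from CHECK_INDICATOR = false to true.
BEGIN BATCH
  DELETE FROM LOOK_INV
   WHERE LOCATION_NUMBER = 'L1'
     AND CHECK_INDICATOR = false
     AND PRODUCT_CODE = 'P1';
  INSERT INTO LOOK_INV (LOCATION_NUMBER, CHECK_INDICATOR, PRODUCT_CODE, CHECK_INDICATOR_DDTM)
  VALUES ('L1', true, 'P1', toTimestamp(now()));
APPLY BATCH;

-- Step 2: LWT on INV, executed separately. The IF clause is the optimistic
-- check: the update applies only if VERSION is still the value last read (7).
UPDATE INV
   SET QUANTITY = 42.0, CHECK_INDICATOR = true, VERSION = 8
 WHERE LOCATION_NUMBER = 'L1' AND PRODUCT_CODE = 'P1'
    IF VERSION = 7;
```

The conditional UPDATE returns an `[applied]` column; when it is false, the row was changed by someone else since the read and the whole operation has to be re-evaluated.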

The problem with this approach is that in some scenarios the batch executes successfully but the update to the INV table results in a timeout exception, and the data in the two tables becomes inconsistent.

Does Cassandra provide any feature to handle this type of scenario elegantly?

Solution

Caution with Lightweight Transactions (LWT)

Lightweight Transactions are currently considered a Cassandra anti-pattern because of performance problems like the ones you are seeing.

Here is a bit of context to explain.

Cassandra does not use RDBMS-style ACID transactions with rollback or locking mechanisms. It does not provide locking because of a fundamental constraint on distributed data stores known as the CAP theorem, which states that a distributed computer system cannot simultaneously provide all three of the following guarantees:

  • Consistency (all nodes see the same data at the same time)
  • Availability (a guarantee that every request receives a response about whether it was successful or failed)
  • Partition tolerance (the system continues to operate despite arbitrary message loss or failure of part of the system)

Because of this, Cassandra is not good for atomic operations and you should not use Cassandra for this purpose.

It does provide lightweight transactions, which can replace locking in some cases. But because the Paxos protocol (the basis for LWTs) involves a series of steps between nodes, there are multiple round trips between the node that proposes an LWT and the other replicas that take part in the transaction.

This has an adverse impact on performance and is one reason for the WriteTimeoutException error. In this situation you cannot know whether the LWT operation has been applied, so you need to retry it in order to fall back to a stable state. Because LWTs are so expensive, the driver will not automatically retry the operation for you.
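One way to act on this after a WriteTimeoutException is sketched below: read the row back and re-issue the conditional update only if it has not taken effect. The values are a hypothetical example (location 'L1', product 'P1', version going from 7 to 8) and are not from the original post. The read is assumed to run at SERIAL (or LOCAL_SERIAL) consistency, which makes Cassandra complete any in-progress Paxos round before returning the row.

```cql
-- Read the current state; run this at SERIAL / LOCAL_SERIAL consistency
-- so that a half-finished LWT is resolved before the row is returned.
SELECT QUANTITY, CHECK_INDICATOR, VERSION
  FROM INV
 WHERE LOCATION_NUMBER = 'L1' AND PRODUCT_CODE = 'P1';

-- If VERSION is still 7, the timed-out update was not applied: send the
-- same statement again. The IF guard keeps it from being applied twice.
UPDATE INV
   SET QUANTITY = 42.0, CHECK_INDICATOR = true, VERSION = 8
 WHERE LOCATION_NUMBER = 'L1' AND PRODUCT_CODE = 'P1'
    IF VERSION = 7;
```

If the read already shows VERSION = 8, it is worth comparing QUANTITY and CHECK_INDICATOR as well, since the version number alone cannot tell your own write apart from another thread's.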

LWTs come with big performance penalties if used frequently, and we see some clients with serious timeout issues caused by LWTs.

Lightweight transactions are generally a bad idea and should be used infrequently.
