When are rows overwritten in Cassandra


Problem description




My understanding is that a row is overwritten when another row with an identical primary key is inserted.

For example:

I have columns (user_id int, item_id int, site_id int), and my primary key is PRIMARY KEY (user_id, item_id).

If I had the following table:

user_id, item_id, site_id
   2       3        4

and I insert user_id : 2, item_id : 3, site_id : 10, my new table would be:

user_id, item_id, site_id
   2       3        10

not

user_id, item_id, site_id
   2       3        4
   2       3        10

Does this simple case hold in all cases? Are there any subtleties that I am likely not aware of? Also, I could not find this in the docs and came to this conclusion by playing around with Cassandra. Can anyone provide a doc source?

Solution

Yes, this is how Cassandra is designed to operate. In all cases where an UPDATE or INSERT is executed, data will be updated (based on the keys) if it exists, and inserted if it does not. An important point to remember is that, under the hood, UPDATE and INSERT are synonymous. If you think of the two as being the same, you can start to understand why it works the way it does.
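
To make the upsert idea concrete, here is a minimal Python sketch that models the table from the question as a dictionary keyed by the full primary key (user_id, item_id). It is only a toy model of the behavior described above, not how Cassandra stores data; the `upsert` helper and the table name `t` in the comments are assumptions made for illustration.

```python
# Toy model of upsert semantics (NOT Cassandra's real storage engine).
# Each row is identified by its full primary key: (user_id, item_id).
table = {}


def upsert(table, user_id, item_id, **columns):
    """Hypothetical helper modelling both INSERT and UPDATE.

    It sets the given columns for the key, creating the row if needed.
    """
    row = table.setdefault((user_id, item_id), {})
    row.update(columns)


# INSERT INTO t (user_id, item_id, site_id) VALUES (2, 3, 4);
upsert(table, 2, 3, site_id=4)

# INSERT INTO t (user_id, item_id, site_id) VALUES (2, 3, 10);
# Same primary key, so the existing row is overwritten, not duplicated.
upsert(table, 2, 3, site_id=10)

# UPDATE t SET site_id = 7 WHERE user_id = 2 AND item_id = 4;
# No such row existed, so the "update" creates it.
upsert(table, 2, 4, site_id=7)

print(table)  # {(2, 3): {'site_id': 10}, (2, 4): {'site_id': 7}}
```

In this model the second insert leaves a single row for key (2, 3) with site_id 10, which matches the outcome observed in the question.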

That being said, you are correct, in that you do have to look closely to find an explicit reference to this behavior in the documentation. I found the closest references in the docs and listed them below:

From the UPDATE documentation:

The row is created if none existed before, and updated otherwise. Specify the row to update in the WHERE clause by including all columns composing the partition key. ... The UPDATE SET operation is not valid on a primary key field.

From the INSERT documentation:

You do not have to define all columns, except those that make up the key. ... If the column exists, it is updated. The row is created if none exists.

Now while these excerpts may not come right out and say "be careful not to overwrite", I did manage to find an article on Planet Cassandra that was more explicit: How to Do an Upsert in Cassandra

Cassandra is a distributed database that avoids reading before a write, so an INSERT or UPDATE sets the column values you specify regardless of whether the row already exists. This means inserts can update existing rows, and updates can create new rows. It also means it’s easy to accidentally overwrite existing data, so keep that in mind.
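
The phrase "sets the column values you specify" can be illustrated with another small Python sketch. It assumes a hypothetical extra column `note` that is not in the original table, and a hypothetical `write` helper standing in for either INSERT or UPDATE. Again, this is a simplified model of the described behavior, not Cassandra itself.

```python
def write(rows, key, **columns):
    """Hypothetical stand-in for INSERT or UPDATE.

    It applies the write blindly: no existence check and no read first.
    """
    rows.setdefault(key, {}).update(columns)


rows = {}

# First write creates the row with two non-key columns.
# `note` is an assumed extra column, used only for this illustration.
write(rows, (2, 3), site_id=4, note="original")

# Second write names only site_id: that column is silently overwritten,
# while the column that was not named keeps its previous value.
write(rows, (2, 3), site_id=10)

print(rows[(2, 3)])  # {'site_id': 10, 'note': 'original'}
```

The point of the sketch is that nothing warns the writer that site_id already held 4, which is the accidental overwrite the article cautions about.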
