Can you delete in a replication-based distributed database?


Question

Thus far I have been under the impression that you cannot truly delete a row in a replication-based distributed database. It all works well in a copy-based one. But with replication you mark rows as "consider this deleted" and filter them out in every single query. You never actually delete anything from the DB. I think it is time to verify whether that assumption is true.

My understanding is that you would run into a race condition with the replication if there were ever a key collision. It goes something like this:

Database A: adds an entry under key 11 (11A)

Database B: adds an entry under key 11 (11B)

Database A: deletes the entry under key 11

Now it depends on the order in which these three operations "meet" in the wild. The expected order would be:


  • 11A is created
  • 11 is deleted (i.e. 11A)
  • 11B is created

But what if this happens instead?


  • 11A is created
  • 11B is created (fails, key 11 already exists)
  • 11 is deleted

Or even worse?


  • 11B is created
  • 11A is created (fails, key 11 already exists)
  • 11 is deleted (this hits 11B)
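The orderings above can be replayed with a small simulation. This is only a sketch: the `apply` function and the in-memory dict are hypothetical stand-ins for a single replica that rejects duplicate keys, not the behavior of any particular database.

```python
def apply(ops):
    """Apply operations in the given order to a naive key-value store.

    Assumption: a create on an existing key fails, and a delete removes
    whatever entry currently sits under the key.
    """
    store = {}
    log = []
    for kind, key, value in ops:
        if kind == "create":
            if key in store:
                log.append(f"create {value} failed")
            else:
                store[key] = value
                log.append(f"create {value} ok")
        elif kind == "delete":
            removed = store.pop(key, None)
            log.append(f"delete hit {removed}")
    return store, log


create_a = ("create", 11, "11A")
create_b = ("create", 11, "11B")
delete_a = ("delete", 11, None)

expected = apply([create_a, delete_a, create_b])  # 11B survives
lost_b = apply([create_a, create_b, delete_a])    # 11B is rejected, store ends empty
worse = apply([create_b, create_a, delete_a])     # the delete removes 11B instead of 11A

for name, (store, log) in [("expected", expected), ("lost_b", lost_b), ("worse", worse)]:
    print(name, store, log)
```

The same three operations leave the store holding 11B in the first ordering and empty in the other two, and in the last ordering the delete removes an entry it was never meant for.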

Recommended answer

I'll assume that we are talking about a leaderless distributed database, that is, one where all nodes play the same role (there is no master), so both reads and writes can be served by any node. Otherwise, if there is a single master, it can impose a specific ordering on all writes and deletes and thus resolve the concurrency problem you are describing.


But in Replication you mark them as "consider this delete" and filter them out in every last query.

That's right, and it is done for two main reasons:


  • correctness: if items were physically deleted instead of tombstoned, there could be an ambiguous situation in which two nodes are consulted and node A has the item but node B does not. The system as a whole cannot tell whether the item was deleted (but the delete failed on A) or whether it was recently created (but the create failed on B). With tombstones, the two cases can be told apart.
  • performance: most of these systems do not perform in-place updates (as RDBMS databases usually do), but instead perform append-only operations. This is done to improve performance, since random access operations on disk are much slower than sequential ones. Performing deletes by writing tombstones fits this approach well.
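The correctness argument can be illustrated with a small read-reconciliation sketch. It assumes last-write-wins resolution by timestamp, which is one common strategy but not the only one; the `Version`, `reconcile` and `read` names are made up for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Version:
    timestamp: int           # assumption: comparable write timestamps
    value: Optional[str]
    tombstone: bool = False  # True means "this key was deleted at `timestamp`"


def reconcile(a: Optional[Version], b: Optional[Version]) -> Optional[Version]:
    """Pick the newest version seen by either replica (last write wins).

    None means the replica has never seen the key at all.
    """
    candidates = [v for v in (a, b) if v is not None]
    if not candidates:
        return None
    return max(candidates, key=lambda v: v.timestamp)


def read(a: Optional[Version], b: Optional[Version]) -> Optional[str]:
    """Return the visible value, hiding keys whose newest version is a tombstone."""
    winner = reconcile(a, b)
    if winner is None or winner.tombstone:
        return None
    return winner.value


item = Version(timestamp=1, value="row-11")
tomb = Version(timestamp=2, value=None, tombstone=True)

# Node B never saw the key: the item was created and has not reached B yet.
print(read(item, None))
# Node B holds a newer tombstone: the item was deleted and A is simply behind.
print(read(item, tomb))
```

Without the tombstone, both calls would look identical to the reader (A has the item, B has nothing), which is exactly the ambiguity described above.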

But you do not ever actually delete something from the DB.

That is not necessarily true. Usually, the tombstones are eventually removed from the database (in a garbage-collection fashion). "Eventually" here means that they are removed once the system can be sure that the ambiguity described above can no longer occur for these items (because the deletes have propagated to all the nodes).
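As a rough sketch of that garbage-collection rule, the function below keeps a tombstone until every replica is known to have seen it. The acknowledgement bookkeeping (`acked_by`) is an assumption made for illustration; real systems often approximate "propagated everywhere" differently, for example with a configurable grace period.

```python
def purge_tombstones(tombstones, acked_by, all_replicas):
    """Return the tombstones that must still be kept.

    tombstones:   key -> deletion timestamp
    acked_by:     key -> set of replica names known to hold the tombstone
    all_replicas: set of all replica names
    A tombstone is dropped only when every replica has acknowledged it.
    """
    kept = {}
    for key, deleted_at in tombstones.items():
        if acked_by.get(key, set()) >= all_replicas:
            continue  # safe to purge: no replica can resurrect the item
        kept[key] = deleted_at
    return kept


replicas = {"A", "B", "C"}
tombstones = {11: 5, 12: 7}
acked = {11: {"A", "B", "C"}, 12: {"A", "B"}}

remaining = purge_tombstones(tombstones, acked, replicas)
print(remaining)  # key 12 stays because replica C has not seen the delete yet
```

Purging key 12 early would let replica C, which still holds the old value, reintroduce the item on the next repair or read reconciliation.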


My understanding is that you would run into a Race Condition with the Replication if there was ever a key collision

That's right for most distributed systems of that kind. The result will depend on the order in which the operations reach the database. However, some of these databases provide alternative mechanisms, such as conditional writes and deletes. With these, you can delete only a specific version of an item, or update an item only if its version is a specific one (thus aborting the update if someone else updated it in the meantime). Examples of such operations in Cassandra are conditional deletes and the so-called lightweight transactions.
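The idea behind a conditional delete can be sketched with a tiny in-memory store. This is not Cassandra's API: `VersionedStore` and `delete_if_version` are hypothetical names, and a real distributed database needs a coordination protocol between nodes to make the version check atomic, which this single-process sketch gets for free.

```python
import itertools


class VersionedStore:
    """Toy store where every write gets a fresh, never-reused version number."""

    def __init__(self):
        self._items = {}                      # key -> (version, value)
        self._versions = itertools.count(1)   # global counter avoids version reuse

    def put(self, key, value):
        version = next(self._versions)
        self._items[key] = (version, value)
        return version

    def get(self, key):
        return self._items.get(key)

    def delete_if_version(self, key, expected_version):
        """Delete only if the stored version is the one the caller read."""
        current = self._items.get(key)
        if current is None or current[0] != expected_version:
            return False  # someone else changed the item; abort the delete
        del self._items[key]
        return True


store = VersionedStore()
v_a = store.put(11, "11A")   # writer A stores its entry
v_b = store.put(11, "11B")   # writer B overwrites it concurrently

stale = store.delete_if_version(11, v_a)   # A's delete targets the old version
survivor = store.get(11)
fresh = store.delete_if_version(11, v_b)   # deleting the current version works

print(stale, survivor, fresh)
```

With the version check in place, A's delete no longer removes 11B by accident, which addresses the "worse" ordering from the question.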

Below are some references that describe how Riak and Cassandra perform deletes; they also contain a lot of information about tombstones:

  • Riak: Object deletion
  • About deletes and tombstones in Cassandra
