DocumentDB Change Feed - How to see all changes to a document


Question

This new Change Feed feature provided by DocumentDB is pretty cool. However, the documentation states:

Each change to a document appears only once in the change feed. Only the most recent change for a given document is included in the change log. Intermediate changes may not be available.

Basically, if a document goes from revision A -> B -> C, polling the change feed returns only "C". I have a situation where I want to see "A" and "B" as well.

I know of a few existing patterns to solve this, but I was really hoping to leverage this new Change Feed feature. I hoped it would return A, B, and C.

Is the intent of this feature to have "workers" polling the service very frequently? Obviously, the more frequently workers poll, the less likely they are to skip a revision to a document. However, I wouldn't want to adversely affect the performance of the collection as a result.

Recommended answer

DocumentDB team member here. I'll start off by saying: please propose/vote for support for all versions/generations of the document here: http://feedback.azure.com/forums/263030-documentdb

Change Feed supports only the latest version for two reasons:

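  1. Many problems, such as data sync and stream processing, rely on the latest version and do not need the intermediate versions.
  2. The upside of this approach is that no additional storage is required to keep all versions, nor does the change feed need to be bounded to a fixed time window.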

You mentioned that you're already aware of workarounds, but I'll state this for the benefit of others: this problem can be solved by inverting what's stored in DocumentDB. That is, you can store all versions in DocumentDB by creating a new document for each one, then consolidate them via the change feed by upserting the latest version.
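The workaround can be sketched with a toy in-memory model. This is only an illustration under stated assumptions: `ToyCollection`, `read_change_feed`, `record_revision`, and `consolidate` are hypothetical names, not DocumentDB SDK calls, and the feed model (one entry per document, replaced on every write) is a simplification of the latest-version-only behavior quoted in the question.

```python
from itertools import count


class ToyCollection:
    """In-memory stand-in for a collection with a latest-version-only change feed."""

    def __init__(self):
        self.docs = {}             # id -> document body
        self._lsn_of = {}          # id -> logical sequence number of the last write
        self._next_lsn = count(1)

    def upsert(self, doc):
        # Writing a document replaces its single feed entry with a newer one.
        self.docs[doc["id"]] = dict(doc)
        self._lsn_of[doc["id"]] = next(self._next_lsn)

    def read_change_feed(self, after_lsn=0):
        # Return documents changed since after_lsn (in commit order) and a new continuation.
        changed = sorted(
            (lsn, doc_id) for doc_id, lsn in self._lsn_of.items() if lsn > after_lsn
        )
        docs = [self.docs[doc_id] for _, doc_id in changed]
        continuation = changed[-1][0] if changed else after_lsn
        return docs, continuation


def record_revision(versions, entity_id, revision, body):
    # Each revision is a NEW document, so it gets its own feed entry.
    versions.upsert({"id": f"{entity_id}:{revision}", "entityId": entity_id,
                     "revision": revision, "body": body})


def consolidate(versions, latest, continuation=0):
    # Replay the versions feed and upsert each revision over the entity's
    # single "latest" document; the last one replayed wins.
    docs, continuation = versions.read_change_feed(continuation)
    for doc in docs:
        latest.upsert({"id": doc["entityId"], "revision": doc["revision"],
                       "body": doc["body"]})
    return docs, continuation


# Updating one document in place: only the last revision is visible.
in_place = ToyCollection()
for body in ("A", "B", "C"):
    in_place.upsert({"id": "doc1", "body": body})
seen_in_place = [d["body"] for d in in_place.read_change_feed()[0]]

# One document per revision: every revision is visible to the feed reader.
versions, latest = ToyCollection(), ToyCollection()
for revision, body in enumerate(("A", "B", "C"), start=1):
    record_revision(versions, "doc1", revision, body)
seen, token = consolidate(versions, latest)
seen_versions = [d["body"] for d in seen]

print(seen_in_place)
print(seen_versions)
print(latest.docs["doc1"]["body"])
```

In this model the reader of the versions collection sees every revision regardless of how often it polls, at the cost of storing one document per revision, which is the storage trade-off mentioned in the second reason above.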

To answer the question in the comments: you should absolutely use Change Feed rather than querying by timestamp, for the following reasons:

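  1. Change Feed is more efficient. Querying with "order by timestamp" across a distributed dataset performs a global sort, whereas Change Feed sorts locally by timestamp within each partition. There is also no query parsing overhead.
  2. Clock time is less meaningful in a distributed system because of clock skew, and it may be important to distinguish between multiple updates within the same second/millisecond. Instead, you need a "logical time" that represents the exact order of commits in the database. With Change Feed, updates within a partition key arrive in the exact order of commit, and all documents updated within a transaction carry the same logical timestamp.
  3. Unlike a query, Change Feed can be consumed in a distributed manner across multiple workers. This is great when working with downstream scalable computation frameworks such as Apache Storm or Azure Functions.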
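The second and third points can be illustrated with a small sketch. Everything here is hypothetical: the `Commit` record and its `lsn` and `wall_clock_ms` fields are illustrative names rather than the service's actual document properties, and the 4 ms clock skew is invented for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    partition_key: str
    lsn: int            # logical sequence number assigned at commit, per partition
    wall_clock_ms: int  # timestamp reported by the node that handled the write
    body: str


# Hypothetical commits, listed in true commit order within each partition.
# The second write to "user-1" was handled by a node whose clock runs behind.
commits = [
    Commit("user-1", lsn=1, wall_clock_ms=1_000, body="A"),
    Commit("user-1", lsn=2, wall_clock_ms=996, body="B"),
    Commit("user-1", lsn=3, wall_clock_ms=1_001, body="C"),
    Commit("user-2", lsn=1, wall_clock_ms=1_000, body="X"),
]

# A global sort by wall-clock time misorders A and B.
by_clock = [c.body for c in sorted(commits, key=lambda c: c.wall_clock_ms)
            if c.partition_key == "user-1"]

# Grouping by partition and sorting by logical time recovers commit order,
# and each partition's list could be handed to a different worker.
per_partition = defaultdict(list)
for c in commits:
    per_partition[c.partition_key].append(c)
by_lsn = {key: [c.body for c in sorted(group, key=lambda c: c.lsn)]
          for key, group in per_partition.items()}

print(by_clock)
print(by_lsn)
```

Sorting by the skewed clock yields B before A, while the per-partition logical order stays A, B, C. No cross-partition sort is needed, which is also why the work can be split across workers.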
