arangodump: How do I know the latest "revision"?

Question

I'm manually parsing and importing data from an arangodump, which contains a record for every revision of every document. The problem is that I cannot tell which record is the latest revision.

(This is also a problem for deleted documents, where the arangodump contains records that have a revision but an empty document.)

The docs say:

Clients can use revisions ids to perform simple equality/non-equality comparisons (e.g. to check whether a document has changed or not), but they should not use revision ids to perform greater/less than comparisons with them to check if a document revision is older than one another, even if this might work for some cases.

The docs don't give me much hope. Is this even possible?

If not, what is the proper way to manually import an arangodump into a different application?

Recommended answer

ArangoDump is intended to give you a snapshot of the existing database as fast as possible. It therefore doesn't give you the contents at the collection level, but rather what is on disk. As @CoDEmanX noted, ArangoExport will give you the collection-level contents, at the cost of more resource usage on the database server.

To explain why you get older versions of documents, we have to take a deeper look at the database itself.

An insert into the database creates a new document with a _key. When you later replace it, e.g. with UPDATE, what actually happens is that an invisible document (a marker) is written whose job is to remove the old version. After that, a new version of the document is created.
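To make the insert/update mechanics above concrete, here is a toy model in Python. It is only a sketch under my own assumptions: the entry shape `(kind, key, body)` and the function names are hypothetical and are not the real ArangoDB on-disk or dump format. It shows why an append-only log holds several versions of one document, and that replaying the entries in the order they were written yields the current state.

```python
from typing import Any, Optional

# Hypothetical toy log entry: (kind, key, body). Not the real ArangoDB format.
Entry = tuple  # ("document" | "remove", key, body or None)


def insert(log: list, key: str, body: dict) -> None:
    """An insert appends a new document entry."""
    log.append(("document", key, body))


def update(log: list, key: str, body: dict) -> None:
    """An update appends a removal marker for the old version, then the new version."""
    log.append(("remove", key, None))
    log.append(("document", key, body))


def delete(log: list, key: str) -> None:
    """A delete appends only a removal marker."""
    log.append(("remove", key, None))


def replay(log: list) -> dict:
    """Rebuild the current state by applying entries in the order they were written."""
    state: dict = {}
    for kind, key, body in log:
        if kind == "remove":
            state.pop(key, None)
        else:
            state[key] = body
    return state


log: list = []
insert(log, "a", {"v": 1})
update(log, "a", {"v": 2})
insert(log, "b", {"v": 1})
delete(log, "b")

current = replay(log)
print(len(log))  # 5 entries are in the log
print(current)   # {'a': {'v': 2}}
```

In this toy model the write order, not the revision id, decides which version is current. Whether the records in a real dump file can be relied on to appear in write order is an assumption you would need to verify for your ArangoDB version.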

This is all done linearly, which gives you a write-ahead log (WAL). The log is written in linear fashion, but only some of its content is guaranteed to have been synced to disk. Once a transaction demands that a document be sealed, execution is paused until the kernel replies that it can ensure this stage has been synchronized to storage.
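The "pause until the kernel confirms the sync" step is the same idea as calling fsync on an append-only file. Below is a minimal Python sketch of that general technique. It is not ArangoDB's implementation, and the file name and the `append_durably` helper are made up for illustration.

```python
import os
import tempfile


def append_durably(path: str, record: bytes) -> None:
    """Append one record and block until the OS reports it has reached storage."""
    with open(path, "ab") as f:
        f.write(record + b"\n")
        f.flush()             # hand Python's user-space buffer to the OS
        os.fsync(f.fileno())  # wait for the kernel to confirm the data was written out


# Hypothetical log file in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "toy.wal")
append_durably(path, b'{"key": "a", "v": 1}')
append_durably(path, b'{"key": "a", "v": 2}')

with open(path, "rb") as f:
    lines = f.read().splitlines()
```

Syncing after every record is the slowest possible choice. Syncing only when a caller demands it, as described above, is what lets a log keep high throughput.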

So much for the path to disk. It is implemented this way to give you maximum throughput while still guaranteeing that certain things have been written (and are not stuck somewhere in disk caches, etc.).

A later job will try to clean everything up and tie up loose ends. This is called the 'collection'. It collects documents from the WAL and stores them in permanent database files. It also tries to combine delete markers with existing documents, so that both finally disappear.

So once the collection has run, deleted documents and their delete markers will actually disappear. Multiple database files may be combined into one database file if their size falls below a certain threshold. It may even happen that some delete markers find their documents only after such a merge.
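The cleanup described above can be sketched as a toy compaction step in Python. As before, this is an illustration under my own assumptions, not ArangoDB's code: the entry shape `(kind, key, body)` and the `compact` function are hypothetical. A removal marker cancels an earlier document with the same key. A marker that finds no document in the same file is kept, because its document may live in another file.

```python
def compact(entries: list) -> list:
    """Cancel documents against later removal markers; keep unmatched markers."""
    live: dict = {}     # key -> position of the surviving document entry
    orphans: dict = {}  # key -> position of a marker with no document in `entries`
    for pos, (kind, key, _body) in enumerate(entries):
        if kind == "remove":
            if key in live:
                del live[key]       # marker and document disappear together
            else:
                orphans[key] = pos  # keep it; the document may be in another file
        else:
            live[key] = pos         # a newer version supersedes an older one
    keep = sorted(list(live.values()) + list(orphans.values()))
    return [entries[i] for i in keep]


# Two hypothetical database files; the markers in file_2 refer to documents in file_1.
file_1 = [("document", "a", {"v": 1}), ("document", "b", {"v": 1})]
file_2 = [("remove", "a", None), ("document", "a", {"v": 2}), ("remove", "b", None)]

alone = compact(file_2)            # markers find no documents, nothing can be dropped
merged = compact(file_1 + file_2)  # after merging, markers meet their documents
print(merged)  # [('document', 'a', {'v': 2})]
```

Compacting `file_2` on its own changes nothing, while compacting the merged files leaves only the latest version of "a". That mirrors the point that some delete markers find their documents only after files are combined.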
