Amazon EBS, snapshots as incremental backups

Question

I'm working on an automated mechanism for our EBS volumes to be backed up on a daily basis.

I know the steps to create a new snapshot quite well. Apparently it's all quite simple: you have an EBS volume which you can snapshot, and you can restore the snapshot anytime. Fine.

But my concern is the size of the snapshots. I know these snapshots are stored with compression in S3, and we're going to be charged depending on their size. If we have large amounts of data, we'll see a significant increase in the invoice for each backup we make.

However, according to Amazon's pages, these snapshots are incremental. That would solve my problem, as the daily backup would only upload the data that has changed since the last snapshot. But this leads me to the next question: if the backup is incremental and we're only uploading the modified data, where is the original data being stored? (i.e. the first snapshot, which obviously couldn't have been done incrementally...)

Unfortunately, I haven't been able to find this information anywhere in Amazon's documentation.

Does anybody have experience with snapshots and their billing?

Thanks for your help!

Recommended answer

I don't think that you'll find detailed documentation as to how the snapshots are implemented; it's not something I have come across. They do have documentation for "Projecting Costs". However, I think if you know how it works, you can intuit the bill, and feel more at ease with it.

Note that these snapshots are not "incremental" in the way we may have come to understand that term in the DOS operating system. In DOS, the "archive" bit was set when a file was modified, and an "incremental" backup copied only the files that had their "archive" bit set. The backup process would clear the archive attribute, so a future edit to the file would cause it to be backed up "incrementally" once again.

With snapshots, each block of the volume is flagged if it is modified. It's not done on a file-by-file basis. After the first snapshot, only blocks that have been flagged as modified are backed up, just like "incremental" backups in DOS. But that's where the similarities end: for each block that it doesn't have to copy, the snapshot doesn't just skip it, it writes a pointer to where the last (unchanged) copy of the data is.

When you make the first snapshot of a volume, the data is broken up into blocks. From Amazon: "Volume data is broken up into chunks before being transferred to Amazon S3. While the size of the chunks could change through future optimizations, the number [...] can be estimated by dividing the size of the data that has changed since the last snapshot by 4MB."
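To make that estimate concrete, here is a minimal sketch of the division Amazon describes. The function name `estimate_chunks` is made up for illustration, and two things are my assumptions rather than anything Amazon states: that "4MB" means 4 × 1024 × 1024 bytes, and that a partially filled chunk is rounded up to a whole one.

```python
import math

# Assumption: "4MB" is read as 4 * 1024 * 1024 bytes. Amazon notes the
# chunk size could change through future optimizations.
CHUNK_SIZE_BYTES = 4 * 1024 * 1024


def estimate_chunks(changed_bytes: int, chunk_size: int = CHUNK_SIZE_BYTES) -> int:
    """Estimate how many chunks a snapshot uploads for `changed_bytes` of changes.

    Rounding up is an assumption: a partially filled chunk is counted as one.
    """
    if changed_bytes < 0:
        raise ValueError("changed_bytes must be non-negative")
    return math.ceil(changed_bytes / chunk_size)


# Example: 1 GiB of data changed since the last snapshot.
print(estimate_chunks(1024 ** 3))  # 256
```

This only approximates the number of chunks transferred; it says nothing about compression or the final billed size.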

The next snapshot you make consists of data for only those blocks that have changed, and pointers to the blocks that haven't changed. Those pointers point to blocks of data in the previous snapshot.

The next snapshot (n) is made by recording the data of each block changed since the previous snapshot (n-1), along with pointers for the blocks that haven't changed since the previous snapshot (n-1). These pointers point to the corresponding blocks in the previous snapshot, which may contain data or another pointer to its own previous snapshot. Eventually, every pointer ends up at a block of real data (one that hasn't changed since that snapshot was created).
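The pointer chain described above can be sketched as a toy model. This is only an illustration of the mental model in this answer, not Amazon's actual implementation, which isn't documented in detail. The `Snapshot` class, `take_snapshot`, and `read_block` are hypothetical names, and a "pointer" is modeled simply as the absence of a block in a snapshot, meaning "look in the parent".

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Snapshot:
    """Toy snapshot: stores data only for changed blocks (hypothetical model)."""
    name: str
    parent: Optional["Snapshot"] = None
    # block index -> data actually stored in this snapshot;
    # a missing index acts as a pointer to the parent snapshot.
    data: Dict[int, bytes] = field(default_factory=dict)

    def read_block(self, index: int) -> bytes:
        """Follow pointers back through the chain until real data is found."""
        snap: Optional[Snapshot] = self
        while snap is not None:
            if index in snap.data:
                return snap.data[index]
            snap = snap.parent
        raise KeyError(index)


def take_snapshot(name: str, volume: Dict[int, bytes], dirty: Set[int],
                  parent: Optional[Snapshot]) -> Snapshot:
    """First snapshot copies every block; later ones copy only flagged blocks."""
    if parent is None:
        stored = dict(volume)
    else:
        stored = {i: volume[i] for i in dirty}
    return Snapshot(name, parent, stored)


volume = {0: b"A", 1: b"B", 2: b"C"}
s1 = take_snapshot("s1", volume, set(), None)   # full copy
volume[1] = b"B2"
s2 = take_snapshot("s2", volume, {1}, s1)       # stores only block 1
volume[2] = b"C3"
s3 = take_snapshot("s3", volume, {2}, s2)       # stores only block 2

print(len(s3.data))       # 1
print(s3.read_block(0))   # b'A'  (found two snapshots back, in s1)
print(s3.read_block(1))   # b'B2' (found in s2)
```

Each later snapshot holds very little data of its own, yet any of them can still reproduce the whole volume by following the chain.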

Now let's say you decide to delete snapshot (x). Snapshot (x) has a snapshot made before it (x-1) and one made after it (x+1). Amazon replaces the pointers in snapshot (x+1) with the corresponding pointers and data from snapshot (x) (the one being deleted). As a result, any actual data in snapshot (x) is copied to snapshot (x+1), unless snapshot (x+1) already has its own copy of more recent data for that block.
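The deletion step can be sketched the same way. Again, this is a toy model of the behavior described in this answer under my own assumptions, not Amazon's real mechanism: the chain is a plain list ordered oldest to newest, each snapshot is a dict of the blocks it actually stores, and `read_block` and `delete_snapshot` are hypothetical helpers.

```python
from typing import Dict, List

# Toy model: a snapshot is a dict of block index -> data it actually stores.
# A block missing from the dict acts as a pointer to the previous snapshot.
Snapshot = Dict[int, bytes]


def read_block(chain: List[Snapshot], pos: int, index: int) -> bytes:
    """Read a block as seen by snapshot `pos`, walking back through the chain."""
    for snap in reversed(chain[:pos + 1]):
        if index in snap:
            return snap[index]
    raise KeyError(index)


def delete_snapshot(chain: List[Snapshot], pos: int) -> None:
    """Delete snapshot `pos`, first handing its data to the next snapshot."""
    removed = chain[pos]
    if pos + 1 < len(chain):
        successor = chain[pos + 1]
        for index, data in removed.items():
            # Copy only where the successor has no newer data for that block.
            successor.setdefault(index, data)
    del chain[pos]


chain = [
    {0: b"A", 1: b"B"},   # snapshot x-1: full copy
    {1: b"B2"},           # snapshot x: only block 1 changed
    {0: b"A3"},           # snapshot x+1: only block 0 changed
]

delete_snapshot(chain, 1)            # delete snapshot x
print(chain[1])                      # {0: b'A3', 1: b'B2'}
print(read_block(chain, 1, 1))       # b'B2'
```

The snapshot that used to be (x+1) still restores to exactly the same volume contents after the deletion; it just stores one more block than before.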

This is how snapshots work, where the data is stored, and why the size of the snapshots is manageable. You can see from this that deleting a snapshot destroys only your ability to bring back the volume as it was at the point in time when that snapshot was created, without destroying the ability to use your other snapshots. Unlike simple, traditional "incremental" backups that don't use pointers, the snapshots that remain are updated as needed to stay useful when a snapshot they depend on is deleted. This is why it makes sense that Amazon charges more for intelligent snapshot storage than for simple copies of EBS volumes. Finally, it's understandable that it's difficult to predict how much snapshot storage is going to cost, since it is so dynamic.
