Locking an S3 object best practice?

Question

I have an S3 bucket containing quite a few S3 objects that multiple EC2 instances can pull from (when scaling horizontally). Each EC2 instance will pull one object at a time, process it, and move it to another bucket.

Currently, to make sure the same object isn't processed by multiple EC2 instances, my Java app renames it by adding a "locked" extension to its S3 object key. The problem is that a "rename" is actually a "move" (a copy followed by a delete), so a large file in the S3 bucket can take up to several minutes to complete its "rename", rendering the locking process ineffective.
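
For illustration, here is roughly what that rename looks like, as a minimal sketch assuming the AWS SDK for Java v2 (the bucket and key names are placeholders). S3 has no rename operation: the object's bytes are copied to the new key and the original key is then deleted, which is why the "rename" takes minutes on large objects:

import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.CopyObjectRequest;
import software.amazon.awssdk.services.s3.model.DeleteObjectRequest;

public class S3RenameAsMove {
    public static void main(String[] args) {
        S3Client s3 = S3Client.create();
        String bucket = "my-work-bucket";       // placeholder
        String key = "input/data-file.csv";     // placeholder
        String lockedKey = key + ".locked";

        // A "rename" is really a full copy of the object's bytes to the new key...
        s3.copyObject(CopyObjectRequest.builder()
                .sourceBucket(bucket).sourceKey(key)
                .destinationBucket(bucket).destinationKey(lockedKey)
                .build());

        // ...followed by a delete of the original. The copy alone can take
        // minutes for large objects, leaving a window where the "lock" isn't set.
        s3.deleteObject(DeleteObjectRequest.builder()
                .bucket(bucket).key(key)
                .build());
    }
}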

Does anyone have a best practice for accomplishing what I'm trying to do?

I considered using SQS, but that "solution" has its own set of problems (order not guaranteed, the possibility of a message being delivered more than once, and more than one EC2 instance getting the same message).

I'm wondering if setting a "locked" header would be a quicker "locking" process.

Answer


order not guaranteed, possibility of messages delivered more than once, and more than one EC2 getting the same message

The odds of actually getting the same message more than once are low. It's merely "possible," but not very likely. If it's essentially only an annoyance when, on isolated occasions, you happen to process a file more than once, then SQS seems like an entirely reasonable option.
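
As a rough illustration of that approach (a sketch assuming the AWS SDK for Java v2; the queue URL and processAndMove are placeholders): the visibility timeout hides an in-flight message from other workers, and deleting the message only after a successful move means a crashed worker's message simply reappears for another worker:

import java.util.List;
import software.amazon.awssdk.services.sqs.SqsClient;
import software.amazon.awssdk.services.sqs.model.DeleteMessageRequest;
import software.amazon.awssdk.services.sqs.model.Message;
import software.amazon.awssdk.services.sqs.model.ReceiveMessageRequest;

public class SqsWorker {
    public static void main(String[] args) {
        SqsClient sqs = SqsClient.create();
        String queueUrl = "https://sqs.us-east-1.amazonaws.com/123456789012/objects-to-process"; // placeholder

        List<Message> messages = sqs.receiveMessage(ReceiveMessageRequest.builder()
                .queueUrl(queueUrl)
                .maxNumberOfMessages(1)
                .waitTimeSeconds(20)      // long polling
                .visibilityTimeout(300)   // hide the message from other workers for 5 minutes
                .build()).messages();

        for (Message m : messages) {
            String objectKey = m.body();  // assumes each message body carries one S3 object key
            processAndMove(objectKey);    // placeholder: pull, process, move to the other bucket

            // Delete only after success; if the worker dies first, the message
            // becomes visible again and another worker picks it up.
            sqs.deleteMessage(DeleteMessageRequest.builder()
                    .queueUrl(queueUrl)
                    .receiptHandle(m.receiptHandle())
                    .build());
        }
    }

    private static void processAndMove(String key) { /* application-specific */ }
}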

Setting a "locked" header on the object has a problem of its own -- when you overwrite an object with a copy of itself (that's what happens when you change the metadata -- a new copy of the object is created, with the same key) then you are subject to the slings and arrows of eventual consistency.

Q: What data consistency model does Amazon S3 employ?

Amazon S3 buckets in all Regions provide read-after-write consistency for PUTS of new objects and eventual consistency for overwrite PUTS and DELETES.

https://aws.amazon.com/s3/faqs/

Updating metadata is an "overwrite PUT." Your new header may not immediately be visible, and if two or more workers set their own unique header (e.g. x-amz-meta-locked: i-12345678) it's entirely possible for a scenario like the following to play out (W1, W2 = Worker #1 and #2):

W1: HEAD object (no lock header seen)
W2: HEAD object (no lock header seen)
W1: set header
W2: set header
W1: HEAD object (sees its own lock header)
W2: HEAD object (sees its own lock header)

The same or a similar failure can occur with several different permutations of timing.
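
For concreteness, the flawed check-then-set sequence might look like this in code (a sketch assuming the AWS SDK for Java v2; the "locked" metadata key corresponds to the x-amz-meta-locked header above). Nothing in it is atomic, so any of the interleavings above can defeat it:

import java.util.Map;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.CopyObjectRequest;
import software.amazon.awssdk.services.s3.model.HeadObjectRequest;
import software.amazon.awssdk.services.s3.model.MetadataDirective;

public class NaiveS3Lock {
    // Returns true if this worker *believes* it holds the lock. The
    // HEAD -> copy -> HEAD sequence is not atomic, so two workers can
    // both pass every step and both conclude they own the object.
    static boolean tryLock(S3Client s3, String bucket, String key, String workerId) {
        Map<String, String> before = s3.headObject(HeadObjectRequest.builder()
                .bucket(bucket).key(key).build()).metadata();
        if (before.containsKey("locked")) {
            return false;   // someone else got there first -- or so we hope
        }

        // Setting metadata means copying the object onto itself (an overwrite
        // PUT), so the new header is only eventually consistent.
        s3.copyObject(CopyObjectRequest.builder()
                .sourceBucket(bucket).sourceKey(key)
                .destinationBucket(bucket).destinationKey(key)
                .metadataDirective(MetadataDirective.REPLACE)
                .metadata(Map.of("locked", workerId))
                .build());

        // Re-read to "verify" -- but each worker may simply see its own header.
        Map<String, String> after = s3.headObject(HeadObjectRequest.builder()
                .bucket(bucket).key(key).build()).metadata();
        return workerId.equals(after.get("locked"));
    }
}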

Objects can't be effectively locked in an eventual consistency environment like this.
