How do I only copy missing objects between buckets using ruby aws-sdk?

Problem description
I wrote a script to copy s3 objects from my production s3 bucket to my development one, but it takes quite a long time to run because I am individually checking each object for existence before copying. Is there a way to diff the two buckets and only copy the objects I need? Or to copy the bucket as a whole?
Here is what I have currently:
count = 0
puts "COPYING FROM #{prod_bucket} to #{dev_bucket}"
bm = Benchmark.measure do
  AWS::S3.new.buckets[prod_bucket].objects.each do |o|
    exists = AWS::S3.new.buckets[dev_bucket].objects[o.key].exists?
    if exists
      puts "Skipping: #{o.key}"
    else
      puts "Copy: #{o.key} (#{count})"
      o.copy_to(o.key, :bucket_name => dev_bucket, :acl => :public_read)
      count += 1
    end
  end
end
puts "Copied #{count} objects in #{bm.real}s"
I've never worked with that gem, but from your code it looks like you can retrieve an array of all the keys stored in a bucket. Load that list for both buckets and determine the missing files with simple array operations. That should be much faster, because it avoids one existence request per object.
# load the file lists (objects are looked up in batches of 1000)
s3 = AWS::S3.new
source_files = s3.buckets[prod_bucket].objects.map(&:key)
target_files = s3.buckets[dev_bucket].objects.map(&:key)

# determine files missing in dev
files_to_copy = source_files - target_files
files_to_copy.each_with_index do |key, i|
  puts "Copying #{i}/#{files_to_copy.size}: #{key}"
  s3.buckets[prod_bucket].objects[key].copy_to(key,
    :bucket_name => dev_bucket, :acl => :public_read)
end

# determine files on dev that no longer exist on prod
files_to_remove = target_files - source_files
files_to_remove.each_with_index do |key, i|
  puts "Removing #{i}/#{files_to_remove.size}: #{key}"
  s3.buckets[dev_bucket].objects[key].delete
end