How do I only copy missing objects between buckets using ruby aws-sdk?

Problem description

I wrote a script to copy S3 objects from my production S3 bucket to my development one, but it takes quite a long time to run because I am individually checking each object for existence before copying. Is there a way to diff the two buckets and only copy the objects I need? Or to copy the bucket as a whole?

Here is what I have currently:

require 'aws-sdk'   # aws-sdk v1
require 'benchmark'

count = 0
puts "COPYING FROM #{prod_bucket} to #{dev_bucket}"
bm = Benchmark.measure do
  AWS::S3.new.buckets[prod_bucket].objects.each do |o|
    # one HEAD request per object -- this is what makes the loop so slow
    exists = AWS::S3.new.buckets[dev_bucket].objects[o.key].exists?

    if exists
      puts "Skipping: #{o.key}"
    else
      puts "Copy: #{o.key} (#{count})"
      o.copy_to(o.key, :bucket_name => dev_bucket, :acl => :public_read)
      count += 1
    end
  end
end
puts "Copied #{count} objects in #{bm.real}s"

Solution

I have never worked with that gem, but from your code it looks like it is possible to retrieve an array of all the keys stored in a bucket. Load that list for both buckets and determine the missing files with a simple array operation. That should be much faster.

require 'aws-sdk'   # aws-sdk v1

s3 = AWS::S3.new
source_bucket = s3.buckets[prod_bucket]
target_bucket = s3.buckets[dev_bucket]

# load file lists (the SDK pages through the objects in batches of 1000)
source_files = source_bucket.objects.map(&:key)
target_files = target_bucket.objects.map(&:key)

# determine files missing in dev
files_to_copy = source_files - target_files
files_to_copy.each_with_index do |file_name, i|
  puts "Copying #{i + 1}/#{files_to_copy.size}: #{file_name}"

  source_bucket.objects[file_name].copy_to(file_name,
                                           :bucket_name => dev_bucket,
                                           :acl => :public_read)
end

# determine files on dev that do not exist on prod
files_to_remove = target_files - source_files
files_to_remove.each_with_index do |file_name, i|
  puts "Removing #{i + 1}/#{files_to_remove.size}: #{file_name}"

  target_bucket.objects[file_name].delete
end
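
For reference, the same list-diff-copy approach translates directly to the current aws-sdk-s3 (v3) gem. The sketch below is an illustrative, untested translation and not part of the original answer; it assumes the same prod_bucket and dev_bucket variables as above, and object keys that are safe to embed directly in a copy_source string:

require 'aws-sdk-s3'

s3 = Aws::S3::Client.new

# Collect every key in a bucket; iterating the response with #each
# makes the SDK follow the pagination automatically.
def all_keys(s3, bucket)
  keys = []
  s3.list_objects_v2(bucket: bucket).each do |page|
    keys.concat(page.contents.map(&:key))
  end
  keys
end

source_keys = all_keys(s3, prod_bucket)
target_keys = all_keys(s3, dev_bucket)

# copy keys that exist on prod but are missing on dev
(source_keys - target_keys).each do |key|
  puts "Copying: #{key}"
  s3.copy_object(bucket: dev_bucket,
                 key: key,
                 copy_source: "#{prod_bucket}/#{key}", # assumes URL-safe keys
                 acl: 'public-read')
end

# remove keys that exist on dev but no longer exist on prod
(target_keys - source_keys).each do |key|
  puts "Removing: #{key}"
  s3.delete_object(bucket: dev_bucket, key: key)
end

Like the answer above, this trades one existence check per object for one listing request per 1000 keys, at the cost of holding both key lists in memory.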
