How do I delete/count objects in an S3 bucket?


Question

So I know this is a common question, but there just don't seem to be any good answers for it.

I have a bucket with gobs of files in it (I have no clue how many). They are all within 2k apiece.

1) How do I figure out how many of these files I have WITHOUT listing them? I've used s3cmd.rb, aws/s3, and jets3t, and the best I can find is a command to count the first 1000 records (really performing GETs on them).

I've been using jets3t's applet as well because it's really nice to work with, but even with that I can't list all my objects because I run out of heap space (presumably because it is performing GETs on all of them and keeping them in memory).

2) How can I just delete a bucket? The best thing I've seen is a parallelized delete loop, and that has problems because sometimes it tries to delete the same file. This is what all the 'deleteall' commands that I've run across do.

What do those of you who have boasted about hosting millions of images/txts do? What happens when you want to remove them?

3) Lastly, are there alternate answers to this? All of these files are txt/xml files, so I'm not even sure S3 is such a concern. Maybe I should move this to a document database of sorts?

What it boils down to is that the Amazon S3 API is just straight-out missing two very important operations: COUNT and DEL_BUCKET. (Actually there is a delete-bucket command, but it only works when the bucket is empty.) If someone comes up with a method that does not suck for doing these two operations, I'd gladly give up lots of bounty.

Update

Just to answer a few questions: the reason I ask is that for the past year or so I have been storing hundreds of thousands, more like millions, of 2k txt and xml documents. The last time I wished to delete the bucket, a couple of months ago, it literally took DAYS to do so, because the bucket has to be empty before you can delete it. This was such a pain in the ass that I am fearing ever having to do this again without API support for it.

Update

This rocks the house!

http://github.com/SFEley/s3nuke/


I rm'd a good couple of gigs' worth of 1-2k files within minutes.

Recommended answer

"List" won't retrieve the data. I use s3cmd (a Python script) and I would have done something like this:

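# List the bucket, keep the object URI (4th column), and split the list into chunks of 10,000 lines read from stdin ("-").
s3cmd ls s3://foo | awk '{print $4}' | split -a 5 -l 10000 - bucketfiles_
# Start one background job per chunk; each job deletes its objects one at a time.
for i in bucketfiles_*; do xargs -n 1 s3cmd rm < "$i" & done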

But first check how many bucketfiles_ files you get. The loop starts one background job per bucketfiles_ file, so that many s3cmd processes will be running at once.
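The check can be sketched as two small shell functions. This is only an illustration: the names count_chunks and count_objects are made up here, and both assume the bucketfiles_* chunk files sit in the current directory. As a side effect, the line count of the chunk files gives a count of the objects that were listed.

```shell
# Hypothetical helpers (not part of s3cmd): inspect the chunk files
# written by the split step in the current directory.

# Print how many bucketfiles_* chunk files exist.
count_chunks() {
  set -- bucketfiles_*
  # If the glob matched nothing, "$1" is the literal pattern; treat as zero.
  [ -e "$1" ] || set --
  echo "$#"
}

# Print the total number of listed object URIs across all chunk files.
count_objects() {
  set -- bucketfiles_*
  if [ -e "$1" ]; then
    # tr strips the padding some wc implementations add.
    cat "$@" | wc -l | tr -d ' '
  else
    echo 0
  fi
}

# Usage, from the directory holding the chunk files:
#   count_chunks    # number of background jobs the loop would start
#   count_objects   # number of objects that were listed
```

If count_chunks reports more jobs than the machine can comfortably run, a larger -l value in the split step gives fewer, bigger chunks.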

It will take a while, but not days.
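If one job per chunk turns out to be too many processes, a variant that caps concurrency might look like the sketch below. It is not from the original answer: delete_listed is a made-up name, and it assumes an xargs that supports the -P option (GNU and BSD xargs do). The delete command is passed as arguments so it can be swapped for echo as a dry run.

```shell
# Hypothetical helper: run the delete command once per listed URI,
# with at most $1 commands running at the same time.
#   $1       maximum concurrent processes (default 4)
#   $2...    delete command (default: s3cmd rm)
delete_listed() {
  max_jobs="${1:-4}"
  if [ "$#" -gt 0 ]; then shift; fi
  if [ "$#" -eq 0 ]; then set -- s3cmd rm; fi
  # Assumes URIs contain no whitespace or quotes, since xargs would
  # split or reinterpret them.
  cat bucketfiles_* | xargs -n 1 -P "$max_jobs" "$@"
}

# Usage:
#   delete_listed 8 echo would-delete   # dry run, prints one line per URI
#   delete_listed 8                     # runs "s3cmd rm <uri>" for each URI
```

Because a single xargs hands out the URIs, no object is given to two workers, which sidesteps the duplicate-delete problem mentioned in the question.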
