Move files from EC2 to S3 and then delete from EC2


Question

I'm migrating files from a remote server to S3. There are about 10k files, all accessible via HTTP URLs on the remote server. The total size is about 300GB, and no individual file is larger than 1GB. I'm trying to figure out the most efficient way to do this migration. So far I have an EC2 instance with S3CMD and the PHP SDK installed, and a text file with all the URLs. I can move files from EC2 to S3 without any issue, but if I download everything to EC2 first I run out of storage. Is there a way to download one file to EC2 (for example by reading the URLs from the text file), move it to S3 using S3CMD, and then delete it from EC2 before going on to the next file?

Ideally I would download everything straight from the remote location to S3, but I don't think that is possible, unless someone here says it is.

Thanks in advance for your help.

Recommended answer

I can't tell which OS your EC2 instance is running, but if it is Linux you could use s3fs (https://github.com/s3fs-fuse/s3fs-fuse/wiki/Fuse-Over-Amazon), which lets you mount your bucket like a local drive or folder. You can then simply move the files there, and they are uploaded to the bucket in the background. I would move them in batches to make the transfer easy to track. Moving a file removes it from your local file system once it has been uploaded; you could also just copy the files to the bucket this way. When done, run a simple comparison to make sure the same files exist in both folders.
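As a rough illustration of the mounting step, the sketch below wraps the s3fs invocation described in the s3fs-fuse README. It assumes s3fs-fuse is installed and the instance can reach S3. The helper names write_passwd_file and mount_bucket, and the bucket name, mount point, and credentials path in the usage comments, are placeholders chosen for this example and do not come from the answer.

```shell
# Sketch only: assumes s3fs-fuse is installed.
# write_passwd_file and mount_bucket are hypothetical helper names.

# Store the credentials in the ACCESS_KEY_ID:SECRET_ACCESS_KEY format that
# s3fs reads. The subshell umask keeps the file private from creation on.
write_passwd_file() {
    local file="$1"
    local key="$2"
    local secret="$3"
    (umask 077 && printf '%s:%s\n' "$key" "$secret" > "$file") &&
        chmod 600 "$file"
}

# Mount the bucket so that it appears as an ordinary directory.
mount_bucket() {
    local bucket="$1"
    local mountpoint="$2"
    local passwd_file="$3"
    mkdir -p "$mountpoint" &&
        s3fs "$bucket" "$mountpoint" -o passwd_file="$passwd_file"
}

# Example usage (placeholder values):
#   write_passwd_file "$HOME/.passwd-s3fs" "$AWS_ACCESS_KEY_ID" "$AWS_SECRET_ACCESS_KEY"
#   mount_bucket mybucket /mybucket "$HOME/.passwd-s3fs"
```

Once the mount succeeds, ordinary commands such as ls, cp, and mv work against the mount point, which is what makes the batch approach described here possible.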

EDIT, for clarity, based on a question asked in the comments

On the remote machine, set up FUSE with your AWS credentials.
Mount your S3 bucket. It will look like a local folder structure in Ubuntu.
Let's say your current files are in
/var/myfiles/folder1 and /var/myfiles/folder2
Mount your S3 bucket to /mybucket
mv /var/myfiles/folder1 /mybucket/folder1
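A more cautious variant of the mv step is to copy a batch, compare the file listings on both sides, and delete the local copy only if they match. The sketch below is one way to do that, not the answer's own method. It assumes the bucket is already mounted at the destination, and list_files and move_batch are hypothetical helper names. The check compares file names only, not file contents.

```shell
# Sketch only: assumes the bucket is already mounted at the destination.
# list_files and move_batch are hypothetical helper names.

# Print the relative path of every regular file under a directory, sorted.
list_files() {
    (cd "$1" && find . -type f | sort)
}

# Copy one batch, check that both sides list the same files, and only then
# delete the local copy. The source is left untouched if anything differs.
move_batch() {
    local src="$1"
    local dst="$2"
    mkdir -p "$dst" || return 1
    cp -R "$src"/. "$dst"/ || return 1
    if [ "$(list_files "$src")" = "$(list_files "$dst")" ]; then
        rm -rf "$src"
    else
        echo "mismatch between $src and $dst, keeping local copy" >&2
        return 1
    fi
}

# Example usage with the paths from the steps above:
#   move_batch /var/myfiles/folder1 /mybucket/folder1
```

Because the source is removed only after the listings agree, a failed or partial upload leaves the local batch in place so it can be retried.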

Again, I would move them in batches and make sure the folders match up before continuing.

END EDIT

If your EC2 instance runs Windows, there are other ways to mount an S3 bucket as a local drive, and the same process applies.
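The per-file workflow described in the question (download one file, upload it with S3CMD, delete it, move on) can also be sketched without a mounted bucket. The version below assumes curl and s3cmd are installed and s3cmd is already configured. The helper names migrate_one and migrate_all, the failed.txt log, and the bucket URL in the usage comment are hypothetical. It uses the last path segment of each URL as the object name, so URLs that share a file name would overwrite each other.

```shell
# Sketch only: assumes curl and s3cmd are installed and s3cmd is configured.
# migrate_one and migrate_all are hypothetical helper names.

# Download one URL, upload it with s3cmd, then delete the local copy.
# The object name is the last path segment of the URL.
migrate_one() {
    local url="$1"
    local dest="$2"
    local workdir="$3"
    local name="${url##*/}"
    local tmp="$workdir/$name"
    if curl -fsSL -o "$tmp" "$url" && s3cmd put "$tmp" "$dest/$name"; then
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Process a text file with one URL per line. Failed URLs are appended to
# failed.txt in the work directory and the loop carries on.
migrate_all() {
    local url_file="$1"
    local dest="$2"
    local workdir="$3"
    local failed="$workdir/failed.txt"
    local url
    : > "$failed"
    while IFS= read -r url || [ -n "$url" ]; do
        [ -n "$url" ] || continue
        migrate_one "$url" "$dest" "$workdir" < /dev/null ||
            echo "$url" >> "$failed"
    done < "$url_file"
}

# Example usage (placeholder bucket and paths):
#   mkdir -p /tmp/migrate
#   migrate_all urls.txt s3://my-bucket/files /tmp/migrate
```

Only one file sits on the instance at a time, so the work directory needs room for the largest single file (under 1GB here) and not the full 300GB.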
