Amazon S3 upload fails using boto + Python


Question


Hi, I am unable to upload a file to S3 using boto. It fails with the following error message. Can someone help me? I am new to Python and boto.

from boto.s3 import connect_to_region
from boto.s3.connection import Location
from boto.s3.key import Key
import boto
import gzip
import os

AWS_KEY = ''
AWS_SECRET_KEY = ''
BUCKET_NAME = 'mybucketname'

conn = connect_to_region(Location.USWest2,aws_access_key_id = AWS_KEY,
        aws_secret_access_key = AWS_SECRET_KEY,
        is_secure=False,debug = 2
        )

bucket = conn.lookup(BUCKET_NAME)
bucket2 = conn.lookup('unzipped-data')
rs = bucket.list()
rs2 = bucket2.list()

compressed_files = []
all_files = []
files_to_download = []
downloaded_files = []
path = "~/tmp/"

# Check if the file has already been decompressed

def filecheck():
    for filename in bucket.list():
        all_files.append(filename.name)

    for n in rs2:
        compressed_files.append(n.name)
    for file_name in all_files:
        if file_name.strip('.gz') in compressed_files:
            pass
            elif '.gz' in file_name and 'indeed' in file_name:
                files_to_download.append(file_name)


# Download necessary files                
def download_files():
    for name in rs:
        if name.name in files_to_download:  
            file_name = name.name.split('/')

            print('Downloading: '+ name.name).strip('\n')
            file_name = name.name.split('/')
            name.get_contents_to_filename(path+file_name[-1])
            print(' - Completed')

            # Decompressing the file
            print('Decompressing: '+ name.name).strip('\n')
            inF = gzip.open(path+file_name[-1], 'rb')
            outF = open(path+file_name[-1].strip('.gz'), 'wb')
            for line in inF:
                outF.write(line)
            inF.close()
            outF.close()
            print(' - Completed')

            # Uploading file
            print('Uploading: '+name.name).strip('\n')
            full_key_name = name.name.strip('.gz')
            k = Key(bucket2)
            k.key = full_key_name
            k.set_contents_from_filename(path+file_name[-1].strip('.gz'))
            print('Completed') 

            # Clean Up
            d_list = os.listdir(path)
            for d in d_list:
                os.remove(path+d)


# Function Calls             
filecheck()
download_files()

Error message:

Traceback (most recent call last):
  File "C:\Users\Siddartha.Reddy\workspace\boto-test\com\salesify\sid\decompress_s3.py", line 86, in <module>
    download_files()
  File "C:\Users\Siddartha.Reddy\workspace\boto-test\com\salesify\sid\decompress_s3.py", line 75, in download_files
    k.set_contents_from_filename(path+file_name[-1].strip('.gz'))
  File "C:\Python27\lib\site-packages\boto\s3\key.py", line 1362, in set_contents_from_filename
    encrypt_key=encrypt_key)
  File "C:\Python27\lib\site-packages\boto\s3\key.py", line 1293, in set_contents_from_file
    chunked_transfer=chunked_transfer, size=size)
  File "C:\Python27\lib\site-packages\boto\s3\key.py", line 750, in send_file
    chunked_transfer=chunked_transfer, size=size)
  File "C:\Python27\lib\site-packages\boto\s3\key.py", line 951, in _send_file_internal
    query_args=query_args
  File "C:\Python27\lib\site-packages\boto\s3\connection.py", line 664, in make_request
    retry_handler=retry_handler
  File "C:\Python27\lib\site-packages\boto\connection.py", line 1070, in make_request
    retry_handler=retry_handler)
  File "C:\Python27\lib\site-packages\boto\connection.py", line 1029, in _mexe
    raise ex
socket.error: [Errno 10053] An established connection was aborted by the software in your host machine


I have no problem downloading the files, but the upload fails for some weird reason.

Answer


If the problem is the size of files (> 5GB), you should use multipart upload:

http://docs.aws.amazon.com/AmazonS3/latest/dev/mpuoverview.html


search for multipart_upload in the docs: http://boto.readthedocs.org/en/latest/ref/s3.html#module-boto.s3.multipart


Also, see this question for a related issue:

How can I copy files bigger than 5 GB in Amazon S3? (http://stackoverflow.com/questions/10355941/how-can-i-copy-files-bigger-than-5-gb-in-amazon-s3)


The process is a little non-intuitive. You need to:

  • Run initiate_multipart_upload(), storing the returned object
  • Break the file into chunks (either on disk, or read them into memory using CStringIO)
  • Feed the parts sequentially into upload_part_from_file()
  • Run complete_upload() on the stored object
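
The steps above can be sketched as a small helper around boto's multipart API. This is a minimal sketch, not part of the original answer: the function name, part size, and the cancel-on-failure handling are assumptions, and the `bucket` argument is a boto 2 `Bucket` object like the one already created in the question.

```python
from io import BytesIO

# S3 requires every part except the last to be at least 5 MB.
PART_SIZE = 50 * 1024 * 1024  # 50 MB per part (an assumed default)

def upload_multipart(bucket, key_name, file_path, part_size=PART_SIZE):
    """Upload file_path to key_name in bucket using boto's multipart API."""
    # Step 1: start the upload and keep the returned MultiPartUpload object.
    mp = bucket.initiate_multipart_upload(key_name)
    try:
        with open(file_path, 'rb') as f:
            part_num = 1  # part numbers start at 1
            while True:
                # Step 2: split the file into chunks in memory.
                chunk = f.read(part_size)
                if not chunk:
                    break
                # Step 3: feed each chunk to upload_part_from_file().
                mp.upload_part_from_file(BytesIO(chunk), part_num)
                part_num += 1
        # Step 4: finish the upload on the stored object.
        mp.complete_upload()
    except Exception:
        # Abort so S3 does not keep charging for orphaned parts.
        mp.cancel_upload()
        raise
```

On failure, `cancel_upload()` matters: abandoned parts otherwise remain stored (and billed) until the upload is explicitly aborted.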

