Writing a file to S3 using Lambda in Python with AWS

Question

In AWS, I'm trying to save a file to S3 in Python using a Lambda function. While this works on my local computer, I am unable to get it to work in Lambda. I've been working on this problem for most of the day and would appreciate help. Thank you.

def pdfToTable(PDFfilename, apiKey, fileExt, bucket, key):

    # parsing a PDF using an API
    fileData = (PDFfilename, open(PDFfilename, "rb"))
    files = {"f": fileData}
    postUrl = "https://pdftables.com/api?key={0}&format={1}".format(apiKey, fileExt)
    response = requests.post(postUrl, files=files)
    response.raise_for_status()

    # this code is probably the problem!
    s3 = boto3.resource('s3')
    bucket = s3.Bucket('transportation.manifests.parsed')
    with open('/tmp/output2.csv', 'rb') as data:
        data.write(response.content)
        key = 'csv/' + key
        bucket.upload_fileobj(data, key)


    # FYI, on my own computer, this saves the file
    with open('output.csv', "wb") as f:
        f.write(response.content)

In S3, there is a bucket transportation.manifests.parsed containing the folder csv where the file should be saved.

The type of response.content is bytes.

From AWS, the error from the current set-up above is [Errno 2] No such file or directory: '/tmp/output2.csv': FileNotFoundError. In fact, my goal is to save the file to the csv folder under a unique name, so tmp/output2.csv might not be the best approach. Any guidance?

In addition, I've tried to use wb and w instead of rb also to no avail. The error with wb is Input <_io.BufferedWriter name='/tmp/output2.csv'> of type: <class '_io.BufferedWriter'> is not supported. The documentation suggests that using 'rb' is the recommended usage, but I do not understand why that would be the case.

Also, I've tried s3_client.put_object(Key=key, Body=response.content, Bucket=bucket) but receive An error occurred (404) when calling the HeadObject operation: Not Found.

Recommended answer

You have a writable stream that you're asking boto3 to use as a readable stream which won't work.
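To illustrate the readable-versus-writable distinction, here is a minimal sketch that skips the temporary file and hands boto3 a readable in-memory stream instead. The helper name upload_bytes is hypothetical, and bucket is assumed to be a boto3 Bucket resource (or any object exposing upload_fileobj(fileobj, key)).

```python
import io


def upload_bytes(bucket, key, payload):
    """Upload raw bytes through a readable in-memory stream.

    `bucket` is assumed to be a boto3 Bucket resource; `payload` is bytes,
    for example response.content from the question.
    """
    # BytesIO is readable and starts at position 0, which is what
    # upload_fileobj needs: it reads from the stream, it never writes to it.
    stream = io.BytesIO(payload)
    bucket.upload_fileobj(stream, key)
```

With the names from the question, this would be called as upload_bytes(bucket, 'csv/' + key, response.content).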

Write the file, and then simply use bucket.upload_file() afterwards, like so:

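import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('transportation.manifests.parsed')

# response.content is bytes, so open the file in binary write mode
with open('/tmp/output2.csv', 'wb') as data:
    data.write(response.content)

# upload only after the file has been written and closed
key = 'csv/' + key
bucket.upload_file('/tmp/output2.csv', key)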
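The question also asks for a unique name under the csv folder. One possible sketch, assuming a random UUID is an acceptable naming scheme: the helpers unique_csv_key and save_and_upload are hypothetical names, and bucket is again assumed to be a boto3 Bucket resource.

```python
import os
import tempfile
import uuid


def unique_csv_key(prefix="csv/"):
    """Build an S3 key such as 'csv/<32 hex chars>.csv' (naming scheme is an assumption)."""
    return f"{prefix}{uuid.uuid4().hex}.csv"


def save_and_upload(bucket, payload, tmp_dir="/tmp"):
    """Write bytes to a uniquely named temp file, upload it, then clean up.

    `bucket` is assumed to be a boto3 Bucket resource; /tmp is the writable
    directory inside Lambda.
    """
    key = unique_csv_key()
    # mkstemp creates the file itself, so the local name cannot collide
    fd, path = tempfile.mkstemp(suffix=".csv", dir=tmp_dir)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(payload)
        # the file is closed at this point, so upload_file can read it
        bucket.upload_file(path, key)
    finally:
        os.remove(path)
    return key
```

Removing the temporary file afterwards is a design choice rather than a requirement; it keeps /tmp from filling up when the same Lambda container is reused across invocations.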
