I would like to export DynamoDB Table to S3 bucket in CSV format using Python (Boto3)

Problem Description

This question has been asked earlier in the following link: How to write dynamodb scan data's in CSV and upload to s3 bucket using python?

I have amended the code as advised in the comments. The code looks as follows:

import csv
import boto3
import json
dynamodb = boto3.resource('dynamodb')
db = dynamodb.Table('employee_details')
def lambda_handler(event, context):
    AWS_BUCKET_NAME = 'session5cloudfront'
    s3 = boto3.resource('s3')
    bucket = s3.Bucket(AWS_BUCKET_NAME)
    path = '/tmp/' + 'employees.csv'
    try:
        response = db.scan()
        myFile = open(path, 'w')  

        for i in response['Items']:
            csv.register_dialect('myDialect', delimiter=' ', quoting=csv.QUOTE_NONE)
            with myFile:
                writer = csv.writer(myFile, dialect='myDialect')
                writer.writerows(i)
            print(i)
    except :
        print("error")

    bucket.put_object(
        ACL='public-read',
        ContentType='application/csv',
        Key=path,
        # Body=json.dumps(i),
    )
    # print("here")
    body = {
        "uploaded": "true",
        "bucket": AWS_BUCKET_NAME,
        "path": path,
    }
    # print("then here")
    return {
        "statusCode": 200,
        "body": json.dumps(body)
    }

I am a novice, please help me fix this code, as it has problems inserting data into the file created in the S3 bucket.
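(For context: the likely reason no data lands in the uploaded file is twofold: `with myFile:` closes the file at the end of the first loop iteration, and `put_object` is then called without a `Body`. The file-closing issue can be reproduced without AWS at all; a minimal sketch, with a made-up path and rows:)

```python
import csv
import os
import tempfile

# Hypothetical temp path standing in for '/tmp/employees.csv'
path = os.path.join(tempfile.gettempdir(), 'employees_demo.csv')
f = open(path, 'w')
rows = [{'empid': '1', 'name': 'A'}, {'empid': '2', 'name': 'B'}]
error = None
try:
    for row in rows:
        with f:  # exits the context manager each iteration, closing f
            csv.writer(f).writerow(row.values())
except ValueError as exc:  # second iteration: file is already closed
    error = exc
print(error)
os.remove(path)
```

Opening the file once in a single `with` block around the whole loop, as the answer below does, avoids this entirely.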

Thanks

Recommended Answer

I have revised the code to be simpler and to also handle paginated responses for tables with more than 1MB of data:

import csv
import boto3

TABLE_NAME = 'employee_details'
OUTPUT_BUCKET = 'my-bucket'
TEMP_FILENAME = '/tmp/employees.csv'
OUTPUT_KEY = 'employees.csv'

s3_resource = boto3.resource('s3')
dynamodb_resource = boto3.resource('dynamodb')
table = dynamodb_resource.Table(TABLE_NAME)

def lambda_handler(event, context):

    with open(TEMP_FILENAME, 'w') as output_file:
        writer = csv.writer(output_file)
        header = True
        first_page = True

        # Paginate results
        while True:

            # Scan DynamoDB table
            if first_page:
                response = table.scan()
                first_page = False
            else:
                response = table.scan(ExclusiveStartKey=response['LastEvaluatedKey'])

            for item in response['Items']:

                # Write the header row once; note this assumes every item
                # shares the first item's attribute names and order
                if header:
                    writer.writerow(item.keys())
                    header = False

                writer.writerow(item.values())

            # Last page?
            if 'LastEvaluatedKey' not in response:
                break

    # Upload temp file to S3
    s3_resource.Bucket(OUTPUT_BUCKET).upload_file(TEMP_FILENAME, OUTPUT_KEY)
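One caveat with the code above: writing `item.keys()` / `item.values()` assumes every item carries the same attributes in the same order, which a schemaless DynamoDB table does not guarantee. If items can differ, `csv.DictWriter` over the union of keys is safer. A sketch with made-up items; the helper name is hypothetical:

```python
import csv
import io

def items_to_csv(items):
    # Build the header from the union of all attribute names,
    # preserving first-seen order
    fieldnames = []
    for item in items:
        for key in item:
            if key not in fieldnames:
                fieldnames.append(key)
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, restval='')
    writer.writeheader()
    writer.writerows(items)  # missing attributes become ''
    return buf.getvalue()

items = [
    {'empid': '1', 'name': 'Alice'},
    {'empid': '2', 'name': 'Bob', 'dept': 'HR'},  # extra attribute
]
print(items_to_csv(items))
```

The resulting string could also be uploaded directly with `s3_resource.Object(OUTPUT_BUCKET, OUTPUT_KEY).put(Body=...)`, skipping the temp file.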
