我想使用Python(Boto3)以CSV格式将DynamoDB表导出到S3存储桶 [英] I would like to export DynamoDB Table to S3 bucket in CSV format using Python (Boto3)
本文介绍了我想使用Python(Boto3)以CSV格式将DynamoDB表导出到S3存储桶的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!
问题描述
此问题已在以下链接中更早提出:
如何以CSV格式编写dynamodb扫描数据并使用python上传到s3存储桶?
This question has been asked earlier in the following link: How to write dynamodb scan data's in CSV and upload to s3 bucket using python?
我已按照注释中的建议修改了代码。代码如下所示:
I have amended the code as advised in the comments. The code looks like as follows:
import csv
import boto3
import json
dynamodb = boto3.resource('dynamodb')
db = dynamodb.Table('employee_details')
def lambda_handler(event, context):
AWS_BUCKET_NAME = 'session5cloudfront'
s3 = boto3.resource('s3')
bucket = s3.Bucket(AWS_BUCKET_NAME)
path = '/tmp/' + 'employees.csv'
try:
response = db.scan()
myFile = open(path, 'w')
for i in response['Items']:
csv.register_dialect('myDialect', delimiter=' ', quoting=csv.QUOTE_NONE)
with myFile:
writer = csv.writer(myFile, dialect='myDialect')
writer.writerows(i)
print(i)
except :
print("error")
bucket.put_object(
ACL='public-read',
ContentType='application/csv',
Key=path,
# Body=json.dumps(i),
)
# print("here")
body = {
"uploaded": "true",
"bucket": AWS_BUCKET_NAME,
"path": path,
}
# print("then here")
return {
"statusCode": 200,
"body": json.dumps(body)
}
我是新手,请帮助我修复此代码,因为它无法在S3存储桶中创建的文件中插入数据。
I am a novice, please help me in fixing this code as it is having problem in inserting data in file created in S3 Bucket.
谢谢
推荐答案
我将代码修改为更简单,还可以处理数据超过1MB的表的分页响应:
I have revised the code to be simpler and to also handle paginated responses for tables with more than 1MB of data:
import csv
import boto3
import json
TABLE_NAME = 'employee_details'
OUTPUT_BUCKET = 'my-bucket'
TEMP_FILENAME = '/tmp/employees.csv'
OUTPUT_KEY = 'employees.csv'
s3_resource = boto3.resource('s3')
dynamodb_resource = boto3.resource('dynamodb')
table = dynamodb_resource.Table(TABLE_NAME)
def lambda_handler(event, context):
with open(TEMP_FILENAME, 'w') as output_file:
writer = csv.writer(output_file)
header = True
first_page = True
# Paginate results
while True:
# Scan DynamoDB table
if first_page:
response = table.scan()
first_page = False
else:
response = table.scan(ExclusiveStartKey = response['LastEvaluatedKey'])
for item in response['Items']:
# Write header row?
if header:
writer.writerow(item.keys())
header = False
writer.writerow(item.values())
# Last page?
if 'LastEvaluatedKey' not in response:
break
# Upload temp file to S3
s3_resource.Bucket(OUTPUT_BUCKET).upload_file(TEMP_FILENAME, OUTPUT_KEY)
这篇关于我想使用Python(Boto3)以CSV格式将DynamoDB表导出到S3存储桶的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!
查看全文