How to read AWS S3 stored Word document (.doc and .docx) file content using AWS Lambda Python?

Views: 48  Published: 2021/10/27 19:02:23  Tags: python amazon-web-services amazon-s3 aws-lambda


Problem description

In my scenario, I am trying to read the content of Word documents (.doc and .docx) stored in AWS S3 from AWS Lambda using Python. Below is the code I am using. My problem is that I can get the file name, but I cannot read the content.

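import boto3

# Assumed setup: the original snippet used s3 and obj without showing where they were defined
s3 = boto3.resource('s3')

def lambda_handler(event, context):
    obj = s3.Object('Bucketname', 'sample.docx')

    # This is the line that fails: the object body is binary, not escaped text
    file_contents = s3.Object('Bucketname', 'sample.docx').get()['Body'].read().decode("unicode-escape")

    return {
        'File Name': obj.key,
        'Content': file_contents
    }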

Response: { "errorMessage": "'unicodeescape' codec can't decode bytes in position 25818-25819: truncated \xXX escape", "errorType": "UnicodeDecodeError", "stackTrace": [ [ "/var/task/lambda_function.py", 76, "lambda_handler", "file_contents = s3.Object('Bucketname', 'sample.docx').get()['Body'].read().decode(\"unicode-escape\")" ] ] }

Solution

.docx and .doc files are binary files (a .docx is a ZIP archive of XML parts, and a .doc uses the older Word binary format), so a simple decode won't work. docx2txt may help here for the .docx files.
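To make the point about binary files concrete, here is a minimal sketch that extracts text from a .docx using only the Python standard library instead of docx2txt (a swapped-in technique, not the answer's own code). The helper name extract_docx_text is hypothetical, the bucket and key are the placeholders from the question, and the sketch assumes the body text sits in word/document.xml, which is the usual layout. It does not handle legacy .doc files.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the main document part of a .docx file
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def extract_docx_text(data: bytes) -> str:
    """Return the plain text of a .docx file given its raw bytes (hypothetical helper)."""
    # A .docx file is a ZIP archive; the body text is assumed to live in word/document.xml
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    paragraphs = []
    for paragraph in root.iter(W_NS + "p"):
        # Each w:t element holds one run of text inside the paragraph
        runs = [node.text or "" for node in paragraph.iter(W_NS + "t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)


def lambda_handler(event, context):
    # boto3 is imported here so the helper above can be used and tested without it
    import boto3

    bucket, key = "Bucketname", "sample.docx"  # placeholder names from the question
    # Read the raw bytes; do not call .decode() on them
    body = boto3.resource("s3").Object(bucket, key).get()["Body"].read()
    return {"File Name": key, "Content": extract_docx_text(body)}
```

If docx2txt is packaged with the function, passing io.BytesIO(body) to docx2txt.process should do a similar job for .docx files, since it opens its argument as a ZIP archive; treat that as an assumption to verify against the installed version.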
