How to download large csv files from S3 without running into 'out of memory' issue?
Problem description
I need to process large files stored in an S3 bucket. I need to divide the csv file into smaller chunks for processing. However, this seems to be a task better done on file-system storage rather than on object storage.
Hence, I am planning to download the large file to local storage, divide it into smaller chunks and then upload the resultant files together into a different folder.
I am aware of the method download_fileobj, but could not determine whether it would result in an out of memory error while downloading large files of sizes ~= 10GB.
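For context, a minimal sketch of what is meant by download_fileobj here, assuming it is handed a file opened on local disk so the object is written out as it is received rather than accumulated in memory (the bucket name, key and local path are placeholders):

import boto3

s3 = boto3.client('s3')

# download_fileobj streams the object into the supplied file-like object;
# with an open file on disk, the data goes to the filesystem as it arrives.
with open('/tmp/large.csv', 'wb') as f:
    s3.download_fileobj('mybucket', 'large.csv', f)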
Recommended answer
I would recommend using download_file():
import boto3
s3 = boto3.resource('s3')
s3.meta.client.download_file('mybucket', 'hello.txt', '/tmp/hello.txt')
It will not run out of memory while downloading. Boto3 will take care of the transfer process.
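If the next step is the splitting and re-uploading described in the question, a rough sketch could look like the following. It assumes a simple line-based split (no newlines embedded in quoted CSV fields); the bucket name, keys, local paths and chunk size are all placeholders.

import itertools
import boto3

s3 = boto3.client('s3')

# Download the large object to local disk first; download_file streams to disk.
s3.download_file('mybucket', 'big.csv', '/tmp/big.csv')

rows_per_chunk = 1_000_000  # number of data rows per output file

with open('/tmp/big.csv') as src:
    header = next(src)  # keep the header so every chunk stays a valid CSV
    for part, rows in enumerate(
            iter(lambda: list(itertools.islice(src, rows_per_chunk)), [])):
        part_path = f'/tmp/big_part_{part}.csv'
        with open(part_path, 'w') as out:
            out.write(header)
            out.writelines(rows)
        # Upload each piece under a separate prefix ("folder") in the bucket.
        s3.upload_file(part_path, 'mybucket', f'chunks/big_part_{part}.csv')

Since only rows_per_chunk lines are held at a time, the split itself also stays within a bounded memory footprint.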