Is the "aws s3 cp" command implemented with multiple threads?

Question

I am new to the AWS S3 client. I used the "aws s3 cp" command to download a batch of files from S3 to the local file system, and it was pretty fast. I then tried just reading the contents of the same batch of files in a single-threaded loop using the Amazon Java SDK API, and it was surprisingly several times slower than the "aws s3 cp" command.

Does anyone know the reason? I suspect that "aws s3 cp" is multi-threaded.

Recommended answer

If you look at the source of transferconfig.py, it shows that the defaults are:

DEFAULTS = {
    'multipart_threshold': 8 * (1024 ** 2),
    'multipart_chunksize': 8 * (1024 ** 2),
    'max_concurrent_requests': 10,
    'max_queue_size': 1000,
}

This means that it can run up to 10 requests at the same time, and that it also splits a transfer into 8 MB chunks when the file is larger than 8 MB.
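To make the chunking rule concrete, here is a minimal sketch of how a file could be split into byte ranges under those defaults. The helper `plan_ranges` is hypothetical and is not the CLI's actual code; it follows the wording above (files larger than the threshold are split), and the exact boundary behavior in the real implementation is an assumption.

```python
MB = 1024 ** 2

# Values copied from the DEFAULTS dict quoted above.
MULTIPART_THRESHOLD = 8 * MB
MULTIPART_CHUNKSIZE = 8 * MB


def plan_ranges(size, threshold=MULTIPART_THRESHOLD, chunksize=MULTIPART_CHUNKSIZE):
    """Return inclusive (start, end) byte ranges for a file of `size` bytes.

    Hypothetical helper for illustration only: a file at or below the
    threshold is one range, a larger file is cut into chunksize pieces.
    """
    if size <= 0:
        return []
    if size <= threshold:
        return [(0, size - 1)]
    return [
        (start, min(start + chunksize, size) - 1)
        for start in range(0, size, chunksize)
    ]


if __name__ == "__main__":
    # A 20 MB file yields two full 8 MB chunks and one 4 MB remainder.
    for start, end in plan_ranges(20 * MB):
        print(start, end)
```

Each range could then be fetched as a separate request, which is what lets a single large file benefit from the concurrent requests.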

This is also documented in the S3 CLI configuration documentation:

These are the configuration values you can set for S3:
max_concurrent_requests - The maximum number of concurrent requests.
max_queue_size - The maximum number of tasks in the task queue.
multipart_threshold - The size threshold the CLI uses for multipart transfers of individual files.
multipart_chunksize - When using multipart transfers, this is the chunk size that the CLI uses for multipart transfers of individual files.
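The effect of max_concurrent_requests on a batch of small files can be illustrated with a rough sketch that compares a single-threaded loop with a bounded thread pool. This is not how the CLI is implemented; `fake_download` is a hypothetical stand-in that only sleeps to imitate per-request latency, and the key names are made up.

```python
import time
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT_REQUESTS = 10  # same value as the CLI default quoted above


def fake_download(key, latency=0.05):
    """Hypothetical stand-in for one S3 GET: sleeps to imitate network latency."""
    time.sleep(latency)
    return key, len(key)


def download_sequential(keys):
    # One request at a time, like a simple single-threaded loop.
    return [fake_download(k) for k in keys]


def download_concurrent(keys, workers=MAX_CONCURRENT_REQUESTS):
    # At most `workers` requests are in flight; map() keeps the input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fake_download, keys))


if __name__ == "__main__":
    keys = [f"batch/file-{i}.dat" for i in range(20)]
    for fn in (download_sequential, download_concurrent):
        start = time.perf_counter()
        fn(keys)
        print(fn.__name__, round(time.perf_counter() - start, 2), "s")
```

When most of the time per file is spent waiting on the network, the pooled version should finish several times sooner than the loop, which is consistent with the gap described in the question.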

You could tune it down to a single request to see how it compares with your simple method:

aws configure set default.s3.max_concurrent_requests 1

Don't forget to tune it back up afterwards (the default shown above is 10), or else your AWS S3 transfer performance will be miserable.
