What is the most performant storage setup for a machine learning task with tens of TB of imagery?

Question

I am training image segmentation models on satellite imagery; each image is about 1.5 GB. Overall I'll be training the model on hundreds of GB of imagery and then applying it to tens of TB of imagery, or more. I'm a bit swamped by all the different storage options on Azure, and I'm looking for the most performant option that can handle large volumes of data. What is most commonly used for big data ML tasks on Azure? Here are the options I see:

Blob storage with or without blobfuse: https://github.com/Azure/azure-storage-fuse/wiki/1.-Installation

- Are REST calls to Blob storage slower than mounting a File Store on a VM and reading images from that mounted File Store?

- Is azure-storage-fuse commonly used for machine learning on Azure? I'm not sure whether reading my image files through blobfuse carries a performance cost when I am training and testing the model, or whether this is even possible.

Mounting an Azure file store on a VM

- Would this be more performant than Blob storage? There is a 5 TB limit on a File Share, which is why I thought Blob would scale better. But maybe I could use multiple File Shares?

Thanks for your help and advice,

Ryan

Recommended answer

Hello,

Here are the performance considerations to check before using blobfuse.

To achieve reasonable performance, blobfuse requires a temporary directory to use as a local cache. This directory will contain the full contents of any file (blob) read from or written to through blobfuse. Cached files are purged as they age (--file-cache-timeout-in-seconds) once there are no longer any open file handles to them.


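  • Putting the cache directory on a ramdisk or an SSD (the ephemeral disk on Azure) will greatly improve performance.
  • Blobfuse currently does not manage the available disk space in the tmp path. Make sure there is enough space, or reduce the --file-cache-timeout-in-seconds value so that cached files are purged sooner.
  • To delete the cache, unmount and remount blobfuse.
  • Do not use the same cache directory for multiple instances of blobfuse, or otherwise use that directory while blobfuse is running.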
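
Since blobfuse keeps the full contents of every open file in the cache and does not manage free space itself, it can help to check the cache disk before mounting. Below is a minimal Python sketch of that check, plus a helper that assembles the mount command. The function names, the 20% headroom, the example paths, and the figure of 8 images open at once are all assumptions for illustration. The flags follow the blobfuse v1 command line (`--tmp-path`, `--config-file`, `--file-cache-timeout-in-seconds`), so verify them against the version you have installed.

```python
import shutil


def required_cache_bytes(image_bytes, open_images, headroom_pct=20):
    """Bytes needed to hold `open_images` full files at once, plus headroom.

    blobfuse caches whole files, so the working set is roughly
    (file size) x (files open or not yet purged). Integer math avoids
    floating-point rounding.
    """
    return image_bytes * open_images * (100 + headroom_pct) // 100


def cache_has_room(cache_dir, image_bytes, open_images, headroom_pct=20):
    """Return True if the disk holding `cache_dir` has enough free space."""
    free = shutil.disk_usage(cache_dir).free
    return free >= required_cache_bytes(image_bytes, open_images, headroom_pct)


def build_mount_command(mount_point, cache_dir, config_file, cache_timeout_s=120):
    """Assemble a blobfuse (v1-style) mount command as an argument list.

    A lower cache timeout purges cached files sooner, which matters when
    each image is large and the cache disk is small.
    """
    return [
        "blobfuse",
        mount_point,
        f"--tmp-path={cache_dir}",
        f"--config-file={config_file}",
        f"--file-cache-timeout-in-seconds={cache_timeout_s}",
    ]


if __name__ == "__main__":
    # Hypothetical values: 1.5 GB images, 8 of them open at the same time.
    need = required_cache_bytes(1_500_000_000, 8)
    print(f"Cache needs about {need / 1e9:.1f} GB")
    # Hypothetical paths; /mnt/resource is a common location for the Azure temp disk.
    cmd = build_mount_command(
        "/mnt/imagery", "/mnt/resource/blobfusetmp", "/etc/fuse_connection.cfg"
    )
    print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run` once `cache_has_room` returns True for the chosen cache directory. If it returns False, either move the cache to a larger disk or lower the timeout so files are purged sooner.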

The following link has more details of its behavior for different workloads.

I think the following section about Azure storage and scalability will help you decide which option would help.

Also, please refer to this blog and repo about using satellite images with deep learning.

I hope this helps!

