AWS SageMaker on GPU

Views: 37 Published: 2021/9/5 19:26:50 Tags: amazon-web-services tensorflow amazon-sagemaker

Question

I am trying to train a neural network (TensorFlow) on AWS, and I have some AWS credits. From my understanding, AWS SageMaker is the best fit for the job. I managed to load the Jupyter Lab console on SageMaker and tried to find a GPU kernel, since I know a GPU is the best option for training neural networks. However, I could not find such a kernel.

Would anyone be able to help in this regard?

Thanks & Best Regards

Michael

Recommended answer

You can train models on GPU in the SageMaker ecosystem via two different components:

You can instantiate a GPU-powered SageMaker Notebook Instance, for example ml.p2.xlarge (NVIDIA K80) or ml.p3.2xlarge (NVIDIA V100). This is convenient for interactive development: you have the GPU right under your notebook, can run code on the GPU interactively, and can monitor the GPU via nvidia-smi in a terminal tab. It is a great development experience. However, when you develop directly on a GPU-powered machine, there are times when you are not using the GPU, for example while you write code or browse documentation. All that time you pay for a GPU that sits idle. In that regard, it may not be the most cost-effective option for your use case.
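To confirm from inside a notebook that the instance really has a GPU, you can run the same nvidia-smi check from Python. The following is a minimal sketch, not part of the original answer. The helper names `parse_gpu_rows` and `list_gpus` are made up for illustration, and it assumes `nvidia-smi` reports plain integer memory values for the queried fields.

```python
import shutil
import subprocess

# nvidia-smi query flags: one CSV row per GPU, no header, no unit suffixes.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_rows(text):
    """Parse rows like 'Tesla K80, 0, 11441' into dictionaries.

    Assumption: memory fields are integers in MiB.
    """
    rows = []
    for line in text.strip().splitlines():
        name, used, total = [field.strip() for field in line.split(",")]
        rows.append(
            {
                "name": name,
                "memory_used_mib": int(used),
                "memory_total_mib": int(total),
            }
        )
    return rows


def list_gpus():
    """Return the GPUs visible to nvidia-smi, or [] on a CPU-only instance."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_rows(result.stdout)


if __name__ == "__main__":
    gpus = list_gpus()
    if not gpus:
        print("No GPU found: this is probably a CPU-only instance type.")
    for gpu in gpus:
        print(gpu)
```

An empty result usually means the notebook is running on a CPU instance type, which would explain not finding a "GPU kernel". Within TensorFlow itself, `tf.config.list_physical_devices("GPU")` gives a comparable answer.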

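The other option is to use a SageMaker Training Job running on a GPU instance. This is the preferred option for training, because training metadata (data and model paths, hyperparameters, cluster specification, etc.) is persisted in the SageMaker metadata store, logs and metrics are stored in CloudWatch, and the instance shuts itself down automatically at the end of training. Developing on a small CPU instance and launching training tasks through the SageMaker Training API will help you make the most of your budget, while also retaining the metadata and artifacts of all your experiments. There is also a well-documented TensorFlow example of this workflow.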
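As a rough illustration of the Training Job route, the sketch below uses the `TensorFlow` estimator from the SageMaker Python SDK (v2). It is not taken from the original answer. The script name `train.py`, the `epochs` hyperparameter, the `training` channel name, and the framework and Python versions are all assumptions that you would replace with values matching your own project and a supported container.

```python
def build_estimator_kwargs(role_arn, instance_type="ml.p3.2xlarge", epochs=10):
    """Collect the arguments for a single-instance GPU training job.

    Everything here is a placeholder: adjust the script name, versions,
    and hyperparameters to your own setup.
    """
    return {
        "entry_point": "train.py",        # hypothetical training script
        "role": role_arn,                 # IAM role SageMaker assumes
        "instance_count": 1,
        "instance_type": instance_type,   # GPU instance used only for the job
        "framework_version": "2.11",      # assumed TensorFlow version
        "py_version": "py39",             # assumed matching Python version
        "hyperparameters": {"epochs": epochs},
    }


def launch_training_job(role_arn, train_s3_uri):
    """Start the job; the GPU instance is released when training ends."""
    # Imported lazily so the helper above works without the SDK installed.
    from sagemaker.tensorflow import TensorFlow

    estimator = TensorFlow(**build_estimator_kwargs(role_arn))
    # The channel name "training" is an assumption; the script reads it
    # from the SM_CHANNEL_TRAINING environment variable.
    estimator.fit({"training": train_s3_uri})
    return estimator
```

You would call `launch_training_job` from a small CPU notebook instance, passing your execution role ARN and the S3 URI of your training data, so that you pay for the GPU only while the job runs.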
