How to enable multi-GPU training with the TensorFlow Object Detection API


Question

I am trying to run multi-GPU training with the TensorFlow Object Detection API.

nvidia-smi shows that only one GPU is actually being used. The other three GPUs have the training process loaded onto them, but their memory usage stays at 300 MB and their utilization sits at 0% the whole time.

I am using the SSD MobileNetV1 based network pretrained on COCO and fine-tuning it on my custom dataset.

I expected that when I give TensorFlow more GPUs, the framework would actually use them to speed up training.

Recommended answer

With the TensorFlow 2.2.0 Object Detection API, set the following flag when you run model_main_tf2.py:

python model_main_tf2.py --num_workers=2

For any integer value of --num_workers greater than 1, TensorFlow uses all available GPUs. If you want to use only some of the GPUs, edit model_main_tf2.py at the point where it creates the distribution strategy, and leave num_workers at its default of 1. For example, this uses the first and second GPU of the machine:

strategy = tf.distribute.MirroredStrategy(devices=["/gpu:0", "/gpu:1"])
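
As a rough illustration of the device-list approach above, the sketch below builds the list of device strings from GPU indices instead of hard-coding it. The helper names gpu_device_names and build_strategy are hypothetical and not part of the Object Detection API; only tf.distribute.MirroredStrategy and its devices argument come from the answer. TensorFlow is imported inside build_strategy, so the first helper can be tried on a machine without it.

```python
def gpu_device_names(indices):
    """Return TensorFlow device strings such as "/gpu:0" for the given GPU indices.

    Hypothetical helper for illustration; not part of the Object Detection API.
    """
    names = []
    for index in indices:
        if index < 0:
            raise ValueError(f"GPU index must be non-negative, got {index}")
        names.append(f"/gpu:{index}")
    return names


def build_strategy(indices):
    """Create a MirroredStrategy limited to the chosen GPUs.

    Hypothetical helper. TensorFlow is imported here, not at module level,
    so gpu_device_names() stays usable without TensorFlow installed.
    """
    import tensorflow as tf

    return tf.distribute.MirroredStrategy(devices=gpu_device_names(indices))


# Same device list as in the answer: the first and second GPU.
devices = gpu_device_names([0, 1])
print(devices)
```

On a machine with at least two GPUs, build_strategy([0, 1]) should give a strategy equivalent to the line above, and its num_replicas_in_sync attribute should then report 2.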
