How to effectively make use of a GPU for reinforcement learning?


Question

Recently I looked into reinforcement learning, and there was one question bugging me that I could not find an answer for: how is training done effectively using GPUs? To my understanding, constant interaction with an environment is required, which seems like a huge bottleneck, since that task is often non-mathematical / non-parallelizable. Yet, for example, AlphaGo uses multiple TPUs/GPUs. So how are they doing it?

Answer

Indeed, you will often have interactions with the environment in between learning steps, and those interactions will often be better off running on the CPU than on the GPU. So, if your code for taking actions and your code for running an update / learning step are very fast (as in, for example, tabular RL algorithms), it won't be worth the effort of trying to get them onto the GPU.
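To illustrate why, here is a minimal sketch of a tabular Q-learning update (the environment, state/action sizes, and hyperparameters are illustrative, not from the question): each step is just a few scalar operations on a small array, so there is essentially nothing for a GPU to accelerate, and transfer overhead would dominate.

```python
import numpy as np

# Illustrative tabular Q-learning setup: a tiny Q-table and one
# TD update per environment step.
n_states, n_actions = 16, 4
Q = np.zeros((n_states, n_actions))
alpha, gamma = 0.1, 0.99  # learning rate, discount factor

def q_update(s, a, r, s_next):
    """One tabular Q-learning step: a handful of scalar ops,
    running in microseconds on a CPU."""
    td_target = r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (td_target - Q[s, a])

# One hypothetical transition: state 0, action 1, reward 1.0, next state 2.
q_update(s=0, a=1, r=1.0, s_next=2)
```

With an empty Q-table, the update above moves `Q[0, 1]` from 0 to `alpha * 1.0 = 0.1`; copying such tiny tensors to VRAM and back would cost far more than the arithmetic itself.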

However, when you have a big neural network that you need to go through whenever you select an action or run a learning step (as is the case in most of the Deep Reinforcement Learning approaches that are popular these days), the speedup of running it on the GPU instead of the CPU is often large enough to be worth the effort, even if it means you're quite regularly "switching" between CPU and GPU and may need to copy some things from RAM to VRAM or the other way around.
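The CPU/GPU "switching" described above can be sketched as follows (a minimal example assuming PyTorch; the network shape and observation are illustrative): the environment lives in CPU RAM, the policy network on the GPU, and each action selection copies the observation to VRAM and the chosen action back.

```python
import torch

# Put the policy network on the GPU if one is available,
# otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Illustrative policy network: 4-dimensional observations, 2 actions.
policy = torch.nn.Sequential(
    torch.nn.Linear(4, 64),
    torch.nn.ReLU(),
    torch.nn.Linear(64, 2),
).to(device)

def select_action(obs):
    """Copy the observation RAM -> VRAM, run the forward pass on
    the policy's device, then move the chosen action back to RAM
    so the (CPU-side) environment can consume it."""
    obs_t = torch.as_tensor(obs, dtype=torch.float32, device=device)
    with torch.no_grad():
        logits = policy(obs_t)
    return int(logits.argmax())  # int() pulls the scalar back to the CPU

action = select_action([0.1, -0.2, 0.0, 0.3])
```

In practice the per-step copies are small (one observation, one action), while the forward and backward passes over the big network dominate, which is why the round trips are worth it.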

