Why would setting "export OPENBLAS_NUM_THREADS=1" impair the performance?


Question

I tried setting "export OPENBLAS_NUM_THREADS=1", as this document suggests. But I found something strange: setting it significantly impairs the performance of my RL algorithms. I've run tests with TD3 and SAC, and the results consistently show that "export OPENBLAS_NUM_THREADS=1" hurts performance. Why would this cause such a big problem?

BTW, the algorithms are implemented in TensorFlow 1.13, and data is fed into the neural network through tf.data.Dataset. All tests were done on the BipedalWalker-v2 environment from OpenAI's Gym.

Recommended answer

The linked guide suggests setting this variable specifically when using Ray, not in general.

AFAICS, that's because Ray itself spawns many processes (one for each actor or so), so having each of them use multiple threads as well would bring no speedup. This is not the case when there's only one process or only a few.
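To make the processes-versus-threads reasoning concrete, here is a minimal sketch of one way to choose the thread count from the number of worker processes. The helper `blas_threads_per_process` and the even-split rule are my own illustration, not something Ray or OpenBLAS prescribes, and `num_workers = 1` is an assumption matching a single training process.

```python
import os


def blas_threads_per_process(num_processes, num_cores=None):
    """Hypothetical helper: split the available cores evenly across
    processes, giving each process at least one BLAS thread."""
    if num_cores is None:
        num_cores = os.cpu_count() or 1
    return max(1, num_cores // max(1, num_processes))


# Assumption: one training process, as in the question. With many
# Ray-style workers the same rule would drop to 1 thread per process.
num_workers = 1

# Set the variable before the first import of a BLAS-backed library
# (e.g. numpy), since OpenBLAS reads it when the library is loaded.
os.environ["OPENBLAS_NUM_THREADS"] = str(blas_threads_per_process(num_workers))
```

With 8 cores, this rule gives 8 threads to a single process but only 1 thread each to 16 workers, which is the situation where the guide's advice applies.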

On a general note, the OpenBLAS FAQ says that OpenBLAS's multithreading might "conflict" with the main program's multithreading and recommends setting OPENBLAS_NUM_THREADS=1 in such a case. The FAQ entry, however, gives no details that would let you verify the claim, so it may well be obsolete. As per https://github.com/obspy/obspy/wiki/Notes-on-Parallel-Processing-with-Python-and-ObsPy, the symptoms of such a "conflict" are rampant deadlocks and segfaults, so if you see nothing of the kind, you are in the clear. Major Python libraries are generally responsible about dealing with such problems themselves rather than dumping them on the user, so I'm pretty sure that if OpenBLAS has any usage restrictions, numpy and scipy enforce them internally and automatically when you use OpenBLAS through them.
