How to run Python on AMD GPU?


Question

We are currently trying to optimize a system with at least 12 variables. The total number of combinations of these variables is over 1 billion. This is not deep learning or machine learning or Tensorflow or anything of that sort, but arbitrary calculation on time series data.

We have implemented our code in Python and run it successfully on the CPU. We also tried multiprocessing, which works well, but we need faster computation since the calculation takes weeks. We have a GPU system consisting of 6 AMD GPUs. We would like to run our code on this GPU system but do not know how to do so.

My questions are:

  1. Can we run simple Python code on an AMD-powered laptop?
  2. Can we run the same applications on the GPU system?

We have read that we need to adjust the code for GPU computation, but we do not know how to do that.

PS: I can add more information if needed. I tried to keep the post as simple as possible to avoid confusion.

Answer

There are at least two options to speed up calculations using the GPU:

  • PyOpenCL
  • Numba

But I usually don't recommend running code on the GPU from the start. Calculations on the GPU are not always faster; it depends on how complex they are and how good your implementations on the CPU and GPU are. If you follow the list below you can get a good idea of what to expect.

  1. If your code is pure Python (lists, floats, for-loops, etc.) you can see a huge speed-up (maybe up to 100x) by using vectorized Numpy code. This is also an important step in finding out how your GPU code could be implemented, since the calculations in vectorized Numpy will follow a similar scheme. The GPU performs better on small tasks that can be parallelized.
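As a minimal sketch of this first step (the price-series example and function names are mine, not from the question), here is the same simple time-series calculation written first as a pure-Python loop and then as a vectorized Numpy expression:

```python
import numpy as np

def returns_loop(prices):
    """Pure-Python version: a for-loop over floats, one element at a time."""
    out = []
    for i in range(len(prices) - 1):
        out.append((prices[i + 1] - prices[i]) / prices[i])
    return out

def returns_vectorized(prices):
    """Vectorized Numpy version: one array expression, no Python loop."""
    prices = np.asarray(prices, dtype=np.float64)
    return np.diff(prices) / prices[:-1]
```

On arrays with millions of elements the vectorized version is typically orders of magnitude faster, and its array-at-a-time structure is close to what a GPU kernel would look like.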

Once you have a well-optimized Numpy example you can try to get a first peek at the GPU speed-up by using Numba. For simple cases you can just decorate your Numpy functions to run on the GPU. You can expect a speed-up of 100 to 500 compared to Numpy code, if your problem can be parallelized/vectorized.

You may have gotten this far without writing any OpenCL C code for the GPU and still have your code running on it. But if your problem is too complex, you will have to write custom code and run it using PyOpenCL. The expected speed-up is also 100 to 500 compared to good Numpy code.
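A minimal sketch of the PyOpenCL route (assuming pyopencl is installed and an OpenCL runtime for your AMD GPUs is set up; the `saxpy` kernel and function names are illustrative): you write the kernel in OpenCL C as a string, build it, and launch one work-item per array element.

```python
import numpy as np

# OpenCL C kernel: each work-item computes one element of the result.
KERNEL_SRC = """
__kernel void saxpy(__global const float *x,
                    __global const float *y,
                    __global float *out,
                    const float a)
{
    int i = get_global_id(0);
    out[i] = a * x[i] + y[i];
}
"""

def saxpy_numpy(a, x, y):
    """CPU reference implementation, used to check the GPU result."""
    return a * x + y

def saxpy_opencl(a, x, y):
    """Run the kernel via PyOpenCL (requires an OpenCL driver for the GPU)."""
    import pyopencl as cl
    ctx = cl.create_some_context()
    queue = cl.CommandQueue(ctx)
    mf = cl.mem_flags
    x_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=x)
    y_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=y)
    out = np.empty_like(x)
    out_buf = cl.Buffer(ctx, mf.WRITE_ONLY, out.nbytes)
    prog = cl.Program(ctx, KERNEL_SRC).build()
    # Launch one work-item per element; let the runtime pick the work-group size.
    prog.saxpy(queue, x.shape, None, x_buf, y_buf, out_buf, np.float32(a))
    cl.enqueue_copy(queue, out, out_buf)
    return out
```

The same structure (host buffers in, kernel over N work-items, result copied back) carries over to arbitrary time-series calculations; only the kernel body changes.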

The important thing to remember is that the GPU is only powerful if you use it correctly, and only for a certain set of problems.

If you have a small example of your code, feel free to post it.

Another thing to say is that CUDA is often easier to use than OpenCL: there are more libraries, more examples, more documentation, more support. Nvidia did a very good job of not supporting OpenCL well from the very start. I usually prefer open standards, but we moved to CUDA and Nvidia hardware quickly when things became business and commercial.

