Parallelizing a for loop to run simultaneously on multiple GPU cores?
Question
I understand that you can use matlabpool and parfor to run for loop iterations in parallel. However, I want to take advantage of the high number of cores in my GPU to run a larger number of iterations simultaneously. Is there any built-in functionality to do this?
To my understanding, MATLAB runs code on the GPU through a gpuArray, but that does not seem to parallelize the loop itself, only certain functions called inside the loop.
In the loop that I am running, each iteration can run independently. The only variables that need to exist outside the loop are the data to be processed (a 3-D array whose first index is time, with each iteration operating on a different time) and a 2-D output array in which each iteration stores the result for its particular time. Each time step is independent of the others.
Thanks.
Recommended answer
With a gpuArray, you can run elementwise operations in parallel by structuring your algorithm in terms of MATLAB's arrayfun. Effectively, this implicitly loops over each element of your arrays and applies the body of a MATLAB function to each element. The documentation for the gpuArray version of arrayfun describes which functions are supported inside that body.
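As a rough illustration of the arrayfun approach described above, here is a minimal sketch. It is not the demo from the original answer. It assumes the Parallel Computing Toolbox and a supported GPU are available, and the array sizes and the scaleAndClip function are hypothetical stand-ins for whatever per-element work your loop body does.

```matlab
% Hypothetical example data: a 3-D array with time as the first index.
nT = 100; nY = 64; nX = 64;
data = rand(nT, nY, nX, 'single');

% Transfer the data to the GPU once, outside of any loop.
dataGpu = gpuArray(data);

% Hypothetical per-element operation built only from scalar arithmetic,
% min and max. When arrayfun is called with a gpuArray input, the
% function is applied to every element on the GPU.
scaleAndClip = @(x) min(max(2 .* x - 0.5, 0), 1);
resultGpu = arrayfun(scaleAndClip, dataGpu);

% Copy the result back to host memory when it is needed on the CPU.
result = gather(resultGpu);
```

Note that this only covers work that can be written per element. If each time step needs a reduction or a matrix operation over its slice, calling vectorized functions directly on the gpuArray (for example sum along a dimension) may be a better fit than arrayfun.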