Can thrust::gather be used "in-place"?
Problem description
Consider the following code:
#include <time.h>    // time
#include <stdlib.h>  // srand, rand
#include <fstream>
#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
#include <thrust/gather.h>
#include <thrust/sort.h>
#include <thrust/iterator/zip_iterator.h>
#include "TimingGPU.cuh"

/********/
/* MAIN */
/********/
int main() {
    const int N = 16384;

    std::ifstream h_indices_File, h_x_File;
    h_indices_File.open("h_indices.txt");
    h_x_File.open("h_x.txt");

    std::ofstream h_x_result_File;
    h_x_result_File.open("h_x_result.txt");

    thrust::host_vector<int> h_indices(N);
    thrust::host_vector<double> h_x(N);
    thrust::host_vector<double> h_sorted(N);

    for (int k = 0; k < N; k++) {
        h_indices_File >> h_indices[k];
        h_x_File >> h_x[k];
    }

    thrust::device_vector<int> d_indices(h_indices);
    thrust::device_vector<double> d_x(h_x);

    thrust::gather(d_indices.begin(), d_indices.end(), d_x.begin(), d_x.begin());
    h_x = d_x;
    for (int k = 0; k < N; k++) h_x_result_File << h_x[k] << "\n";

    //thrust::device_vector<double> d_x_sorted(N);
    //thrust::gather(d_indices.begin(), d_indices.end(), d_x.begin(), d_x_sorted.begin());
    //h_x = d_x_sorted;
    //for (int k = 0; k < N; k++) h_x_result_File << h_x[k] << "\n";
}
The code loads an array of indices from the file h_indices.txt and an array of doubles from the file h_x.txt. It then transfers those arrays to the GPU as d_indices and d_x and uses thrust::gather to obtain the equivalent of Matlab's d_x(d_indices).
The two txt files can be downloaded from h_indices.txt and h_x.txt. The code writes its result to the output file h_x_result.txt.
If I use the "in-place" version of thrust::gather (the last three uncommented lines of the code), the result differs from d_x(d_indices), whereas if I use the not "in-place" version (the commented-out lines at the end of the code), the result is correct.
In Matlab, I'm using
load h_indices.txt; load h_x.txt; load h_x_result.txt
plot(h_x(h_indices + 1)); hold on; plot(h_x_result, 'r'); hold off
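The "in-place" case returns the following comparison
[Figure: plot of h_x(h_indices + 1) against h_x_result, "in-place" case]
On the other hand, the not "in-place" case returns
[Figure: plot of h_x(h_indices + 1) against h_x_result, not "in-place" case]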
I'm using Windows 10, CUDA 8.0 and Visual Studio 2013, compiling in Release mode and running on an NVIDIA GTX 960 (compute capability 5.2).
Thrust's gather can't be used in place.
But I would go as far as to suggest that no "naïve" gather operation can be safely performed in-place, and that the Matlab snippet you presented as in-place (presumably d_x = d_x(d_indices)) isn't an in-place operation at all.
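To illustrate why, here is a minimal host-side sketch in plain C++ (no Thrust, no GPU). The function names gather_copy, gather_in_place_naive and gather_via_temporary are hypothetical and chosen only for this illustration; none of them is part of Thrust. The naive version overwrites elements that later iterations still need to read. On a GPU the situation is presumably worse, since the order in which threads read and write is not defined.

```cpp
#include <cstddef>
#include <vector>

// Reference (out-of-place) gather: result[i] = input[map[i]].
// Assumes every map[i] is a valid index into input.
std::vector<double> gather_copy(const std::vector<int>& map,
                                const std::vector<double>& input) {
    std::vector<double> result(map.size());
    for (std::size_t i = 0; i < map.size(); ++i) {
        result[i] = input[map[i]];
    }
    return result;
}

// Naive "in-place" gather: reads and writes the same array.
// Once x[i] has been overwritten, any later read of that slot
// sees the new value instead of the original one.
void gather_in_place_naive(const std::vector<int>& map,
                           std::vector<double>& x) {
    for (std::size_t i = 0; i < map.size(); ++i) {
        x[i] = x[map[i]];
    }
}

// In-place interface, out-of-place implementation:
// gather into a temporary, then swap it into x.
void gather_via_temporary(const std::vector<int>& map,
                          std::vector<double>& x) {
    std::vector<double> tmp = gather_copy(map, x);
    x.swap(tmp);
}
```

With map = {3, 0, 1, 2} and x = {10, 20, 30, 40}, gather_copy and gather_via_temporary both give {40, 10, 20, 30}, while gather_in_place_naive leaves x as {40, 40, 40, 40}. The commented-out lines in the question follow the same temporary-buffer pattern with a second device_vector, which is why that variant gives the expected result.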