cudaFree is not freeing memory

Views: 698  Published: 2017/3/4 16:25:58  Tags: memory cuda free

Problem description

The code below calculates the dot product of two vectors a and b. The correct result is 8192. The first time I run it, the result is correct. The second time I run it, the result is the previous result + 8192, and so on:

1st iteration: result = 8192
2nd iteration: result = 8192 + 8192
3rd iteration: result = 8192 + 8192 + 8192
and so on.

I checked by printing it on screen, and the device variable dev_c is not freed. What's more, writing to it behaves like a sum: the result is the previous value plus the new one being written to it. I guess that could have something to do with the atomicAdd() operation, but cudaFree(dev_c) should erase it regardless.

#include <stdio.h>
#include <stdlib.h>

#define N 8192
#define THREADS_PER_BLOCK 512
#define NUMBER_OF_BLOCKS (N/THREADS_PER_BLOCK)

// Each block writes its element-wise products into shared memory;
// thread 0 then sums them and adds the block total to *c.
__global__ void dot( int *a, int *b, int *c ) {

    __shared__ int temp[THREADS_PER_BLOCK];

    int index = threadIdx.x + blockIdx.x * blockDim.x;

    temp[threadIdx.x] = a[index] * b[index];

    // Wait until every thread in the block has stored its product.
    __syncthreads();

    if( 0 == threadIdx.x ) {
        int sum = 0;
        for( int i = 0; i < THREADS_PER_BLOCK; i++ ) {
            sum += temp[i];
        }
        atomicAdd(c, sum);
    }
}

int main( void ) {

    int *a, *b, *c;              // host copies
    int *dev_a, *dev_b, *dev_c;  // device copies
    int size = N * sizeof(int);

    cudaMalloc( (void**)&dev_a, size );
    cudaMalloc( (void**)&dev_b, size );
    cudaMalloc( (void**)&dev_c, sizeof(int) );

    a = (int*)malloc(size);
    b = (int*)malloc(size);
    c = (int*)malloc(sizeof(int));

    for( int i = 0; i < N; i++ ) {
        a[i] = 1;
        b[i] = 1;
    }

    cudaMemcpy( dev_a, a, size, cudaMemcpyHostToDevice );
    cudaMemcpy( dev_b, b, size, cudaMemcpyHostToDevice );

    dot<<< NUMBER_OF_BLOCKS, THREADS_PER_BLOCK >>>( dev_a, dev_b, dev_c );

    cudaMemcpy( c, dev_c, sizeof(int), cudaMemcpyDeviceToHost );

    printf("Dot product = %d\n", *c);

    cudaFree(dev_a);
    cudaFree(dev_b);
    cudaFree(dev_c);

    free(a);
    free(b);
    free(c);

    return 0;
}
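
A likely explanation, offered as a sketch rather than a confirmed answer: cudaMalloc does not initialize the memory it returns, and cudaFree only releases the allocation without clearing its contents. If a later run receives the same device address for dev_c, the old value can still be there, and atomicAdd then accumulates on top of it. Zeroing the accumulator with cudaMemset before the kernel launch should avoid this. In the sketch below, the helper dot_on_device and its parameter n are hypothetical names introduced for illustration; the kernel is the one from the question.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

#define THREADS_PER_BLOCK 512

// Kernel from the question: per-block sum in shared memory, then one
// atomicAdd per block into the single-int accumulator *c.
__global__ void dot(const int *a, const int *b, int *c) {
    __shared__ int temp[THREADS_PER_BLOCK];
    int index = threadIdx.x + blockIdx.x * blockDim.x;
    temp[threadIdx.x] = a[index] * b[index];
    __syncthreads();
    if (threadIdx.x == 0) {
        int sum = 0;
        for (int i = 0; i < THREADS_PER_BLOCK; i++) {
            sum += temp[i];
        }
        atomicAdd(c, sum);
    }
}

// Hypothetical helper (not in the question): dot product of two host arrays.
// Assumes n is a positive multiple of THREADS_PER_BLOCK; error checking of
// the CUDA calls is omitted to keep the sketch short.
int dot_on_device(const int *a, const int *b, int n) {
    int *dev_a, *dev_b, *dev_c;
    int result = 0;
    size_t size = (size_t)n * sizeof(int);

    cudaMalloc((void**)&dev_a, size);
    cudaMalloc((void**)&dev_b, size);
    cudaMalloc((void**)&dev_c, sizeof(int));

    cudaMemcpy(dev_a, a, size, cudaMemcpyHostToDevice);
    cudaMemcpy(dev_b, b, size, cudaMemcpyHostToDevice);

    // The change relative to the question: start the accumulator at zero,
    // because freshly allocated device memory has unspecified contents.
    cudaMemset(dev_c, 0, sizeof(int));

    dot<<<n / THREADS_PER_BLOCK, THREADS_PER_BLOCK>>>(dev_a, dev_b, dev_c);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(&result, dev_c, sizeof(int), cudaMemcpyDeviceToHost);

    cudaFree(dev_a);
    cudaFree(dev_b);
    cudaFree(dev_c);
    return result;
}
```

With two host arrays of 8192 ones, as in the question, repeated calls to dot_on_device(a, b, 8192) should each return 8192 instead of a growing total. Copying a zero-valued host int to dev_c with cudaMemcpy before the launch would serve the same purpose.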
