Converting Octave to Use CuBLAS


Problem description


I'd like to convert Octave to use CuBLAS for matrix multiplication. This video seems to indicate this is as simple as typing 28 characters:

Using CUDA Library to Accelerate Applications

In practice it's a bit more complex than this. Does anyone know what additional work must be done to make the modifications made in this video compile?

UPDATE

Here's the method I'm trying

In dMatrix.cc, add:

#include <cublas.h>

In dMatrix.cc, change all occurrences (preserving case) of

dgemm

to

cublas_dgemm

In my build terminal, set:

export CC=nvcc
export CFLAGS="-lcublas -lcudart"
export CPPFLAGS="-I/usr/local/cuda/include"
export LDFLAGS="-L/usr/local/cuda/lib64"

The error I receive is:

libtool: link: g++ -I/usr/include/freetype2 -Wall -W -Wshadow -Wold-style-cast 
-Wformat -Wpointer-arith -Wwrite-strings -Wcast-align -Wcast-qual -g -O2
-o .libs/octave octave-main.o  -L/usr/local/cuda/lib64 
../libgui/.libs/liboctgui.so ../libinterp/.libs/liboctinterp.so 
../liboctave/.libs/liboctave.so -lutil -lm -lpthread -Wl,-rpath
-Wl,/usr/local/lib/octave/3.7.5

../liboctave/.libs/liboctave.so: undefined reference to `cublas_dgemm_'

Solution

EDIT2: The method described in this video requires the Fortran "thunking library" bindings for cublas. These steps worked for me:

  1. Download octave 3.6.3 from here:

    wget ftp://ftp.gnu.org/gnu/octave/octave-3.6.3.tar.gz
    

  2. extract all files from the archive:

    tar -xzvf octave-3.6.3.tar.gz
    

  3. change into the octave directory just created:

    cd octave-3.6.3
    

  4. make a directory for your "thunking cublas library"

    mkdir mycublas
    

  5. change into that directory

    cd mycublas
    

  6. build the "thunking cublas library"

    g++ -c -fPIC -I/usr/local/cuda/include -I/usr/local/cuda/src -DCUBLAS_GFORTRAN -o fortran_thunking.o /usr/local/cuda/src/fortran_thunking.c
    ar rvs libmycublas.a fortran_thunking.o
    

  7. switch back to the main build directory

    cd ..
    

  8. run octave's configure with additional options:

    ./configure --disable-docs LDFLAGS="-L/usr/local/cuda/lib64 -lcublas -lcudart -L/home/user2/octave/octave-3.6.3/mycublas -lmycublas"
    

    Note that in the above command line you will need to change the directory given to the second -L switch so that it matches the path to the mycublas directory you created in step 4.

  9. Now edit octave-3.6.3/liboctave/dMatrix.cc according to the instructions given in the video. It should be sufficient to replace every instance of dgemm with cublas_dgemm and every instance of DGEMM with CUBLAS_DGEMM. In the octave 3.6.3 version I used, there were 3 such instances of each (lower case and upper case).

  10. Now you can build octave:

    make
    

    (make sure you are in the octave-3.6.3 directory)
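The rename in step 9 can also be scripted instead of done by hand. The following is a minimal sketch, not part of the original answer: `patch_dgemm` is a hypothetical helper name, and it assumes that every occurrence of `dgemm` and `DGEMM` in the file should be renamed, as step 9 describes.

```shell
#!/bin/sh
# Sketch of step 9: rename the BLAS dgemm references so they resolve to the
# thunking wrappers. patch_dgemm is a hypothetical helper name.
patch_dgemm() {
    src="$1"
    # Refuse to patch twice; a second run would produce cublas_cublas_dgemm.
    if grep -q 'cublas_dgemm' "$src"; then
        echo "already patched: $src" >&2
        return 1
    fi
    cp "$src" "$src.orig"    # keep a backup of the unmodified file
    sed -e 's/dgemm/cublas_dgemm/g' -e 's/DGEMM/CUBLAS_DGEMM/g' \
        "$src.orig" > "$src"
    # Report how many lines now mention each spelling (lines, not instances).
    echo "lower: $(grep -c 'cublas_dgemm' "$src")"
    echo "upper: $(grep -c 'CUBLAS_DGEMM' "$src")"
}
```

From the octave-3.6.3 directory this would be called as `patch_dgemm liboctave/dMatrix.cc`. The backup copy `dMatrix.cc.orig` makes it easy to diff the result against the original before running make.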

At this point, for me, Octave built successfully. I did not pursue make install although I assume that would work. I simply ran octave using the ./run-octave script in the octave-3.6.3 directory.
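If the link step still reports an undefined reference to `cublas_dgemm_`, one way to narrow it down is to check whether the thunking archive actually defines that symbol. This is only a sketch: `defines_symbol` is a hypothetical helper, and it assumes the usual `nm` output format in which defined code symbols are marked with `T`.

```shell
#!/bin/sh
# defines_symbol is a hypothetical helper: it reads `nm` output on stdin and
# succeeds when the given name appears as a defined text (T) symbol.
defines_symbol() {
    grep -Eq "[[:space:]]T[[:space:]]+$1\$"
}
```

A possible use from the octave-3.6.3 directory is `nm mycublas/libmycublas.a | defines_symbol cublas_dgemm_ && echo found`. If nothing is printed, the archive was probably built without `-DCUBLAS_GFORTRAN` or is not on the link line.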

The above steps assume a proper and standard CUDA 5.0 install. I will try to respond to CUDA-specific questions or issues, but there are any number of problems that may arise with a general Octave install on your platform. I'm not an octave expert and I won't be able to respond to those. I used CentOS 6.2 for this test.

This method, as indicated, involves modification of the C++ source files of octave.

Another method was covered in some detail in the S3527 session at the GTC 2013 GPU Tech Conference. This session was actually a hands-on laboratory exercise. Unfortunately, the materials from that session are not conveniently available. However, the method there did not involve any modification of the GNU Octave source; instead it used the LD_PRELOAD capability of Linux to intercept the BLAS library calls and redirect the appropriate ones to the cublas library.

A newer, better method (using the NVBLAS intercept library) is discussed in this blog article.
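As a rough sketch of what the NVBLAS route might look like: NVBLAS reads a small configuration file and is loaded ahead of the CPU BLAS with LD_PRELOAD, so Octave itself is left unmodified. The helper name `write_nvblas_conf` and the CPU BLAS path used in the usage line are assumptions for illustration; the configuration keys are the ones NVBLAS documents, but check them against the documentation for your CUDA version.

```shell
#!/bin/sh
# write_nvblas_conf is a hypothetical helper that writes a minimal NVBLAS
# configuration file.
#   $1: path to the CPU BLAS library NVBLAS should fall back to
#   $2: path of the configuration file to create
write_nvblas_conf() {
    cpu_blas="$1"
    conf="$2"
    {
        echo "NVBLAS_LOGFILE nvblas.log"        # where NVBLAS writes its log
        echo "NVBLAS_CPU_BLAS_LIB $cpu_blas"    # fallback for calls not offloaded
        echo "NVBLAS_GPU_LIST ALL"              # use every visible GPU
    } > "$conf"
}
```

Assuming a CPU BLAS at `/usr/lib64/libblas.so` (adjust to your system), this could be used as `write_nvblas_conf /usr/lib64/libblas.so nvblas.conf`, followed by `NVBLAS_CONFIG_FILE=$PWD/nvblas.conf LD_PRELOAD=libnvblas.so octave`.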
