Compiling numpy with OpenBLAS integration

Question

I am trying to install numpy with OpenBLAS, but I am at a loss as to how the site.cfg file needs to be written.

When I followed the installation procedure, the installation completed without errors. However, performance degrades when the number of threads used by OpenBLAS (controlled by the environment variable OMP_NUM_THREADS) is increased from 1.

I am not sure whether the OpenBLAS integration is working correctly. Could anyone provide a site.cfg file that achieves this?

P.S.: On the same machine, OpenBLAS integration in other toolkits such as Theano, which is based on Python, provides a substantial performance boost when the number of threads is increased.

Recommended answer

I just compiled numpy inside a virtualenv with OpenBLAS integration, and it seems to be working OK.

This was my process:

  1. Compile OpenBLAS:

$ git clone https://github.com/xianyi/OpenBLAS
$ cd OpenBLAS && make FC=gfortran
$ sudo make PREFIX=/opt/OpenBLAS install

If you don't have admin rights, you could set PREFIX= to a directory where you have write privileges (just modify the corresponding steps below accordingly).

Make sure that the directory containing libopenblas.so is in your shared library search path.

  • To do this locally, you could edit your ~/.bashrc file to contain the line

export LD_LIBRARY_PATH=/opt/OpenBLAS/lib:$LD_LIBRARY_PATH

The LD_LIBRARY_PATH environment variable will be updated when you start a new terminal session (use $ source ~/.bashrc to force an update within the same session).

Another option that will work for multiple users is to create a .conf file in /etc/ld.so.conf.d/ containing the line /opt/OpenBLAS/lib, e.g.:

$ sudo sh -c "echo '/opt/OpenBLAS/lib' > /etc/ld.so.conf.d/openblas.conf"

Once you are done with either option, run

$ sudo ldconfig
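
As a quick sanity check on the search-path step, the following minimal Python sketch asks the standard library whether a library named openblas can be located. It only uses ctypes.util.find_library; the helper name find_openblas is hypothetical, and the lookup rules differ between platforms, so treat a None result as a hint to re-check the steps above, not as proof that something is broken.

```python
import ctypes.util


def find_openblas(name="openblas"):
    """Return the library file name resolved for `name`, or None if not found."""
    # find_library expects the name without the "lib" prefix or ".so" suffix.
    return ctypes.util.find_library(name)


if __name__ == "__main__":
    found = find_openblas()
    if found is None:
        print("libopenblas was not found on the library search path")
    else:
        print("found:", found)
```
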

  • Grab the numpy source code:

    $ git clone https://github.com/numpy/numpy
    $ cd numpy
    

  • Copy site.cfg.example to site.cfg and edit the copy:

    $ cp site.cfg.example site.cfg
    $ nano site.cfg
    

    Uncomment these lines:

    ....
    [openblas]
    libraries = openblas
    library_dirs = /opt/OpenBLAS/lib
    include_dirs = /opt/OpenBLAS/include
    ....
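
Before building, it can help to confirm that the edited file really contains an active [openblas] section and that the directories it names exist. The sketch below is one way to do that with Python's configparser; the function names read_openblas_section and missing_dirs are hypothetical, and it assumes that multiple directories in one option are separated by os.pathsep.

```python
import configparser
import os


def read_openblas_section(cfg_text):
    """Parse site.cfg-style text and return the [openblas] options as a dict."""
    # Interpolation is disabled so that a literal '%' in a path is left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    if not parser.has_section("openblas"):
        return {}
    return dict(parser["openblas"])


def missing_dirs(section):
    """List configured library/include directories that do not exist on disk."""
    dirs = []
    for key in ("library_dirs", "include_dirs"):
        # Assumption: several directories may be joined with os.pathsep.
        dirs.extend(p for p in section.get(key, "").split(os.pathsep) if p)
    return [d for d in dirs if not os.path.isdir(d)]


if __name__ == "__main__":
    sample = (
        "[openblas]\n"
        "libraries = openblas\n"
        "library_dirs = /opt/OpenBLAS/lib\n"
        "include_dirs = /opt/OpenBLAS/include\n"
    )
    section = read_openblas_section(sample)
    print("options:", section)
    print("missing directories:", missing_dirs(section))
```

To check a real file, pass the result of open("site.cfg").read() instead of the sample string. An empty dict suggests the section is still commented out.
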
    

  • Check configuration, build, install (optionally inside a virtualenv)

    $ python setup.py config
    

    The output should look something like this:

    ...
    openblas_info:
      FOUND:
        libraries = ['openblas', 'openblas']
        library_dirs = ['/opt/OpenBLAS/lib']
        language = c
        define_macros = [('HAVE_CBLAS', None)]
    
      FOUND:
        libraries = ['openblas', 'openblas']
        library_dirs = ['/opt/OpenBLAS/lib']
        language = c
        define_macros = [('HAVE_CBLAS', None)]
    ...
    

    Installing with pip is preferable to using python setup.py install, since pip will keep track of the package metadata and allow you to easily uninstall or upgrade numpy in the future.

    $ pip install .
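
After installing, one way to confirm that the installed numpy was built against OpenBLAS is to inspect what numpy.show_config() prints. The sketch below captures that output and looks for the string "openblas"; the helper names are hypothetical, and the substring check is only a rough heuristic since the exact output format varies between numpy versions. It is best run from outside the numpy source directory, where importing numpy may fail.

```python
import contextlib
import io


def mentions_openblas(config_text):
    """Return True if build-configuration text refers to OpenBLAS."""
    return "openblas" in config_text.lower()


def numpy_config_text():
    """Capture what numpy.show_config() prints and return it as a string."""
    import numpy as np  # imported here so the helper above works without numpy

    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        np.show_config()
    return buffer.getvalue()


if __name__ == "__main__":
    try:
        text = numpy_config_text()
    except ImportError:
        print("numpy is not importable in this environment")
    else:
        print("OpenBLAS mentioned in build config:", mentions_openblas(text))
```
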
    

  • Optional: you can use this script to test performance for different thread counts.

    $ OMP_NUM_THREADS=1 python build/test_numpy.py
    
    version: 1.10.0.dev0+8e026a2
    maxint:  9223372036854775807
    
    BLAS info:
     * libraries ['openblas', 'openblas']
     * library_dirs ['/opt/OpenBLAS/lib']
     * define_macros [('HAVE_CBLAS', None)]
     * language c
    
    dot: 0.099796795845 sec
    
    $ OMP_NUM_THREADS=8 python build/test_numpy.py
    
    version: 1.10.0.dev0+8e026a2
    maxint:  9223372036854775807
    
    BLAS info:
     * libraries ['openblas', 'openblas']
     * library_dirs ['/opt/OpenBLAS/lib']
     * define_macros [('HAVE_CBLAS', None)]
     * language c
    
    dot: 0.0439578056335 sec
    

There seems to be a noticeable improvement in performance for higher thread counts. However, I haven't tested this very systematically, and it's likely that for smaller matrices the additional overhead would outweigh the performance benefit from a higher thread count.
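
To compare thread counts a little more systematically, a small driver along the following lines could be used. It is a sketch, not the test script referenced above: the matrix size, the timing method, and the names build_env and time_dot are all illustrative assumptions. Each measurement runs in a fresh interpreter because, as far as I understand, OpenBLAS picks up OMP_NUM_THREADS when the library is loaded, so changing it after numpy has been imported may have no effect.

```python
import os
import subprocess
import sys

# Child snippet that times one matrix product. The 1000x1000 size is an
# illustrative choice and is not taken from the original test script.
CHILD_CODE = """
import time
import numpy as np
a = np.random.rand(1000, 1000)
b = np.random.rand(1000, 1000)
start = time.perf_counter()
a.dot(b)
print(time.perf_counter() - start)
"""


def build_env(threads, base_env=None):
    """Return a copy of the environment with OMP_NUM_THREADS set to `threads`."""
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_NUM_THREADS"] = str(threads)
    return env


def time_dot(threads):
    """Run the child snippet in a fresh interpreter; return elapsed seconds."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        env=build_env(threads),
        capture_output=True,
        text=True,
        check=True,
    )
    return float(result.stdout.strip())


if __name__ == "__main__":
    for threads in (1, 2, 4, 8):
        try:
            elapsed = time_dot(threads)
        except subprocess.CalledProcessError:
            print("child process failed (is numpy importable?)")
            break
        print(f"OMP_NUM_THREADS={threads}: dot took {elapsed:.4f} sec")
```

A single run per thread count is noisy. Repeating each measurement and varying the matrix size would show where the threading overhead starts to dominate.
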
