Compiling numpy with OpenBLAS integration

Question

I am trying to install numpy with OpenBLAS; however, I am at a loss as to how the site.cfg file needs to be written.

I followed the installation procedure and it completed without errors; however, performance degrades when the number of threads used by OpenBLAS (controlled by the environment variable OMP_NUM_THREADS) is increased from 1.

I am not sure whether the OpenBLAS integration is working correctly. Could anyone provide a site.cfg file that achieves this?

P.S.: On the same machine, OpenBLAS integration in other Python-based toolkits such as Theano gives a substantial performance boost as the number of threads increases.

Recommended answer

I just compiled numpy inside a virtualenv with OpenBLAS integration, and it seems to be working OK.

Here's my procedure:

  1. Compile OpenBLAS:

$ git clone https://github.com/xianyi/OpenBLAS
$ cd OpenBLAS && make FC=gfortran
$ sudo make PREFIX=/opt/OpenBLAS install

If you don't have admin rights you could set PREFIX= to a directory where you have write privileges (just modify the corresponding steps below accordingly).

Make sure that the directory containing libopenblas.so is in your shared library search path.

  • To do this locally, you could edit your ~/.bashrc file to contain the line

export LD_LIBRARY_PATH=/opt/OpenBLAS/lib:$LD_LIBRARY_PATH

The LD_LIBRARY_PATH environment variable will be updated when you start a new terminal session (use $ source ~/.bashrc to force an update within the same session).

Another option that will work for multiple users is to create a .conf file in /etc/ld.so.conf.d/ containing the line /opt/OpenBLAS/lib, e.g.:

$ sudo sh -c "echo '/opt/OpenBLAS/lib' > /etc/ld.so.conf.d/openblas.conf"

Once you are done with either option, run

$ sudo ldconfig
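
Before building numpy, it can help to confirm that the dynamic linker actually resolves libopenblas. The following is a minimal sketch using only the Python standard library; the helper names `find_openblas` and `can_load` are hypothetical and not part of OpenBLAS or numpy.

```python
import ctypes
import ctypes.util


def find_openblas():
    """Return the library name the system resolves for 'openblas', or None."""
    return ctypes.util.find_library("openblas")


def can_load(name):
    """Return True if the shared library `name` can be loaded, else False."""
    try:
        ctypes.CDLL(name)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    name = find_openblas()
    if name is None:
        print("libopenblas was not found on the shared library search path")
    else:
        print(name, "loadable:", can_load(name))
```

If this reports that the library was not found, the LD_LIBRARY_PATH or ld.so.conf.d step above probably did not take effect in the current session.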

  • Grab the numpy source code:

    $ git clone https://github.com/numpy/numpy
    $ cd numpy
    

  • Copy site.cfg.example to site.cfg and edit the copy:

    $ cp site.cfg.example site.cfg
    $ nano site.cfg
    

    Uncomment these lines:

    ....
    [openblas]
    libraries = openblas
    library_dirs = /opt/OpenBLAS/lib
    include_dirs = /opt/OpenBLAS/include
    ....
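
    If OpenBLAS was installed under a different PREFIX (for example, without admin rights), the paths in the [openblas] section have to match it. As a rough sketch, the section can be generated with Python's standard configparser module and pasted into site.cfg; `openblas_section` is a hypothetical helper written for this illustration, not something numpy provides.

```python
import configparser
import io


def openblas_section(prefix="/opt/OpenBLAS"):
    """Return the text of an [openblas] section for an install under `prefix`."""
    cfg = configparser.ConfigParser()
    cfg["openblas"] = {
        "libraries": "openblas",
        "library_dirs": f"{prefix}/lib",
        "include_dirs": f"{prefix}/include",
    }
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    # Print the section for the default prefix used in this answer.
    print(openblas_section())
```

    Calling `openblas_section("/home/me/OpenBLAS")` (an example path) would produce the same three keys pointing at that directory instead.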
    

  • Check the configuration, build and install (optionally inside a virtualenv)

    $ python setup.py config
    

    The output should look something like this:

    ...
    openblas_info:
      FOUND:
        libraries = ['openblas', 'openblas']
        library_dirs = ['/opt/OpenBLAS/lib']
        language = c
        define_macros = [('HAVE_CBLAS', None)]
    
      FOUND:
        libraries = ['openblas', 'openblas']
        library_dirs = ['/opt/OpenBLAS/lib']
        language = c
        define_macros = [('HAVE_CBLAS', None)]
    ...
    

    Installing with pip is preferable to using python setup.py install, since pip will keep track of the package metadata and allow you to easily uninstall or upgrade numpy in the future.

    $ pip install .
    

  • Optional: you can use this script to test performance for different thread counts.

    $ OMP_NUM_THREADS=1 python build/test_numpy.py
    
    version: 1.10.0.dev0+8e026a2
    maxint:  9223372036854775807
    
    BLAS info:
     * libraries ['openblas', 'openblas']
     * library_dirs ['/opt/OpenBLAS/lib']
     * define_macros [('HAVE_CBLAS', None)]
     * language c
    
    dot: 0.099796795845 sec
    
    $ OMP_NUM_THREADS=8 python build/test_numpy.py
    
    version: 1.10.0.dev0+8e026a2
    maxint:  9223372036854775807
    
    BLAS info:
     * libraries ['openblas', 'openblas']
     * library_dirs ['/opt/OpenBLAS/lib']
     * define_macros [('HAVE_CBLAS', None)]
     * language c
    
    dot: 0.0439578056335 sec
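
    The script referred to above is not reproduced on this page. A minimal sketch of a comparable benchmark, which prints the numpy version, the build configuration and the time for one matrix product, might look like the following. The matrix size, repeat count and the function name `time_dot` are assumptions made for illustration, not the original script.

```python
import sys
import time

import numpy as np


def time_dot(n=1000, repeats=3, seed=0):
    """Return the best wall-clock time in seconds for an n x n matrix product."""
    rng = np.random.RandomState(seed)
    a = rng.rand(n, n)
    b = rng.rand(n, n)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        np.dot(a, b)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    print("version:", np.__version__)
    print("maxint: ", sys.maxsize)
    # Prints the BLAS/LAPACK libraries numpy was built against.
    np.show_config()
    print("dot: %.6f sec" % time_dot())
```

    Saved to a file, it can be run with different thread counts in the same way as above, for example with OMP_NUM_THREADS=1 and then OMP_NUM_THREADS=8.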
    

    There seems to be a noticeable improvement in performance for higher thread counts. However, I haven't tested this very systematically, and it's likely that for smaller matrices the additional overhead would outweigh the performance benefit from a higher thread count.
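
    To test this more systematically, one option is to time several matrix sizes under several thread counts. Because OMP_NUM_THREADS is read when the library is loaded, the sketch below starts a fresh Python process for each measurement. The sizes, thread counts and helper names (`env_with_threads`, `time_in_child`) are assumptions chosen for illustration.

```python
import os
import subprocess
import sys

# Code run in each child process: time one n x n matrix product.
CHILD = (
    "import sys, time, numpy as np\n"
    "n = int(sys.argv[1])\n"
    "a = np.random.rand(n, n); b = np.random.rand(n, n)\n"
    "t = time.perf_counter(); np.dot(a, b)\n"
    "print(time.perf_counter() - t)\n"
)


def env_with_threads(threads, base=None):
    """Return a copy of the environment with OMP_NUM_THREADS set."""
    env = dict(os.environ if base is None else base)
    env["OMP_NUM_THREADS"] = str(threads)
    return env


def time_in_child(n, threads):
    """Time an n x n dot product in a new process using `threads` threads."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD, str(n)],
        env=env_with_threads(threads),
        check=True,
        capture_output=True,
        text=True,
    )
    return float(result.stdout.strip())


if __name__ == "__main__":
    for n in (200, 1000):
        for threads in (1, 2):
            print("n=%d threads=%d: %.6f sec" % (n, threads, time_in_child(n, threads)))
```

    A single timing per configuration is noisy, so repeating each measurement and taking the minimum would give a more reliable comparison.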
