Fast LAPACK/BLAS for matrix multiplication


Question

I'm currently exploring the Armadillo C++ library for linear algebra. As far as I understand, it uses a LAPACK/BLAS library for basic matrix operations (e.g. matrix multiplication). As a Windows user, I downloaded LAPACK/BLAS from here: http://icl.cs.utk.edu/lapack-for-windows/lapack/#running. The problem is that matrix multiplication is very slow compared to Matlab or even R. For example, Matlab multiplies two 1000x1000 matrices in ~0.15 seconds on my computer and R needs ~1 second, while C++/Armadillo/LAPACK/BLAS needs more than 10 seconds.

So Matlab is evidently based on highly optimized libraries for linear algebra. My question is whether there is a faster LAPACK/BLAS library to use from Armadillo. Alternatively, is there a way to extract Matlab's linear algebra libraries somehow and use them from C++?

Recommended answer

LAPACK doesn't do matrix multiplication. It's BLAS that provides matrix multiplication; LAPACK builds higher-level routines such as factorizations and solvers on top of BLAS.
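To make concrete what the BLAS matrix-multiplication routine (GEMM) computes, here is a minimal, unoptimised sketch of the operation C := alpha * A * B + beta * C on column-major storage. The function name `naive_gemm` is hypothetical and is not a BLAS symbol. A plain triple loop like this is for illustration only and makes no attempt at the vectorisation, cache blocking, or multi-threading that tuned BLAS implementations rely on.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference (unoptimised) version of the operation a BLAS GEMM call performs:
//   C := alpha * A * B + beta * C
// A is m x k, B is k x n, C is m x n, all stored column-major, i.e.
// element (i, j) of a matrix with r rows lives at index i + j * r.
// naive_gemm is a hypothetical name used only for this illustration.
void naive_gemm(std::size_t m, std::size_t n, std::size_t k,
                double alpha,
                const std::vector<double>& A,
                const std::vector<double>& B,
                double beta,
                std::vector<double>& C)
{
    assert(A.size() == m * k);
    assert(B.size() == k * n);
    assert(C.size() == m * n);

    for (std::size_t j = 0; j < n; ++j) {          // column of C
        for (std::size_t i = 0; i < m; ++i) {      // row of C
            double sum = 0.0;
            for (std::size_t p = 0; p < k; ++p) {  // inner dimension
                sum += A[i + p * m] * B[p + j * k];
            }
            C[i + j * m] = alpha * sum + beta * C[i + j * m];
        }
    }
}
```

Setting alpha to 1 and beta to 0 gives the plain product C = A * B, which is the case the question is timing.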

If you have a 64-bit operating system, I recommend first trying a 64-bit version of BLAS. This can give you an immediate speed-up, possibly as much as a doubling of performance.

Secondly, have a look at a high-performance implementation of BLAS, such as OpenBLAS. OpenBLAS uses both vectorisation and parallelisation (i.e. multi-core). It is a free (no cost) open source project.
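When comparing BLAS implementations, it helps to convert a measured time into a throughput figure. The sketch below uses only the C++ standard library; `time_seconds` and `gflops` are hypothetical helper names, and the 2*n^3 operation count is the usual estimate for a dense n x n by n x n product (n^3 multiplications plus roughly n^3 additions).

```cpp
#include <cassert>
#include <chrono>

// Wall-clock seconds taken by a single call to f().
// time_seconds is a hypothetical helper, not part of any library.
template <typename F>
double time_seconds(F&& f)
{
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Approximate throughput of an n x n by n x n matrix product,
// assuming about 2 * n^3 floating-point operations.
double gflops(double n, double seconds)
{
    assert(seconds > 0.0);
    return 2.0 * n * n * n / seconds / 1e9;
}
```

With the numbers from the question, a 1000x1000 product in 0.15 seconds corresponds to roughly 13 GFLOP/s, while 10 seconds corresponds to 0.2 GFLOP/s. With Armadillo, the callable passed to `time_seconds` would presumably wrap the `C = A * B` expression, so the same measurement can be repeated after switching the BLAS library that is linked in.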

Matlab internally uses the Intel MKL library, which you can also use with the Armadillo library. Intel MKL is closed source, but is free for non-commercial use. Note that OpenBLAS can obtain matrix multiplication performance that is on par with or better than Intel MKL.

Note that high-performance linear algebra is generally easier to accomplish on Linux and Mac OS X than on Windows.
