Performance of coefficient-wise array operations in the Eigen library with the MKL backend

Problem description

I am porting a MATLAB algorithm with many coefficient-wise array operations to C++. The operations look like this example, but are often much more complex:

    // Each array is sized to match the six values it receives.
    Eigen::Array<double, Eigen::Dynamic, 1> tx2(6);
    tx2 << 1, 2, 3, 4, 5, 6;
    Eigen::Array<double, Eigen::Dynamic, 1> tx1(6);
    tx1 << 7, 8, 9, 10, 11, 12;
    Eigen::Array<double, Eigen::Dynamic, 1> x = (tx1 + tx2) / 2;

The C++ code turned out to be noticeably slower than MATLAB (around 20%). As a next step I tried turning on Eigen's Intel MKL backend, which made no difference at all to performance. Is it possible that MKL does not improve coefficient-wise vector operations? Is there a way to test whether I linked MKL successfully? Are there faster alternatives to Eigen's vector classes? Thanks in advance!

I'm using VS 2013 on an i7-3820 running Windows 7 64-bit. A longer example would be:

    Array<double, Dynamic, 1> ts = (k2 / (6 * b.pow(3)) + k / b - b / 2) - (k2 / (6 * a.pow(3)) + k / a - a / 2);
    Array<double, Dynamic, 1> tp1 = -2 * r2*(b - a)/ (rp.pow(2));
    Array<double, Dynamic, 1> tp2 = -2 * r2*rp*log(b / a) / rm2;
    Array<double, Dynamic, 1> tp3 = r2*(b.pow(-1) - a.pow (-1)) / 2;
    Array<double, Dynamic, 1> tp4 = 16 * r2.pow(2)*(r2.pow(2) + 1)*log((2 * rp*b - rm2) / (2 * rp*a - rm2)) / (rp.pow(3)*rm2);
    Array<double, Dynamic, 1> tp5 = 16 * r2.pow(3)*((2 * rp*b - rm2).pow(-1) - (2 * rp*a - rm2).pow(-1)) / rp.pow(3);
    Array<double, Dynamic, 1> tp = tp1 + tp2 + tp3 + tp4 + tp5;
    Array<double, Dynamic, 1> f = (ts + tp) / (2 * ds*ds);

Relevant part of CMakeLists:

    set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${OpenMP_C_FLAGS}")
    set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${OpenMP_CXX_FLAGS}")
    target_link_libraries(MK ${VTK_LIBRARIES} ${Boost_LIBRARIES} mkl_intel_lp64_dll.lib mkl_intel_thread_dll.lib mkl_core_dll.lib libiomp5md.lib)

So far I have only defined EIGEN_USE_MKL_ALL.

Recommended answer

Replace calls to pow(2), pow(3), and the like with square() and cube(). The same goes for pow(-1), which is better replaced by a division. MATLAB presumably performs these kinds of optimizations for you, but in C++ such compile-time optimizations would only be possible by working at the compiler level.
