Speed up matrix multiplication by SSE (C++)

Question

I need to run a matrix-vector multiplication 240000 times per second. The matrix is 5x5 and is always the same, whereas the vector changes at each iteration. The data type is float. I was thinking of using some SSE (or similar) instructions.

1) I am concerned that the number of arithmetic operations is too small compared to the number of memory operations involved. Do you think I can get some tangible (e.g. > 20%) improvement?

2) Do I need the Intel compiler to do it?

3) Can you point out some references?

Thanks, everyone!

Recommended answer

The Eigen C++ template library for vectors, matrices, ... has both

  • optimised code for small fixed size matrices (as well as dynamically sized ones)

  • optimised code that uses SSE optimisations
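For a feel of what SSE code for this case can look like, here is a minimal hand-written sketch. It is not Eigen's implementation, and the names `PaddedMat5`, `pack` and `mul5` are made up for illustration. It assumes an x86 target and relies on the matrix being constant: the matrix is repacked once into zero-padded columns, so each product becomes five broadcast-multiply-add steps on two 4-wide registers.

```cpp
#include <xmmintrin.h>  // SSE intrinsics (_mm_load_ps, _mm_mul_ps, ...)
#include <cstddef>

// Hypothetical helper type: the constant 5x5 matrix stored column by column,
// each column zero-padded to 8 floats so it fills two 4-wide SSE registers.
struct PaddedMat5 {
    alignas(16) float col[5][8];
};

// Build the padded, column-major copy once from a row-major 5x5 array.
inline PaddedMat5 pack(const float (&a)[5][5]) {
    PaddedMat5 m{};  // value-initialised, so the padding lanes are zero
    for (std::size_t r = 0; r < 5; ++r)
        for (std::size_t c = 0; c < 5; ++c)
            m.col[c][r] = a[r][c];
    return m;
}

// y = A * x. x points to 5 floats; y must have room for 8 floats,
// of which only y[0..4] are meaningful (y[5..7] receive zeros).
inline void mul5(const PaddedMat5& m, const float* x, float* y) {
    __m128 lo = _mm_setzero_ps();  // accumulates rows 0..3
    __m128 hi = _mm_setzero_ps();  // accumulates row 4 plus padding
    for (std::size_t c = 0; c < 5; ++c) {
        const __m128 xc = _mm_set1_ps(x[c]);  // broadcast x[c] to all lanes
        lo = _mm_add_ps(lo, _mm_mul_ps(_mm_load_ps(&m.col[c][0]), xc));
        hi = _mm_add_ps(hi, _mm_mul_ps(_mm_load_ps(&m.col[c][4]), xc));
    }
    _mm_storeu_ps(y, lo);
    _mm_storeu_ps(y + 4, hi);
}
```

Call `pack` once at start-up and `mul5` in the loop. These intrinsics ship with GCC, Clang and MSVC as well, so the Intel compiler is not required for this style of code. Whether it clears the 20% mark is something only a benchmark against the plain scalar loop on the target machine can tell.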
