How to calculate vector dot product using SSE intrinsic functions in C


Question

I am trying to multiply two vectors element-wise, so that each element of one vector is multiplied by the element at the same index in the other vector. I then want to sum all the elements of the resulting vector to obtain a single number. For instance, for the vectors {1,2,3,4} and {5,6,7,8} the calculation would look like this:

1 * 5 + 2 * 6 + 3 * 7 + 4 * 8


Essentially, I am taking the dot product of the two vectors. I know there is an SSE instruction that does this, but the instruction doesn't have an intrinsic function associated with it. At this point, I don't want to write inline assembly in my C code, so I want to use only intrinsic functions. This seems like a common calculation, so I am surprised that I couldn't find the answer on Google.

Note: I am optimizing for a specific microarchitecture which supports up to SSE 4.2.

Thanks for your help.

Recommended answer

GCC (at least version 4.3) includes <smmintrin.h> with SSE4.1-level intrinsics, including the single- and double-precision dot products:

__m128  _mm_dp_ps (__m128 __X, __m128 __Y, const int __M);
__m128d _mm_dp_pd (__m128d __X, __m128d __Y, const int __M);
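As a rough illustration of how _mm_dp_ps might be called for the four-element case in the question, here is a minimal sketch. The helper name dot4_sse41 is hypothetical, and the target attribute is a GCC/Clang extension used here only so the function compiles without passing -msse4.1 on the command line; both are assumptions, not part of the original answer.

```c
#include <smmintrin.h>  /* SSE4.1: _mm_dp_ps */

/* Hypothetical helper: dot product of two 4-float vectors via _mm_dp_ps.
   Assumes GCC or Clang (target attribute) and a CPU with SSE4.1. */
__attribute__((target("sse4.1")))
float dot4_sse41(const float *a, const float *b)
{
    __m128 va = _mm_loadu_ps(a);   /* unaligned load of a[0..3] */
    __m128 vb = _mm_loadu_ps(b);   /* unaligned load of b[0..3] */

    /* Mask 0xF1: the high nibble (0xF) includes all four products in
       the sum; the low nibble (0x1) writes the sum to lane 0 only. */
    __m128 dp = _mm_dp_ps(va, vb, 0xF1);

    return _mm_cvtss_f32(dp);      /* extract lane 0 as a float */
}
```

For the vectors {1,2,3,4} and {5,6,7,8} from the question, this should return 70. The mask is the part most worth checking against the intrinsic's documentation, since a wrong high nibble silently drops products from the sum.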


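As a fallback for older processors, you can use this algorithm to compute the dot product of the vectors a and b (note that _mm_hadd_ps requires SSE3 and is declared in <pmmintrin.h>):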

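__m128 r1 = _mm_mul_ps(a, b);      /* element-wise products */
__m128 r2 = _mm_hadd_ps(r1, r1);   /* pairwise sums of adjacent lanes */
__m128 r3 = _mm_hadd_ps(r2, r2);   /* full sum, replicated in every lane */
_mm_store_ss(&result, r3);         /* write lane 0 to the float result */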
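The multiply-then-horizontal-add steps above can also be applied to arrays longer than four elements. The following is a sketch under stated assumptions rather than part of the original answer: the function name dot_sse3 is hypothetical, the target attribute is a GCC/Clang extension, and unaligned loads are used so that no alignment requirement is placed on the inputs.

```c
#include <stddef.h>
#include <pmmintrin.h>  /* SSE3: _mm_hadd_ps */

/* Hypothetical helper: dot product of two float arrays of length n.
   Assumes GCC or Clang (target attribute) and a CPU with SSE3. */
__attribute__((target("sse3")))
float dot_sse3(const float *a, const float *b, size_t n)
{
    __m128 acc = _mm_setzero_ps();  /* four running partial sums */
    size_t i = 0;

    /* Process four elements per iteration. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        acc = _mm_add_ps(acc, _mm_mul_ps(va, vb));
    }

    /* Two horizontal adds reduce the four partial sums to one value. */
    acc = _mm_hadd_ps(acc, acc);
    acc = _mm_hadd_ps(acc, acc);
    float result = _mm_cvtss_f32(acc);

    /* Scalar tail for the remaining 0 to 3 elements. */
    for (; i < n; ++i)
        result += a[i] * b[i];

    return result;
}
```

Keeping the horizontal adds outside the loop means the reduction is done once per call instead of once per group of four elements. Note that the summation order differs from a plain scalar loop, so results may differ slightly in the last bits for general floating-point inputs.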
