Getting started with Intel x86 SSE SIMD instructions
Question
I want to learn more about using SSE.
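Besides simply reading the Intel® 64 and IA-32 Architectures Software Developer's Manuals, what other ways are there to learn?
Mainly I'm interested in using …
Recommended answer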
First, I don't recommend using the built-in functions: they are not portable, even between compilers for the same architecture. Use intrinsics instead. GCC does a wonderful job of optimizing SSE intrinsics into even more optimized code, and you can always peek at the generated assembly to see how to use SSE to its full potential.

SSE is really powerful and, in my opinion, relatively easy to learn. See also https://stackoverflow.com/tags/sse/info for some links to guides.

Intrinsics are easy, just like normal function calls. The example below uses only a few of them; of course there are many more options:

#include <immintrin.h> // portable to all x86 compilers
int main()
{
    // _mm_set_ps takes the high element first, the opposite of C array order.
    // Use _mm_setr_ps if you want "little endian" element order in the source.
    __m128 vector1 = _mm_set_ps(4.0, 3.0, 2.0, 1.0);
    __m128 vector2 = _mm_set_ps(7.0, 8.0, 9.0, 0.0);

    __m128 sum = _mm_add_ps(vector1, vector2); // sum = vector1 + vector2

    vector1 = _mm_shuffle_ps(vector1, vector1, _MM_SHUFFLE(0,1,2,3));
    // vector1 is now (1, 2, 3, 4) (the shuffle above reversed it)

    return 0;
}
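The example above never reads its results back, so the element order can be hard to picture. As a rough sketch, the same operations can be wrapped in two helpers that store each vector to a plain float array with `_mm_storeu_ps`. The helper names `add_example` and `reverse_example` are made up for illustration and are not part of any library.

```c
#include <immintrin.h>

/* Hypothetical helper: repeats the addition from the example above and
   stores the four lanes of the sum to out[0..3] in memory (array) order. */
void add_example(float out[4])
{
    /* _mm_set_ps lists the high element first, so in array order
       vector1 is {1, 2, 3, 4} and vector2 is {0, 9, 8, 7}. */
    __m128 vector1 = _mm_set_ps(4.0f, 3.0f, 2.0f, 1.0f);
    __m128 vector2 = _mm_set_ps(7.0f, 8.0f, 9.0f, 0.0f);
    __m128 sum = _mm_add_ps(vector1, vector2);

    _mm_storeu_ps(out, sum); /* unaligned store, works for any float pointer */
}

/* Hypothetical helper: repeats the reversing shuffle from the example above. */
void reverse_example(float out[4])
{
    __m128 v = _mm_set_ps(4.0f, 3.0f, 2.0f, 1.0f); /* array order {1, 2, 3, 4} */
    v = _mm_shuffle_ps(v, v, _MM_SHUFFLE(0, 1, 2, 3));

    _mm_storeu_ps(out, v); /* array order is now {4, 3, 2, 1} */
}
```

Calling `add_example` on a four-element float array should leave {1, 11, 11, 11} in it, and `reverse_example` should leave {4, 3, 2, 1}. Read high element first, that second result is the (1, 2, 3, 4) mentioned in the comment above.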
Use _mm_load_ps or _mm_loadu_ps to load data from arrays.
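To make the difference between the two loads concrete, here is a minimal sketch of adding float arrays. `_mm_loadu_ps` accepts any address, while `_mm_load_ps` requires the pointer to be 16-byte aligned. The function names `add_arrays` and `add4_aligned` are invented for this illustration.

```c
#include <immintrin.h>
#include <stddef.h>

/* Hypothetical helper: dst[i] = a[i] + b[i] for n floats.
   Uses unaligned loads and stores, so the pointers need no special alignment. */
void add_arrays(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;

    /* Process four floats per iteration. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(dst + i, _mm_add_ps(va, vb));
    }

    /* Scalar loop for the remaining 0 to 3 elements. */
    for (; i < n; ++i) {
        dst[i] = a[i] + b[i];
    }
}

/* Hypothetical helper: adds exactly four floats using the aligned variants.
   All three pointers must be 16-byte aligned, otherwise the aligned
   load/store may fault at run time. */
void add4_aligned(float *dst, const float *a, const float *b)
{
    __m128 va = _mm_load_ps(a);
    __m128 vb = _mm_load_ps(b);
    _mm_store_ps(dst, _mm_add_ps(va, vb));
}
```

For `add4_aligned`, the caller is responsible for alignment, for example by declaring the arrays with C11 `_Alignas(16)`. When the alignment is not known, the unaligned variants are the safe choice.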