Getting started with Intel x86 SSE SIMD instructions

Question

I want to learn more about using SSE.

What ways are there to learn, besides the obvious one of reading the Intel® 64 and IA-32 Architectures Software Developer's Manuals?

I'm mainly interested in working with the GCC x86 built-in functions.

Answer

First, I don't recommend using the built-in functions: they are not portable, even across compilers for the same architecture.

Use intrinsics instead. GCC does a wonderful job of optimizing SSE intrinsics into even more efficient code. You can always have a peek at the generated assembly to see how to use SSE to its full potential.

Intrinsics are easy - just like normal function calls:

#include <immintrin.h>  // portable to all x86 compilers

int main()
{
    // _mm_set_ps takes the high element first, the opposite of C array order.
    // Use _mm_setr_ps if you want "little endian" element order in the source.
    __m128 vector1 = _mm_set_ps(4.0f, 3.0f, 2.0f, 1.0f);
    __m128 vector2 = _mm_set_ps(7.0f, 8.0f, 9.0f, 0.0f);

    __m128 sum = _mm_add_ps(vector1, vector2); // sum = vector1 + vector2, element by element

    vector1 = _mm_shuffle_ps(vector1, vector1, _MM_SHUFFLE(0,1,2,3));
    // vector1 is now (1, 2, 3, 4) in _mm_set_ps (high element first) notation: the shuffle reversed it
    return 0;
}
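One way to check the element order described in the comments is to store the vectors back to memory and look at the array. The following is only a sketch: the helper names `add_example` and `reverse_example` are hypothetical and not part of the original answer. It relies on `_mm_storeu_ps`, which writes the four floats lowest element first.

```c
#include <assert.h>
#include <stddef.h>
#include <immintrin.h>

/* Hypothetical helper: computes (4,3,2,1) + (7,8,9,0) and writes the
   result to out, lowest element first. */
static void add_example(float out[4])
{
    assert(out != NULL);
    __m128 vector1 = _mm_set_ps(4.0f, 3.0f, 2.0f, 1.0f);
    __m128 vector2 = _mm_set_ps(7.0f, 8.0f, 9.0f, 0.0f);
    _mm_storeu_ps(out, _mm_add_ps(vector1, vector2)); /* unaligned store */
}

/* Hypothetical helper: reverses (4,3,2,1) with the same shuffle as above
   and writes the result to out, lowest element first. */
static void reverse_example(float out[4])
{
    assert(out != NULL);
    __m128 vector1 = _mm_set_ps(4.0f, 3.0f, 2.0f, 1.0f);
    vector1 = _mm_shuffle_ps(vector1, vector1, _MM_SHUFFLE(0, 1, 2, 3));
    _mm_storeu_ps(out, vector1);
}
```

With these helpers, `add_example` fills the array with {1, 11, 11, 11} and `reverse_example` fills it with {4, 3, 2, 1}. Index 0 holds the lowest element, which is the last argument passed to `_mm_set_ps`.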

Use _mm_load_ps or _mm_loadu_ps to load data from arrays. _mm_load_ps requires a 16-byte-aligned address, while _mm_loadu_ps accepts any alignment.

Of course there are many more options. SSE is really powerful and, in my opinion, relatively easy to learn.
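As an illustration of loading from arrays, here is a rough sketch of adding two float arrays four elements at a time. The function name `add_arrays` and its signature are assumptions made for this example, not something from the original answer. It uses the unaligned `_mm_loadu_ps` and `_mm_storeu_ps` so that it works with any float pointer, and falls back to scalar code for the leftover elements.

```c
#include <assert.h>
#include <stddef.h>
#include <immintrin.h>

/* Hypothetical helper: dst[i] = a[i] + b[i] for every i in [0, n). */
static void add_arrays(float *dst, const float *a, const float *b, size_t n)
{
    assert(dst != NULL && a != NULL && b != NULL);

    size_t i = 0;
    /* Main loop: process four floats per iteration. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i); /* unaligned load of a[i..i+3] */
        __m128 vb = _mm_loadu_ps(b + i); /* unaligned load of b[i..i+3] */
        _mm_storeu_ps(dst + i, _mm_add_ps(va, vb));
    }
    /* Scalar tail: handle the remaining 0 to 3 elements. */
    for (; i < n; ++i)
        dst[i] = a[i] + b[i];
}
```

If the arrays are known to be 16-byte aligned, `_mm_load_ps` and `_mm_store_ps` could be used in the main loop instead; the unaligned variants are the more cautious default.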

See also https://stackoverflow.com/tags/sse/info for some links to guides.
