AVX/SSE round floats down and return vector of ints?

Question

Is there a way, using AVX/SSE, to take a vector of floats, round down, and produce a vector of ints? All the floor intrinsics seem to produce a vector of floats, which is odd because rounding produces an integer!

Recommended answer

SSE has conversion from FP to integer with your choice of truncation (toward zero) or the current rounding mode. The current mode is normally the IEEE default: round to nearest, with ties rounding to even. That matches nearbyint(), unlike round(), where ties round away from 0. If you need that away-from-zero tiebreak on x86, you have to emulate it, perhaps with truncation as a building block.

The relevant instructions are CVTPS2DQ and CVTTPS2DQ, which convert packed single-precision floats to signed doubleword integers. The version with the extra T in the mnemonic truncates instead of using the current rounding mode.

; xmm0 is assumed to be the packed float input vector
cvttps2dq xmm0, xmm0
; xmm0 now contains the (truncated) packed integer vector

Or with intrinsics: __m128i _mm_cvt[t]ps_epi32(__m128 a), that is, _mm_cvtps_epi32 or _mm_cvttps_epi32.
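To make the difference between the two conversions concrete, here is a minimal C sketch using only SSE2 intrinsics. The function names `convert4_current_mode` and `convert4_truncate` are hypothetical helpers for illustration, and the sketch assumes the MXCSR rounding mode has been left at its default (round to nearest, ties to even).

```c
#include <emmintrin.h>  /* SSE2: _mm_cvtps_epi32, _mm_cvttps_epi32 */
#include <stdint.h>

/* Hypothetical helper: convert 4 floats using the current MXCSR rounding
   mode (round-to-nearest-even unless the program changed it). */
void convert4_current_mode(const float *in, int32_t *out)
{
    __m128 v = _mm_loadu_ps(in);                             /* unaligned load */
    _mm_storeu_si128((__m128i *)out, _mm_cvtps_epi32(v));    /* cvtps2dq */
}

/* Hypothetical helper: convert 4 floats with truncation toward zero. */
void convert4_truncate(const float *in, int32_t *out)
{
    __m128 v = _mm_loadu_ps(in);
    _mm_storeu_si128((__m128i *)out, _mm_cvttps_epi32(v));   /* cvttps2dq */
}
```

For the input {2.5, -2.5, 1.7, -1.7}, the first helper gives {2, -2, 2, -2} under the default rounding mode and the second gives {2, -2, 1, -1}. Neither is a floor: -1.7 would need to become -2 with 1.7 staying at 1.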

For the other two rounding modes x86 provides in hardware, floor (toward -Inf) and ceil (toward +Inf), a simple way is to use the SSE4.1/AVX ROUNDPS instruction before converting to integer.

The code looks like this:

roundps  xmm0, xmm0, 1    ; nearest=0, floor=1,  ceil=2, trunc=3
cvtps2dq xmm0, xmm0       ; or cvttps2dq, doesn't matter
; xmm0 now contains the floored packed integer vector

For AVX ymm vectors, prefix the instructions with 'V' and change the xmm registers to ymm (vroundps ymm0, ymm0, 1 followed by vcvtps2dq ymm0, ymm0).
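The 256-bit variant can be sketched in C as well. This is an illustration under assumptions, not part of the original answer: `floor8_to_int32_avx` is a hypothetical name, and `__attribute__((target("avx")))` is a GCC/Clang-specific way to enable AVX for one function (compiling the whole file with `-mavx` is the usual alternative). The caller is responsible for checking that the CPU supports AVX.

```c
#include <immintrin.h>  /* AVX: _mm256_floor_ps, _mm256_cvtps_epi32 */
#include <stdint.h>

/* Hypothetical helper: floor 8 floats and store them as 8 int32 values.
   The target attribute is GCC/Clang-specific (assumption about the toolchain). */
__attribute__((target("avx")))
void floor8_to_int32_avx(const float *in, int32_t *out)
{
    __m256  v = _mm256_loadu_ps(in);          /* unaligned 256-bit load */
    __m256  f = _mm256_floor_ps(v);           /* vroundps ymm, ymm, 1 */
    __m256i i = _mm256_cvtps_epi32(f);        /* vcvtps2dq ymm, ymm */
    _mm256_storeu_si256((__m256i *)out, i);   /* unaligned 256-bit store */
}
```

The pointer-based interface keeps `__m256` values out of callers that are compiled without AVX enabled.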

How ROUNDPS works

Round packed single-precision floating-point values in xmm2/m128 and place the result in xmm1. The rounding mode is determined by imm8.

The rounding mode (the immediate, i.e. the third operand) can have the following values (taken from Table 4-15, Rounding Modes and Encoding of Rounding Control (RC) Field, of the current Intel docs):

Rounding Mode               RC Field Setting   Description
----------------------------------------------------------
Round to nearest (even)     00B                Rounded result is the closest to the infinitely precise result. If two values are equally close, the result is the even value (i.e., the integer value with the least-significant bit of zero).
Round down (toward −∞)      01B                Rounded result is closest to but no greater than the infinitely precise result.
Round up (toward +∞)        10B                Rounded result is closest to but no less than the infinitely precise result.
Round toward 0 (truncate)   11B                Rounded result is closest to but no greater in absolute value than the infinitely precise result.
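The four RC encodings in the table map onto the `_MM_FROUND_*` constants from `<smmintrin.h>`. The sketch below is an illustration only: `round4_to_int32` is a hypothetical helper, its `rc` parameter is an assumption of this example, and `__attribute__((target("sse4.1")))` is GCC/Clang-specific (compiling with `-msse4.1` is the alternative).

```c
#include <smmintrin.h>  /* SSE4.1: _mm_round_ps and the _MM_FROUND_* constants */
#include <stdint.h>

/* Hypothetical helper: round 4 floats with the RC value from the table
   (0 = nearest-even, 1 = down, 2 = up, 3 = toward zero), then convert. */
__attribute__((target("sse4.1")))
void round4_to_int32(const float *in, int32_t *out, int rc)
{
    __m128 v = _mm_loadu_ps(in);
    __m128 r;
    /* The immediate must be a compile-time constant, hence the switch. */
    switch (rc & 3) {
    case 0:  r = _mm_round_ps(v, _MM_FROUND_TO_NEAREST_INT); break;  /* 00B */
    case 1:  r = _mm_round_ps(v, _MM_FROUND_TO_NEG_INF);     break;  /* 01B */
    case 2:  r = _mm_round_ps(v, _MM_FROUND_TO_POS_INF);     break;  /* 10B */
    default: r = _mm_round_ps(v, _MM_FROUND_TO_ZERO);        break;  /* 11B */
    }
    /* r already holds integral values, so either conversion gives the
       same result for inputs that fit in int32. */
    _mm_storeu_si128((__m128i *)out, _mm_cvttps_epi32(r));
}
```

With the input {2.5, -2.5, 1.2, -1.2}, `rc = 0` gives {2, -2, 1, -1}, `rc = 1` gives {2, -3, 1, -2}, `rc = 2` gives {3, -2, 2, -1}, and `rc = 3` gives {2, -2, 1, -1}.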

The probable reason the rounding operation returns a float vector rather than an int vector is that further operations can then stay float operations (on rounded values), and a conversion to int is trivial, as shown.

The corresponding intrinsics are found in the referenced docs. An example of transforming the above code to intrinsics (whose rounding behavior is selected by the Rounding Control (RC) field) is:

__m128i dst = _mm_cvtps_epi32( _mm_floor_ps(src) );  // src is a __m128
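_mm_floor_ps requires SSE4.1. Where only SSE2 is available, one possible fallback (not part of the original answer) is to truncate and then subtract 1 in every lane where truncation rounded up, which happens for negative non-integers. The name `floor4_to_int32_sse2` is hypothetical, and the sketch assumes all inputs fit in a signed 32-bit integer.

```c
#include <emmintrin.h>  /* SSE2 only */
#include <stdint.h>

/* Hypothetical helper: floor 4 floats to int32 without SSE4.1.
   Assumes every input is finite and within int32 range. */
void floor4_to_int32_sse2(const float *in, int32_t *out)
{
    __m128  v    = _mm_loadu_ps(in);
    __m128i t    = _mm_cvttps_epi32(v);    /* truncate toward zero */
    __m128  tf   = _mm_cvtepi32_ps(t);     /* back to float for the compare */
    __m128  mask = _mm_cmpgt_ps(tf, v);    /* all-ones where truncation rounded up */
    /* An all-ones lane reads as -1 when viewed as int32, so adding the
       mask subtracts 1 exactly in the lanes that need it. */
    __m128i r    = _mm_add_epi32(t, _mm_castps_si128(mask));
    _mm_storeu_si128((__m128i *)out, r);
}
```

For example, -1.7 truncates to -1; since -1.0 > -1.7 the mask lane is set and the result becomes -2, while 1.7 truncates to 1 and is left alone.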
