Load address calculation when using AVX2 gather instructions

Question

Looking at the AVX2 intrinsics documentation, there are gathered load instructions such as VPGATHERDD:

__m128i _mm_i32gather_epi32 (int const * base, __m128i index, const int scale);

What isn't clear to me from the documentation is whether the calculated load address is an element address or a byte address, i.e. is the load address for element i:

load_addr = base + index[i] * scale;               // (1) element addressing ?

or:

load_addr = (char *)base + index[i] * scale;       // (2) byte addressing ?

From the Intel docs it looks like it might be (2), but this doesn't make much sense given that the smallest element size for gathered loads is 32 bits. Why would you want to load from misaligned addresses (i.e. use scale < 4)?

Recommended answer

Gather instructions do not have any alignment requirements, so it would be too restrictive not to allow byte addressing.

Another reason is consistency. With SIB addressing we obviously have a byte address:

MOV eax, [rcx + rdx * 2]

Since VPGATHERDD is just a vectorized variant of this MOV instruction, we should not expect anything different with VSIB addressing:

VPGATHERDD ymm0, [rcx + ymm2 * 2], ymm3
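
To make the address arithmetic concrete, here is a minimal scalar sketch of interpretation (2) in portable C. The function name gather_epi32_model is hypothetical and used only for illustration; it models the address calculation, not the real instruction's masking or fault behaviour.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of a 32-bit gather under interpretation (2):
   lane i is read from the BYTE address (char *)base + index[i] * scale.
   gather_epi32_model is a hypothetical helper, not a real intrinsic. */
static void gather_epi32_model(const void *base, const int32_t *index,
                               int scale, int32_t *out, int n)
{
    /* The scale factor in SIB/VSIB encoding can only be 1, 2, 4 or 8. */
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);

    for (int i = 0; i < n; i++) {
        /* Widen the index before multiplying; the result is a byte offset. */
        const char *addr = (const char *)base + (int64_t)index[i] * scale;
        /* memcpy performs a 4-byte load with no alignment requirement. */
        memcpy(&out[i], addr, sizeof out[i]);
    }
}
```

With this model, element-style indexing of an int array is simply the special case scale = 4: indices {0, 2, 5, 7} with scale 4 read the same locations as byte offsets {0, 8, 20, 28} with scale 1.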

As for a real-life use of byte addressing, we could have a 24-bit color image in which the pixels are packed 3 bytes apart. We could load 8 pixels with a single VPGATHERDD instruction, but only if the "scale" field in VSIB is 1 and VPGATHERDD uses byte addressing.
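
The 24-bit pixel case might look roughly like the sketch below. Several things here are assumptions rather than part of the answer: the helper names (load8_rgb and friends) are hypothetical, the memory order is taken to be R, G, B, and the code relies on the GCC/Clang extensions __attribute__((target("avx2"))) and __builtin_cpu_supports. Each lane still reads 4 bytes, so the buffer needs at least one readable byte after the last pixel.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
#endif

/* Portable reference: build each lane from 3 bytes (assumed order R, G, B). */
static void load8_rgb_scalar(const uint8_t *pixels, uint32_t out[8])
{
    for (int i = 0; i < 8; i++) {
        const uint8_t *p = pixels + 3 * i;
        out[i] = (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16);
    }
}

#if defined(__x86_64__) || defined(__i386__)
/* AVX2 path: one gather with scale = 1, so the indices are byte offsets.
   Each lane loads 4 bytes starting at offset 3*i; the 4th byte belongs to
   the next pixel (or to padding) and is masked off afterwards. */
__attribute__((target("avx2")))
static void load8_rgb_avx2(const uint8_t *pixels, uint32_t out[8])
{
    const __m256i byte_offsets = _mm256_setr_epi32(0, 3, 6, 9, 12, 15, 18, 21);
    __m256i v = _mm256_i32gather_epi32((const int *)pixels, byte_offsets, 1);
    v = _mm256_and_si256(v, _mm256_set1_epi32(0x00FFFFFF));
    _mm256_storeu_si256((__m256i *)out, v);
}
#endif

/* Hypothetical dispatcher: use the gather when the CPU supports AVX2,
   otherwise fall back to the scalar reference. */
static void load8_rgb(const uint8_t *pixels, uint32_t out[8])
{
    assert(pixels != 0 && out != 0);
#if defined(__x86_64__) || defined(__i386__)
    if (__builtin_cpu_supports("avx2")) {
        load8_rgb_avx2(pixels, out);
        return;
    }
#endif
    load8_rgb_scalar(pixels, out);
}
```

Calling load8_rgb on a buffer of at least 25 bytes fills out[0..7] with one pixel per 32-bit lane. With element addressing (an implied scale of 4) the offsets 3, 6, 9, ... could not be expressed at all, which is the point the answer makes.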
