Alignment in VLD1


Problem description

I have a question about alignment in the ARM NEON VLD1 instruction. How does the alignment specifier in the following code work?

DATA            .req r0  
vld1.16         {d16, d17, d18, d19}, [DATA, :128]!  

Does the starting address of this load shift to DATA plus some non-negative offset, so that it becomes the smallest multiple of 16 (16 bytes = 128 bits) that is no less than DATA? Or does DATA itself change to the smallest multiple of 16 that is no less than DATA?

Solution

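Neither. It is a hint to the CPU: the load address is not rounded up, and DATA is not adjusted to the next multiple of 16. The only thing I have read about the usefulness of such a hint is a blog post on ARM's site claiming that it makes the load faster; it doesn't say how or why, however. Probably it is because the CPU can issue wider loads. The post says: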

"You can also specify an alignment for the pointer passed in Rn, using the optional :<align> parameter, which often speeds up memory accesses."

If you provide the hint, you must make sure that DATA is aligned to 16 bytes; otherwise you'll get a hardware exception.
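
As a minimal sketch of how the calling side could meet that requirement, the C helpers below allocate a buffer on a 16-byte boundary and check a pointer before it is passed to a routine that uses `[DATA, :128]`. The names `is_aligned_128`, `require_aligned_128`, and `alloc_i16_aligned_128` are hypothetical and not part of the original question; the sketch assumes a C11 library that provides `aligned_alloc`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns nonzero if p is a multiple of 16 bytes, i.e. the alignment
   that the :128 qualifier promises to the CPU. */
static int is_aligned_128(const void *p)
{
    return ((uintptr_t)p % 16u) == 0;
}

/* Hypothetical debug guard: call before handing the pointer to a NEON
   routine that loads with [Rn, :128]. Aborts in debug builds if the
   promise would be broken. */
static const int16_t *require_aligned_128(const int16_t *p)
{
    assert(is_aligned_128(p));
    return p;
}

/* Hypothetical helper: allocate room for `count` int16_t elements on a
   16-byte boundary. C11 aligned_alloc expects the size to be a multiple
   of the alignment, so the byte count is rounded up to a multiple of 16. */
static int16_t *alloc_i16_aligned_128(size_t count)
{
    size_t bytes = count * sizeof(int16_t);
    size_t rounded = (bytes + 15u) & ~(size_t)15u;
    if (rounded == 0) {
        rounded = 16;
    }
    return aligned_alloc(16, rounded);
}
```

A buffer obtained this way is released with `free`. For static or stack data, `_Alignas(16)` on the array declaration would be an alternative way to get the same guarantee.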

This alignment-exception behavior is specified in the VLD1 description in the ARM ARM (ARM Architecture Reference Manual) as:

if ConditionPassed() then
    EncodingSpecificOperations(); CheckAdvSIMDEnabled(); NullCheckIfThumbEE(n);
    address = R[n]; if (address MOD alignment) != 0 then GenerateAlignmentException();
    if wback then R[n] = R[n] + (if register_index then R[m] else ebytes);
    Elem[D[d],index,esize] = MemU[address,ebytes];

Mainly this line:

if (address MOD alignment) != 0 then GenerateAlignmentException();
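
To make the difference from the question's two guesses concrete, here is a small C model. It is only a sketch of the address arithmetic, not an emulation of the instruction: `round_up` computes the "smallest multiple of 16 no less than DATA" that the question asks about, while `vld1_would_fault` mirrors the quoted check, which uses the address unchanged. Both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* The smallest multiple of `alignment` that is no less than `address`.
   This is the rounding the question asks about; the instruction does NOT
   do this. `alignment` must be a power of two. */
static uint32_t round_up(uint32_t address, uint32_t alignment)
{
    return (address + alignment - 1u) & ~(alignment - 1u);
}

/* Model of the quoted pseudocode line:
     if (address MOD alignment) != 0 then GenerateAlignmentException();
   The address is taken from the register as-is and only tested. */
static bool vld1_would_fault(uint32_t address, uint32_t alignment)
{
    return (address % alignment) != 0;
}
```

For example, with DATA = 0x1004 and a 16-byte alignment, `round_up` gives 0x1010, but the model reports a fault rather than loading from 0x1010.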

I actually can't understand why the CPU can't check the alignment itself and pick the best option. Maybe that would cost too many cycles.
