MMX intrinsics to inline assembly


Question

How do I convert this code into inline assembly?

#include <mmintrin.h>  // __m64 and the _mm_*_pi16 / _mm_*_pi32 MMX intrinsics

void tom::Transform(void* btr)
{
    short* block = (short*)btr;
    // View the 4x4 block of 16-bit values as four 64-bit rows.
    __declspec(align(8)) __m64* block1 = (__m64*)block;
    int j;

    /// Only do IT modes = 0 & 1.
    if (_mode != QuantOnly)  //< 2 = Q only.
    {
        __m64 s0, s1, s2, s3, f0, f1, f2, f3, temp4, temp5, temp6, temp7;
        j = 0;
        // transpose input
        temp4 = _mm_unpacklo_pi16(block1[j], block1[j + 1]);
        temp5 = _mm_unpacklo_pi16(block1[j + 2], block1[j + 3]);
        temp6 = _mm_unpackhi_pi16(block1[j], block1[j + 1]);
        temp7 = _mm_unpackhi_pi16(block1[j + 2], block1[j + 3]);
        f0 = _mm_unpacklo_pi32(temp4, temp5);
        f2 = _mm_unpacklo_pi32(temp6, temp7);
        f1 = _mm_unpackhi_pi32(temp4, temp5);
        f3 = _mm_unpackhi_pi32(temp6, temp7);
        // stage one
        s0 = _mm_add_pi16(f0, f3);
        s3 = _mm_sub_pi16(f0, f3);
        s1 = _mm_add_pi16(f1, f2);
        s2 = _mm_sub_pi16(f1, f2);
        // stage two
        block1[j]     = _mm_add_pi16(s0, s1);
        block1[j + 2] = _mm_sub_pi16(s0, s1);
        block1[j + 1] = _mm_add_pi16(s2, _mm_slli_pi16(s3, 1));
        block1[j + 3] = _mm_sub_pi16(s3, _mm_slli_pi16(s2, 1));
    }
    // The posted snippet stops here; the rest of the function was not shown.
}

Thanks in advance!

Recommended answer

But what would you gain by replacing the MMX intrinsics you use now with inline assembly? The intrinsics already use the MMX capabilities of your processor, so converting them to inline assembly won't really help. Have a look at the following article. You will notice that no difference was measured between MMX intrinsics and MMX inline assembly.
Introduction to MMX Programming
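
If you still want to try a hand-written version, it helps to have something to compare its output against. Below is a minimal scalar sketch of what the posted intrinsics appear to compute in one pass. The name `TransformPassScalar` is made up for illustration and is not part of the original code, and the sketch assumes `block` holds 16 contiguous 16-bit values, as the four `__m64` accesses imply.

```cpp
#include <cstdint>

// Hypothetical scalar model (not from the original post) of one
// transpose-and-butterfly pass over a 4x4 block of 16-bit values.
void TransformPassScalar(std::int16_t block[16])
{
    std::int16_t out[16];
    for (int i = 0; i < 4; ++i) {
        // After the transpose, lane i of f0..f3 holds row i of the input.
        const int f0 = block[4 * i + 0];
        const int f1 = block[4 * i + 1];
        const int f2 = block[4 * i + 2];
        const int f3 = block[4 * i + 3];

        // stage one
        const int s0 = f0 + f3;
        const int s3 = f0 - f3;
        const int s1 = f1 + f2;
        const int s2 = f1 - f2;

        // stage two: lane i of output row k lands at out[4 * k + i].
        // The casts truncate to 16 bits (modular on mainstream compilers,
        // guaranteed since C++20), mirroring the wrap-around of the
        // packed 16-bit add, subtract and shift.
        out[0 * 4 + i] = static_cast<std::int16_t>(s0 + s1);
        out[1 * 4 + i] = static_cast<std::int16_t>(s2 + 2 * s3);
        out[2 * 4 + i] = static_cast<std::int16_t>(s0 - s1);
        out[3 * 4 + i] = static_cast<std::int16_t>(s3 - 2 * s2);
    }
    for (int k = 0; k < 16; ++k) {
        block[k] = out[k];
    }
}
```

Running the same input through this function and through the intrinsic (or assembly) version and comparing the 16 results is a cheap way to catch a swapped operand or a wrong unpack order.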

Good luck!

