汉明重量(1的个数)将C与组件混合 [英] Hamming weight ( number of 1 in a number) mixing C with assembly
问题描述
我正在尝试计算数组中数字1的个数.
I'm trying to count how many number 1, are in the numbers of an array.
首先,我在C lenguaje中有一个代码(正常工作):
First I have a code in C lenguaje(work ok):
int popcount2(int* array, int len){
int i;
unsigned x;
int result=0;
for (i=0; i<len; i++){
x = array[i];
do{
result+= x & 0x1;
x>>= 1;
} while(x);
}
return result;
}
现在,我需要使用3-6行代码将do-while循环转换为Assembly.我已经写了一些代码,但是结果不正确.(我是汇编世界中的新手)
Now I need to translate the do-while loop into Assembly using 3-6 lines of code. I have write some code but the result is not correct.( I'm new in the assembly world )
int popcount3(int* array, int len){
int i;
unsigned x;
int result=0;
for (i=0; i<len; i++){
x = array[i];
asm(
"ini3: \n"
"adc $0,%[r] \n"
"shr %[x] \n"
"jnz ini3 \n"
: [r]"+r" (result)
: [x] "r" (x) );
}
}
我正在将GCC(在Linux上)与Intel处理器配合使用.
I'm using GCC ( on linux) with an Intel processor.
推荐答案
The nicest think you can do is using built in popcount
function as suggested by Paul R, but since you need to write it in assembly, this worked for me:
asm (
"start: \n"
"and %0, %1 \n"
"jz end \n"
"shr $0, %1 \n"
"jnc start \n"
"inc %1 \n"
"jmp start \n"
"end: \n"
: "+g" (result),
"+r" (x)
:
: "cc"
);
在前两行中,您只需检查x
的内容(如果它为零Jump Zero
,则结束).比您将x
右移一位,然后:
At first two lines you just check the contents of x
(and go to end if it's zero Jump Zero
). Than you shift x
one bit to right and:
如果未设置CF
,则只需开始(Jump Not Carry
),否则递增结果,然后开始.
If there's no CF
set, just go to start (Jump Not Carry
) else increment result and then go to start.
关于组装的美好想法是,您可以通过多种方式来做事...
And the beautiful think about assembly is that you can do things in so many ways...
asm (
"start: \n"
"shr $1, %1 \n"
"jnc loop_cond \n"
"inc %0 \n"
"and %1, %1 \n"
"loop_cond: \n"
"jnz start \n"
: "+g" (result),
"+r" (x)
:
: "cc"
);
如果没有CF
,请再次使用SHift Right
指令,进入循环条件.
Here you again use SHift Right
instruction, if no CF
is present just go to loop condition.
否则,再次增加结果并调用二进制文件AND
(INC
确实修改ZF
).
Otherwise again increment result and call binary AND
(INC
does modify ZF
).
我很好奇如何在3条指令中做到这一点(我想如果不可能的话,您的老师不会给您3的下限) 我意识到x86也具有 LOOP
指令:
每次执行LOOP指令时,计数寄存器都会递减,然后检查是否为0.如果计数为0,则循环终止,程序继续执行LOOP指令之后的指令.如果计数不为零,则执行到目标(目标)操作数的近跳转,这大概是循环开始时的指令. *
您可以使用 GCC输入约束:
c
-c
寄存器.
asm (
"start: \n"
"shr $1, %1 \n"
"adc $0, %0 \n"
"loop start \n"
: "+g" (result)
: "r" (x),
"c" (8) // Assuming 8b type (char)
);
只需确保可以编译为正确的汇编即可:
Just to make sure it compiles to proper assembly:
0x000000000040051f <+25>: mov $0x8,%ecx
0x0000000000400524 <+30>: mov -0x8(%rbp),%eax
0x0000000000400527 <+33>: shr %edx
0x0000000000400529 <+35>: adc $0x0,%eax
0x000000000040052c <+38>: loop 0x400527 <main+33>
我认为第一个应该具有更好的性能,特别是如果仅设置1位,则此方法始终会执行k*8
迭代.
我知道您必须使用循环,但这只是为了好玩...使用 SSE4扩展名您可以通过一条指令POPCNT
来做到这一点:
I know you have to use a loop, but just for fun... With SSE4 extension you could do this by just one instruction POPCNT
:
此指令计算第二个操作数(源)中设置为1的位数,并返回第一个操作数(目标寄存器)中的计数. *
我认为(我的笔记本电脑上有一个很旧的CPU,所以我无法为您测试一下),您应该可以只用一个简单的指令就可以做到这一点:
I think (I have a quite old CPU on my notebook, so I can't test this for you) you should be able to do this with just one simple instruction:
asm (
"POPCNT %1, %0 \n"
: "=r" (result)
: "mr" (x)
: "cc"
);
(如果您尝试这样做并且确实拥有SSE4扩展名,请告诉我它是否有效)
我已经测量了进行1亿次弹出式计数所需的时间,并将我的第一种和第二种方法与 David Wohlferd's 进行比较. [原始数据]
I've measured times required to do 100,000,000 popcounts comparing my first and second methods with David Wohlferd's. [Raw data]
+--------------+------------+------------+------------+
| | 0x00000000 | 0x80000001 | 0xffffffff |
+--------------+------------+------------+------------+
| 1st solution | 0.543 | 5.040 | 3.833 |
| LOOP | 11.530 | 11.523 | 11.523 |
| Davids | 0.750 | 4.893 | 4.890 |
+--------------+------------+------------+------------+
如果有人可以将这3个与SSE4的POPCNT
指令进行比较,我会很高兴的.
If anyone can compare these 3 with SSE4's POPCNT
instruction I'd be glad.
这篇关于汉明重量(1的个数)将C与组件混合的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!