确定数组中负数和正数的数量 [英] Determine number of negative and positive numbers in an array
问题描述
我需要在汇编程序中确定数组中负数和正数的数量.汇编程序似乎没有将它们识别为负数.我怎么解决这个问题?我这样定义数组:
I need to determine the number of negative and positive numbers in an array in assembler. It seems that the assembler doesn't recognizes them as negative numbers. How can I solve this problem? I define the array this way:
word_array db 3, -2, 11, -1, -2, -7, -5, -20
我有这个计算正数的函数:
I have this function that counts the positive ones:
count_positives:
mov dx, word [word_array + 2*ecx - 2]
cmp edx, 0
JL skip
inc ebx
skip:
loopnz count_positives
推荐答案
您正在加载 DX 的低 16 位,留下高位(包括符号位)保留之前存在的任何垃圾.使用 16 位操作数大小进行比较.
You're loading into the low 16 bits of DX, leaving the high bits (including the sign bit) holding whatever garbage was there before. Use 16-bit operand-size for your compare.
计算负数或非负数,然后从总计数中减去另一个.
Count either negative or non-negative, and subtract that from the total count to get the other.
如果你需要计算负数和正数,那么你需要两个计数器,一个 test
或 cmp
后跟两个分支(这样零就不会进入任一计数器).
If you need to count negative and positive, then you need two counters, and a test
or cmp
followed by two branches (so that zero doesn't go into either counter).
改编自 Sten 的回答,但有一些改进.注意test value, -1
等价于cmp value, 0
.
Adapted from Sten's answer, but with some improvements. Note that test value, -1
is equivalent to cmp value, 0
.
section .rodata
word_array dw -1,2,-3,4
len equ $-word_array ; length in bytes. assembler constant, so we can mov reg, imm8/imm32 rather than loading it as data.
section .text
;; clobbers ESI, ECX. Returns in EAX, EDX
proc:
mov esi, word_array ; esi points to the array. In MASM, use OFFSET word_array
mov ecx, len/2 - 1 ; [esi + ecx*2] points to the last element
xor edx, edx ; non_neg_count = 0
countloop:
; cmp [esi + ecx*2], 0 ; This can't macro-fuse (memory and immediate operand). Also can't micro-fuse on SnB, because of a 2-reg addressing mode
movsx eax, word [esi + ecx*2] ; use a 2-reg addressing mode to save loop overhead, since this there's no ALU execution port component to this insn. It doesn't need to micro-fuse to be one uop
test eax, eax ; can macro-fuse with js
js isNegative
inc edx ; counting non-negative numbers
isNegative:
dec ecx ; can macro-fuse with jge, but probably won't unless alignment stops it from being decoded in the same cycle as the earlier test/js
jge countloop ; jge, not jnz, because we want ecx from [0 : len-1], rather than [1 : len]
; after the loop, ecx=-1, edx=non_neg_count
; neg_count = array_count - non_neg_count
mov eax, len/2
sub eax, edx ; eax = neg_count
ret ; return values in eax, edx
Intel 上的循环是 4 uop.(或者更可能在 Haswell 之前的 Sandybridge 上有 5 个,如果两个测试/分支对在同一周期中都击中解码器,那么只有一个宏融合.HSW 可以在单个解码组中进行 2 个宏融合).
The loop is 4 uops on Intel. (Or more likely 5 on Sandybridge before Haswell, if both test/branch pairs hit the decoders in the same cycle so only one macro-fuses. HSW can do 2 macro-fusions in a single decode group).
带有 sets bl/add edx, ebx
的无分支版本可能会运行良好.
A branch-less version with sets bl / add edx, ebx
might work well.
您可以通过将 eax 归零来稍微节省代码大小,然后在循环中使用 scasw
将 ax
与 [esi] 进行比较,并将 esi 增加 2,但是它通常不是性能的好选择.
You could maybe save slightly on code size by zeroing eax, then using scasw
in a loop to compare ax
with [esi], and increment esi by two, but it's not usually a good choice for performance.
section .rodata
word_array dw -1,2,0,-3,4
len equ $-word_array ; length in bytes. assembler constant, so we can mov reg, imm8/imm32 rather than loading it as data.
section .text
;; clobbers ESI, EDI, EBP. Returns in EAX, EDX
proc_pos_and_neg:
mov esi, word_array ; esi points to the array. In MASM, use OFFSET word_array
xor edx, edx ; pos_count = 0
xor eax, eax ; neg_count = 0
lea edi, [esi + len] ; points one past the end of the array
xor ebx, ebx ; clear upper portion, because setcc r32 isn't available, only setcc r8 :(
countloop:
cmp word [esi], 0
setg bl ; 0 or 1, depending on array[i] > 0
lea edx, [edx + ebx] ; add without affecting flags
setl bl
add eax, ebx ; can clobber flags now
add esi, 2 ; simple pointer-increment
cmp esi, edi
jb countloop ; loop while our pointer is below the pointer to one-past-the-end
ret ; neg_count in eax, pos_count in edx
如果需要,零计数是 n - eax - edx
,其中 n
是元素数.
And the zero count is n - eax - edx
if you want it, where n
is the number of elements.
我在这里使用了不同的循环结构只是为了变化.循环应该是 7 uop.
I used a different loop structure here just for variety. The loop should be 7 uops.
在 setcc 写入 bl 之后读取 ebx 避免了部分寄存器合并惩罚,因为我们在循环外对 EBX 进行了异或清零.(保存/恢复 EBX 的上下文切换或中断将消除该性能优势,但对于短循环,可能仍然值得将异或归零从循环中提升.)
Reading ebx after setcc writes bl avoids a partial-register merge penalty because we xor-zeroed EBX outside the loop. (A context switch or interrupt that saves/restores EBX will remove that performance benefit, but for short loops it's probably still worth hoisting the xor-zeroing out of the loop.)
这篇关于确定数组中负数和正数的数量的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!