确定数组中负数和正数的数量 [英] Determine number of negative and positive numbers in an array

查看:46
本文介绍了确定数组中负数和正数的数量的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我需要在汇编程序中确定数组中负数和正数的数量.汇编程序似乎没有将它们识别为负数.我怎么解决这个问题?我这样定义数组:

I need to determine the number of negative and positive numbers in an array in assembler. It seems that the assembler doesn't recognizes them as negative numbers. How can I solve this problem? I define the array this way:

word_array db 3, -2, 11, -1, -2, -7, -5, -20

我有这个计算正数的函数:

I have this function that counts the positive ones:

count_positives:
mov dx, word [word_array + 2*ecx - 2]
cmp edx, 0
JL skip
inc ebx
skip:
loopnz count_positives

推荐答案

您正在加载 DX 的低 16 位,留下高位(包括符号位)保留之前存在的任何垃圾.使用 16 位操作数大小进行比较.

You're loading into the low 16 bits of DX, leaving the high bits (including the sign bit) holding whatever garbage was there before. Use 16-bit operand-size for your compare.

计算负数或非负数,然后从总计数中减去另一个.

Count either negative or non-negative, and subtract that from the total count to get the other.

如果你需要计算负数和正数,那么你需要两个计数器,一个 testcmp 后跟两个分支(这样零就不会进入任一计数器).

If you need to count negative and positive, then you need two counters, and a test or cmp followed by two branches (so that zero doesn't go into either counter).

改编自 Sten 的回答,但有一些改进.注意test value, -1 等价于cmp value, 0.

Adapted from Sten's answer, but with some improvements. Note that test value, -1 is equivalent to cmp value, 0.

section .rodata

word_array dw -1,2,-3,4
len  equ $-word_array     ; length in bytes.  assembler constant, so we can mov reg, imm8/imm32   rather than loading it as data.

section .text
;; clobbers ESI, ECX.  Returns in EAX, EDX
proc:
  mov   esi, word_array  ; esi points to the array.  In MASM, use OFFSET word_array
  mov   ecx, len/2 - 1      ; [esi + ecx*2] points to the last element
  xor   edx, edx           ; non_neg_count = 0

countloop:
    ; cmp   [esi + ecx*2], 0   ; This can't macro-fuse (memory and immediate operand).  Also can't micro-fuse on SnB, because of a 2-reg addressing mode
  movsx   eax, word [esi + ecx*2]  ; use a 2-reg addressing mode to save loop overhead, since this there's no ALU execution port component to this insn.  It doesn't need to micro-fuse to be one uop
  test    eax, eax        ; can macro-fuse with js
  js isNegative
  inc   edx               ; counting non-negative numbers
isNegative:
  dec   ecx               ; can macro-fuse with jge, but probably won't unless alignment stops it from being decoded in the same cycle as the earlier test/js
  jge countloop       ; jge, not jnz, because we want ecx from [0 : len-1], rather than [1 : len]

; after the loop, ecx=-1, edx=non_neg_count
; neg_count = array_count - non_neg_count
  mov   eax, len/2
  sub   eax, edx        ;   eax =  neg_count

  ret    ; return values in eax, edx

Intel 上的循环是 4 uop.(或者更可能在 Haswell 之前的 Sandybridge 上有 5 个,如果两个测试/分支对在同一周期中都击中解码器,那么只有一个宏融合.HSW 可以在单个解码组中进行 2 个宏融合).

The loop is 4 uops on Intel. (Or more likely 5 on Sandybridge before Haswell, if both test/branch pairs hit the decoders in the same cycle so only one macro-fuses. HSW can do 2 macro-fusions in a single decode group).

带有 sets bl/add edx, ebx 的无分支版本可能会运行良好.

A branch-less version with sets bl / add edx, ebx might work well.

您可以通过将 eax 归零来稍微节省代码大小,然后在循环中使用 scaswax 与 [esi] 进行比较,并将 esi 增加 2,但是它通常不是性能的好选择.

You could maybe save slightly on code size by zeroing eax, then using scasw in a loop to compare ax with [esi], and increment esi by two, but it's not usually a good choice for performance.

section .rodata

word_array dw -1,2,0,-3,4
len  equ $-word_array     ; length in bytes.  assembler constant, so we can mov reg, imm8/imm32   rather than loading it as data.

section .text
;; clobbers ESI, EDI, EBP.  Returns in EAX, EDX
proc_pos_and_neg:
  mov   esi, word_array   ; esi points to the array.  In MASM, use OFFSET word_array
  xor   edx, edx           ; pos_count = 0
  xor   eax, eax           ; neg_count = 0

  lea   edi, [esi + len]  ; points one past the end of the array
  xor   ebx, ebx          ; clear upper portion, because setcc r32 isn't available, only setcc r8  :(

countloop:
  cmp    word [esi], 0
  setg   bl               ; 0 or 1, depending on  array[i] > 0
  lea    edx, [edx + ebx]  ; add without affecting flags
  setl   bl
  add    eax, ebx          ; can clobber flags now

  add    esi, 2            ; simple pointer-increment
  cmp    esi, edi
  jb  countloop            ; loop while our pointer is below the pointer to one-past-the-end

ret     ; neg_count in eax,  pos_count in edx

如果需要,零计数是 n - eax - edx,其中 n 是元素数.

And the zero count is n - eax - edx if you want it, where n is the number of elements.

我在这里使用了不同的循环结构只是为了变化.循环应该是 7 uop.

I used a different loop structure here just for variety. The loop should be 7 uops.

在 setcc 写入 bl 之后读取 ebx 避免了部分寄存器合并惩罚,因为我们在循环外对 EBX 进行了异或清零.(保存/恢复 EBX 的上下文切换或中断将消除该性能优势,但对于短循环,可能仍然值得将异或归零从循环中提升.)

Reading ebx after setcc writes bl avoids a partial-register merge penalty because we xor-zeroed EBX outside the loop. (A context switch or interrupt that saves/restores EBX will remove that performance benefit, but for short loops it's probably still worth hoisting the xor-zeroing out of the loop.)

这篇关于确定数组中负数和正数的数量的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆