在数组中找到第二个最大值的汇编程序 [英] Assembly program that finds second largest value in array
问题描述
我编写了一个汇编程序,该程序在数组中找到最大值,但是现在我希望它在数组中找到第二大数字.我将如何修改我的程序来做到这一点?
I wrote a assembly program that finds the max value in an array, but now I would like it to find the second largest number in the array. How would I modify my program to do this?
这是我编写的程序,它确实起作用.程序将打印数组中的所有值,然后找到数组的最大值.现在,我希望它找到第二个最大值.
This is the program I wrote and it does work. The program prints all the values in the array and then finds the max value of the array. Now I want it to find the second largest value.
%include "io.mac"
.STACK 100H
.DATA
Numbers DW 3,4,5,2,6,0
msg0 db "Printing values in array",0
msg1 db "Max",0
msg2 db "Min",0
.CODE
.STARTUP
PutStr msg0
mov dx, Numbers
mov si, dx ;point to array
printValues:
cmp word [si], 0
je procedure
nwln
PutInt [si]
add si, 2
jmp printValues
procedure:
push Numbers ;push Number to stack to pass parameter by stack
call maxMeth
nwln
PutStr msg1
nwln
PutInt ax
nwln
complete:
.EXIT
maxMeth:
enter 0,0 ;save old bp and set bp to sp
mov si, [bp+4] ;point to array
mov ax, [si] ; ax holds max
add si,2 ; Increment si to next number
;Now entering loop
max:
cmp word [si],0 ; checks to see if the number is 0 and if it is, then we are done.
je finish
cmp ax, [si] ; ax holds the max . So if ax is greater than si, then dont assign si to ax.
jge increment
jmp newMax
newMax:
mov ax, [si] ; Otherwise we have a new a max
increment:
add si, 2 ;now increment si to check the next value and jump back to the main loop.
jmp max
finish: ;clean up.
leave ;pop bp
ret ;return
我编辑了我的程序以跟踪第二个最大值,并且我收到的第二个最大值的结果是3.我期望程序输出5.我不确定为什么我得到了错误的输出.这是我对该程序的编辑.
I edited my program to keep track of the second max and the result I receive for the second max value is 3. I was expecting the program to output 5. Im not sure why I am getting the wrong output. Here are my edits to the program.
%include "io.mac"
.STACK 100H
.DATA
Numbers DW 3,4,5,2,6,0
msg0 db "Printing values in array",0
msg1 db "Max",0
.CODE
.STARTUP
PutStr msg0
mov dx, Numbers
mov si, dx ;point to array
printValues:
cmp word [si], 0
je procedure
nwln
PutInt [si]
add si, 2
jmp printValues
procedure:
push Numbers ;push Number to stack to pass parameter by stack
call maxMeth
nwln
PutStr msg1
nwln
PutInt ax
nwln
PutInt bx
nwln
complete:
.EXIT
maxMeth:
enter 0,0 ;save old bp and set bp to sp
mov si, [bp+4] ;point to array
mov ax, [si] ; ax holds max
mov bx, [si] ; ax holds max
add si,2 ; Increment si to next number
;Now entering loop
max:
cmp word [si],0 ; checks to see if the number is 0 and if it is, then we are done.
je finish ;equals 0, then finished
cmp ax, [si]
jg testSecondMax ;So if ax is greater than si, then dont assign si to ax and check second max
jmp newMax ;otherwise go to new max
newMax:
mov ax, [si] ; save new max
jmp increment ; and increment
testSecondMax:
cmp bx, [si]
jl secondMax
jg increment
secondMax:
mov bx, [si]
jmp increment
increment:
add si, 2 ;now increment si to check the next value and jump back to the main loop.
jmp max
finish: ;clean up.
leave ;pop bp
ret ;return
推荐答案
我最终为此编写了代码,因为这使我想知道实现循环的效率如何.如果您想自己解决家庭作业,请不要看我的代码,而只能看英文的要点.使用调试器单步执行代码.
I ended up writing code for this because it made me wonder how efficiently I could implement the loop. If you want to solve your homework yourself, don't look at my code, just the bullet points in English. Use a debugger to single-step through your code.
这是我将如何更改您的代码:
Here's how I would change your code:
-
NASM样式:在函数内使用局部标签(如
.noswap:
).将操作数缩进一致的列,这样看起来就不会显得参差不齐.使用输入/返回值和调用约定(将其注册为clobbers)注释函数.
NASM style: use local labels (like
.noswap:
) within functions. Indent operands to a consistent column so it doesn't look ragged. Comment your function with inputs / return value and calling convention (what registers it clobbers).
在newMax:
之前优化jmp next_instruction
,因为这只是跳转到执行失败的代价高昂的禁止操作.
optimize out the jmp next_instruction
before newMax:
because it's just an expensive no-op to jump where execution was going to fall through anyway.
除非可能要针对真实的8086进行优化,否则请不要使用enter
,它会很慢.
Unless maybe optimizing for a real 8086, don't use enter
, it's slow.
将要检查的每个元素加载到寄存器中,而不是多次使用相同的内存操作数. (x86-16除了BP/SP之外还有6个整数寄存器;请使用它们.)
Load each element being checking into a register, instead of using the same memory operand multiple times. (x86-16 has 6 integer registers other than BP/SP; use them.)
将循环退出条件分支放在底部. (如果需要,请跳到此处作为循环入口点).
put the loop-exit conditional branch at the bottom. (Jump to there for the loop entry point if needed).
在两个寄存器中保留一个最大值和第二个最大值,就像您在AX中保留最大值一样.
Keep a max and 2nd-max in two registers, like you're keeping max in AX.
当找到大于2nd-max的元素时,请保留您拥有的3个数字中的最高2个.即在2个寄存器中维护2个元素的队列/排序列表.
When you find an element greater than 2nd-max, keep the highest 2 of the 3 numbers you have. i.e. maintain a 2-element queue / sorted list in 2 registers.
未经测试:
; word max2Meth(word *array);
; Input: implicit-length array (terminated by a 0 element),
; pointed to by pointer passed on the stack. (DS segment)
; returns in ax
; clobbers: cx, dx
global max2Meth
max2Meth:
push bp
mov bp, sp ; make a stack frame. (ENTER is slow, don't use it)
push si ; SI is call-preserved in many calling conventions. Omit this if you want to just clobber it.
mov si, [bp+4] ; pointer to array
mov ax, [si] ; assume that the list is non-empty
mov dx, ax ; start with max = max2 instead of putting a conditional xchg outside the loop
jmp .loop_entry ; enter at the bottom, at the conditional branch
;;; ax: 2nd max
;;; dx: max
.maxloop: ; do {
cmp cx, ax ; check against 2nd-max, because the common case is less than both.
jg .updateMaxes ; optimize for the common case: fall through on not-found
.loop_entry:
add si, 2
mov cx, [si] ; c = *p++;
test cx, cx
jnz .maxloop ; } while (c != 0);
.finish:
; 2nd-max is already in AX, just clean up and return
pop si
leave ;pop bp would be faster because SP is already pointing to the right place
ret
; This block is part of the function, even though it's placed after the ret
.updateMaxes:
mov ax, cx ; Bump the old 2nd-max unconditionally
cmp ax, dx
jle .loop_entry ; if 2nd_max <= max, return to the loop
xchg ax, dx ; otherwise swap
jmp .loop_entry
将一个罕见情况下的代码块放在行外是很好的,因为这意味着普通情况可能会在没有采取分支的情况下掉线.通常,如果if/else条件处于行内,则需要jmp
只是为了避免在if之后运行else部分.但是.updateMaxes:
最终还是相当优雅,IMO:它必须跳回到循环中,我们可以在交换之前或之后做到这一点.
Putting the block for a rare case out-of-line is nice, because it means the common case can just fall through with no taken branches. Often putting if/else conditions inline require a jmp
somewhere just to avoid running the else part after the if. But .updateMaxes:
ends up being rather elegant, IMO: it has to jump back into the loop, and we can do that either before or after swapping.
16 -bit xchg
是3微秒(与3条mov
指令一样昂贵),但是在new-max情况下,使它更便宜(并且仅做mov ax, dx
/mov dx, cx
)使new-2ndmax的情况变慢.假设仅更新第二个最大值而不是同时更新两个最大值,我认为仅mov和cmp/jcc就是赢家.您可以使用cmov
(在686 CPU上)使该部件无分支,这可能很好.使整个事情变为无分支将给您一个相当长的依赖链,并且不值得,除非数组元素平均到最后逐渐变大,所以您一直都在频繁进行max-updates(但没有使用模式,所以您导致分支未命中.)
16-bit xchg
is 3 uops (as expensive as 3 mov
instructions), but to make that cheaper (and only do mov ax, dx
/ mov dx, cx
) in the new-max case we'd have to make the new-2ndmax case slower. Assuming it's more likely to just update the 2nd max than to update both, I think just mov and cmp/jcc is a win there. You could make that part branchless with cmov
(on a 686 CPU), which might be good. Making the whole thing branchless would give you quite a long dependency chain, and not be worth it unless the array elements were on average getting larger towards the end so you had frequent max-updates all the time (but not with a pattern, so you get branch misses).
在Intel Haswell/Skylake上,内部循环只有4个融合域uops(比较/分支均可宏融合).如果长期不进行更新,则每个时钟应以1次迭代的速度运行.
On Intel Haswell/Skylake, the inner loop is only 4 fused-domain uops (both compare/branches can macro-fuse). Over long runs of no updates, it should run at 1 iteration per clock.
如果您要针对超速的代码大小进行优化(例如,针对实数8086),请使用ax
作为临时代码,并使用lodsw
代替mov ax, [si]
和add si, 2
. (将max
保留在其他寄存器中.)
If you're optimizing for code-size over speed (e.g. for a real 8086), use ax
as the temporary and lodsw
instead of mov ax, [si]
and add si, 2
. (Keep your max
in a different register).
对于隐式长度列表,您不能仅将scasw
与ax
中的2nd-max一起使用,因为您需要检查0以及> 2nd-max:/
With an implicit-length list, you can't just use scasw
with 2nd-max in ax
, because you need to check for 0 as well as > 2nd-max :/
作为进一步的优化,如果使用的是微型模型(SS = DS),则可以使用bp
代替si
,因为在加载指针后无需访问堆栈.您可以使用pop bp
代替leave
As a further optimization, you could use bp
instead of si
, if you're using the tiny model (SS = DS), because you don't need to access the stack after loading the pointer. You can just pop bp
instead of leave
在我想到只用ax = dx =第一个元素进入循环之前,我将在循环之前使用此代码:
Before I thought of simply entering the loop with ax = dx = first element, I was going to use this code before the loop:
mov ax, [si] ; assume that the list is non-empty
mov dx, [si+2] ; but check if it's only 1 element long, like maxMeth does
test dx, dx
jz .finish
add si, 4 ; first 2 elements are loaded
cmp ax, dx ; sort them so ax <= dx
jng .noswap
xchg ax, dx
.noswap:
构造内部循环的另一种方法是这样的:
Another way to structure the inner loop would be like this:
.maxloop: ; do {
add si, 2
mov cx, [si] ; c = *p++;
test cx, cx
jz .finish ; jz instead of jnz: exit the loop on the sentinel value
cmp cx, ax ; check against 2nd-max, because the common case is less than both.
jng .maxloop
;; .updateMaxes: ;; conditionally fall through into this part
mov ax, cx ; Bump the old 2nd-max unconditionally
cmp ax, dx
jle .maxloop ; if 2nd_max <= max, return to the loop
xchg ax, dx ; otherwise swap
jmp .maxloop
.finish:
这可能更好,因为我们可以进入循环的顶部开始.我们使用jz
跳过了条件更新的内容,从而摆脱了循环,因此我们没有任何分支可以跳过妨碍"的代码. (即,我们已经成功地有效地布置了代码块.)
This is probably better because we can just fall into the top of the loop to start. We get out of the loop with a jz
that skips over the conditional-update stuff, so we don't have any branches that exist only to skip over code that's "in the way". (i.e. we've succeeded at laying out our code blocks efficiently.)
某些CPU唯一的缺点是test/jz和cmp/jg背靠背.当条件分支被两条以上的指令分开时,某些CPU的性能会更好. (例如,除非您很幸运,它们如何在Sandybridge上运行,否则两个分支中的一个不会进行宏保险丝.但是它们会在第一个循环中出现.)
The only downside for some CPUs is that the test/jz and cmp/jg are back to back. Some CPUs do better when conditional branches are separated by a couple more instructions than that. (e.g. unless you get lucky on how these hit the decoders on Sandybridge, one of the two branches won't macro-fuse. But they would with the first loop.)
提醒:堆栈溢出用户贡献已在 cc by-sa 3.0 下获得许可必须注明出处,因此,如果您复制并粘贴我的整个代码,请确保在注释中加入https://stackoverflow.com/questions/47466116/assembly-program-that-finds-second-largest-value-in-array/47467682#47467682
.
Reminder: Stack Overflow user contributions are licensed under cc by-sa 3.0 with attribution required, so if you copy-paste my whole code, make sure you include https://stackoverflow.com/questions/47466116/assembly-program-that-finds-second-largest-value-in-array/47467682#47467682
in a comment.
这篇关于在数组中找到第二个最大值的汇编程序的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!