Indexed addressing mode and implied addressing mode

Question

Indexed addressing mode is usually used for accessing arrays, as arrays are stored contiguously. We have an index register which gets incremented in every iteration and which, when added to the base address, gives the array element address. I don't understand the actual need for this addressing mode. Why can't we do this with direct addressing? We have the base address and we can just add 1 to it every time when accessing. Why do we need indexed addressing mode, which has the overhead of an index register?

I am not sure about the instruction format for implied addressing mode. Suppose we have an instruction INC AC. Is the address of AC specified in the instruction, or is there a special opcode which means 'INC AC', so that we don't include the address of AC (the accumulator)?

Recommended answer


I don't understand the actual need of this addressing mode. Why can't we do this with direct addressing?

You can; MIPS only has one addressing mode (register + 16-bit immediate offset), and compilers can still generate code for it just fine. But sometimes a compiler has to use an extra shift + add instruction to calculate an address (if it's not just looping through an array).

The point of addressing modes is to save instructions and save registers, especially in 2-operand instruction sets like x86, where add eax, ecx overwrites eax with the result (eax += ecx), unlike MIPS or other 3-operand ISAs, where addu $t2, $t1, $t0 does t2 = t1 + t0. On x86, a non-destructive add would require a copy (mov) and an add. (Or in that special case, lea edx, [eax+ecx]: x86 can copy-and-add (and shift) using the same instruction encoding it uses for memory operands.)

Consider a histogram problem: you generate array indices in unpredictable order, and have to index an array. On x86-64, add dword [rbx + rdi*4], 1 will increment a 32-bit counter in memory using a single 4-byte instruction, which decodes to only 2 uops for the front-end to issue into the out-of-order core on modern Intel CPUs (see http://agner.org/optimize/). Here rbx is the base register and rdi is a scaled index. Having a scaled index is very powerful; x86 16-bit addressing modes support 2 registers, but not a scaled index.
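
As a rough illustration of the histogram case, here is a minimal C sketch. The function name histogram and the choice of 8-bit keys with 256 bins are assumptions made for this example, not something from the original answer; the point is only that each update needs the address counts + key*4, which is exactly what a base + scaled-index operand expresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count how often each key value occurs. Keys arrive in arbitrary order,
 * so every update needs a freshly computed address:
 *   counts + key * sizeof(uint32_t)                                   */
void histogram(const uint8_t *keys, size_t n, uint32_t counts[256])
{
    assert(keys != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        /* On x86-64 a compiler may turn this into a single
         * read-modify-write instruction with a scaled-index operand,
         * e.g. add dword [rbx + rdi*4], 1 (assuming counts is in rbx
         * and the key is in rdi). */
        counts[keys[i]] += 1;
    }
}
```

Calling histogram on the keys 3, 1, 3, 3, 0 with a zeroed counts array leaves counts[3] == 3 and counts[0] == counts[1] == 1. Whether a given compiler actually emits the memory-destination add depends on the compiler and its options.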

Classic MIPS only has separate shift and add instructions, although MIPS32 did add a scaled-add instruction for address calculation, which would save an instruction here. Being a load-store machine, MIPS always needs separate instructions for the load and the store (unlike x86, where that add decodes as a micro-fused load+add and a store; see "INC instruction vs ADD 1: Does it matter?").

ARM would probably be a better comparison for MIPS: it's also a load-store RISC machine. But it does have a selection of addressing modes, including scaled index using the barrel shifter. So instead of needing a separate shift / add for each array index, you'd use LDR R0, [R1, R2, LSL #2] / ADD R0, R0, #1 / STR with the same addressing mode.
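
To make the address arithmetic concrete, the small C helper below computes what a base + (index << shift) operand such as [R1, R2, LSL #2] evaluates to. The name scaled_index_address is made up for this sketch; on classic MIPS the same value would take a separate shift and add before the load.

```c
#include <assert.h>
#include <stdint.h>

/* Address formed by a "base + (index << shift)" memory operand.
 * With shift == 2 this is the address of element `index` in an array
 * of 4-byte elements starting at `base`. */
uintptr_t scaled_index_address(uintptr_t base, uintptr_t index, unsigned shift)
{
    assert(shift < 8 * sizeof(uintptr_t));  /* larger shifts are undefined in C */
    return base + (index << shift);
}
```

On a typical flat-address machine, scaled_index_address((uintptr_t)arr, 5, 2) equals (uintptr_t)&arr[5] for an array of uint32_t.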

Often when looping through an array, it is best to just increment pointers on x86. But it's also an option to use an index, especially for loops with multiple arrays using the same index, like C[i] = A[i] + B[i]. Indexed addressing mode can sometimes be slightly less efficient in hardware, though, so when a compiler is unrolling a loop it usually should use pointers, even though it has to increment all 3 pointers separately instead of one index.
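
The two loop shapes discussed above look roughly like this in C. The function names add_indexed and add_pointers are invented for this sketch, and an optimizing compiler is free to transform either form into the other, so treat this as an illustration of the two strategies rather than a prediction of the generated code.

```c
#include <assert.h>
#include <stddef.h>

/* Indexed form: one counter i is shared by all three arrays, so each
 * access is naturally a base + scaled-index operand. */
void add_indexed(int *c, const int *a, const int *b, size_t n)
{
    assert(n == 0 || (a != NULL && b != NULL && c != NULL));
    for (size_t i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}

/* Pointer form: three pointers advance separately, and each access uses
 * a plain one-register address. */
void add_pointers(int *c, const int *a, const int *b, size_t n)
{
    assert(n == 0 || (a != NULL && b != NULL && c != NULL));
    for (; n > 0; n--)
        *c++ = *a++ + *b++;
}
```

Both functions produce the same result; the difference is only in how many registers change per iteration (one index versus three pointers) and in which addressing mode each memory access needs.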

The point of instruction-set design is not merely to be Turing complete, it's to enable efficient code that gets more work done with fewer clock cycles and/or smaller code-size, or give programmers the option of aiming for either of those goals.

The minimum threshold for a computer to be programmable is extremely low; see for example various One instruction set computer architectures. (None implemented for real, just designed on paper to show that it's possible to write programs with nothing but a subtract-and-branch-if-less-than-zero instruction, with memory operands encoded in the instruction.)
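
For a feel of how little such a machine needs, here is a small interpreter sketch in C for the well-known subleq variant, which branches on "less than or equal to zero" rather than strictly "less than zero". The function name subleq_run, the halt-on-negative-target convention, and the step limit are assumptions of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Run a subleq program in place. Each instruction is three cells a, b, c:
 *   mem[b] -= mem[a]; if (mem[b] <= 0) jump to c; else fall through.
 * A negative jump target halts (a convention assumed here).
 * Returns the number of instructions executed, capped at max_steps. */
long subleq_run(long *mem, size_t size, long max_steps)
{
    long pc = 0, steps = 0;
    while (pc >= 0 && (size_t)pc + 2 < size && steps < max_steps) {
        long a = mem[pc], b = mem[pc + 1], c = mem[pc + 2];
        assert(a >= 0 && (size_t)a < size);  /* operands must be valid cells */
        assert(b >= 0 && (size_t)b < size);
        mem[b] -= mem[a];
        pc = (mem[b] <= 0) ? c : pc + 3;
        steps++;
    }
    return steps;
}
```

As a usage example, the 15-cell program {12,14,3, 14,13,6, 14,14,9, 14,14,-1, 7, 5, 0} adds cell 12 (X = 7) into cell 13 (Y = 5) using cell 14 as scratch space: it computes Z = -X, then Y -= Z, then clears Z and halts, leaving Y = 12. Every memory operand is a direct address embedded in the instruction, which is why such designs need self-modifying code to walk an array.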

There's a tradeoff between easy to decode (especially to decode in parallel) vs. compact. x86 is horrible because it evolved as a series of extensions, often without a lot of planning to leave room for future extensions. If you're interested in ISA design decisions, have a look at Agner Fog's blog for interesting discussion about designing an ISA for high-performance CPUs that combines the best of x86 (lots of work with one instruction, e.g. memory operand as part of an ALU instruction) with the best features of RISC (easy to decode, lots of registers): Proposal for an ideal extensible instruction set.

There's also a tradeoff in how you spend the bits in an instruction word, especially in a fixed instruction width ISA like most RISCs. Different ISAs made different choices.


  • PowerPC uses lots of the coding space for powerful bitfield instructions like rlwinm (rotate left and mask off a window of bits), and lots of opcodes. IDK if the generally unpronounceable and hard-to-remember mnemonics are related to that...
  • ARM uses the high 4 bits for predicated execution of any instruction based on condition codes. It uses more bits for the barrel shifter (the 2nd source operand is optionally shifted or rotated by an immediate or a count from another register).
  • MIPS has relatively large immediate operands, and is basically simple.

x86 32/64-bit addressing modes use a variable-length encoding, with an extra SIB (scale/index/base) byte when there's an index, and an optional disp8 or disp32 displacement. (For example, add esi, [rax + rdx + 12340] takes 2 + 1 + 4 bytes to encode, vs. 2 bytes for add esi, [rax].)
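
The byte counts in that example can be reproduced with a small C sketch. The function names are invented here, and the model is deliberately simplified: it assumes a one-byte opcode with no prefixes and no immediate, a base register that is always present, and a base that is not rsp/r12 (which always need a SIB byte) or rbp/r13 (which always need a displacement).

```c
#include <assert.h>
#include <stdint.h>

/* Bytes used by the memory operand: ModR/M + optional SIB + optional
 * displacement. index_scale is 0 when there is no index register,
 * otherwise the scale factor 1, 2, 4 or 8. */
int mem_operand_bytes(int index_scale, int32_t disp)
{
    assert(index_scale == 0 || index_scale == 1 || index_scale == 2 ||
           index_scale == 4 || index_scale == 8);
    int bytes = 1;                              /* ModR/M */
    if (index_scale != 0)
        bytes += 1;                             /* SIB: scale/index/base */
    if (disp != 0)
        bytes += (disp >= -128 && disp <= 127) ? 1 : 4;  /* disp8 or disp32 */
    return bytes;
}

/* Total length under the simplifying assumptions above. */
int insn_bytes(int index_scale, int32_t disp)
{
    return 1 + mem_operand_bytes(index_scale, disp);  /* 1 opcode byte */
}
```

With these assumptions, insn_bytes(1, 12340) gives 7 for add esi, [rax + rdx + 12340] (opcode and ModR/M, plus SIB, plus disp32), and insn_bytes(0, 0) gives 2 for add esi, [rax].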

x86 16-bit addressing modes are much more limited, and pack everything except the optional disp8/disp16 displacement into the ModR/M byte.


Suppose we have an instruction INC AC. Is the address of AC specified in the instruction or is there a special opcode which means 'INC AC' and we don't include the address of AC (accumulator)?

Yes, the machine-code format for some instructions in some ISAs includes implicit operands. Many machines have push / pop instructions that implicitly use a specific register as the stack pointer. For example, in x86-64's push rax, RAX is an explicit register operand (encoded in the low 3 bits of the one-byte opcode using the push r64 short form), while RSP is an implicit operand.
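
The push r64 short form can be decoded with a few lines of C, sketched below. The function and array names are made up for this example, and it only handles the one-byte opcodes 0x50 to 0x57 without a REX prefix (so r8 to r15 are out of scope).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register names in x86-64 encoding order (register numbers 0-7). */
static const char *const reg64_names[8] = {
    "rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"
};

/* If opcode is the one-byte push r64 short form (0x50-0x57), return the
 * number of the explicit register operand from its low 3 bits;
 * otherwise return -1. RSP never appears in the encoding: its use as
 * the stack pointer is implied by the opcode itself. */
int push_short_form_reg(uint8_t opcode)
{
    if ((opcode & 0xF8) != 0x50)
        return -1;
    return opcode & 0x07;
}

/* Map a register number 0-7 to its name. */
const char *reg64_name(int reg)
{
    assert(reg >= 0 && reg < 8);
    return reg64_names[reg];
}
```

For instance, opcode 0x50 decodes to register 0 (rax) and 0x53 to register 3 (rbx), while 0x58 is not a push at all (it is the start of the pop r64 range), so the function returns -1.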

Older 8-bit CPUs often had instructions like DECA (to decrement the accumulator, A), i.e. there was a specific opcode for that register. This could be the same thing as having a DEC instruction with some bits in the opcode byte specifying which register (like x86 did before x86-64 repurposed the short INC/DEC encodings as REX prefixes: note the "N.E." (Not Encodable) in the 64-bit mode column for dec r32). But if there's no regular pattern, then it can definitely be considered an implicit operand.
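
To answer the INC AC question in code form, the toy interpreter below models an accumulator machine. The opcode values and the machine itself are entirely hypothetical, invented for this sketch and not taken from any real 8-bit CPU: INCA and DECA are one byte long because the opcode alone identifies the accumulator, whereas ADD carries an explicit one-byte memory address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes, invented for this sketch. */
enum { OP_HALT = 0x00, OP_INCA = 0x01, OP_DECA = 0x02, OP_ADD_DIRECT = 0x03 };

/* Execute code and return the final accumulator value.
 * INCA/DECA: one byte, the accumulator is an implied operand.
 * ADD:       two bytes, opcode plus an explicit direct address. */
uint8_t run_accumulator_machine(const uint8_t *code, size_t len,
                                const uint8_t mem[256])
{
    uint8_t acc = 0;
    size_t pc = 0;
    while (pc < len) {
        switch (code[pc++]) {
        case OP_INCA:
            acc++;
            break;
        case OP_DECA:
            acc--;
            break;
        case OP_ADD_DIRECT:
            assert(pc < len);           /* the operand byte must exist */
            acc += mem[code[pc++]];
            break;
        case OP_HALT:
        default:                        /* unknown opcodes also stop */
            return acc;
        }
    }
    return acc;
}
```

Running the six-byte program INCA, INCA, ADD 10, DECA, HALT with mem[10] == 40 yields 41. Nothing in the INCA or DECA encoding names the accumulator; the decoder knows it from the opcode alone.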

Sometimes putting things into neat categories breaks down, so don't worry too much about whether using bits within the opcode byte counts as implicit or explicit for x86. It's a way of spending more opcode space to save code size for commonly used instructions while still allowing use with different registers.

Some ISAs only use a certain register as the stack pointer by convention, with no implicit uses. MIPS is like this.

ARM32 (in ARM mode, not Thumb mode) also uses explicit operands in push/pop. Its push/pop mnemonics are just aliases for store-multiple decrement-before / load-multiple increment-after (STMDB / LDMIA), which implement a full-descending stack. See ARM's docs for LDM/STM, which explain this and what you can do with the general case of these instructions, e.g. LDMDB to decrement a pointer and then load (in the opposite direction of POP).
