Reading from memory in 8086 real mode while using 'ORG 0x0000'


Problem description


I've been messing around with x86-16 assembly and running it in VirtualBox. For some reason, when I read from memory and try to print the value as a character, I get completely different results from what I expected. However, when I hard-code the character as part of the instruction, it works fine. Here's the code:

ORG 0
BITS 16

push word 0xB800        ; Address of text screen video memory in real mode for colored monitors
push cs
pop ds                  ; ds = cs
pop es                  ; es = 0xB800
jmp start

; input = di (position*2), ax (character and attributes)
putchar:
    stosw
    ret

; input = si (NUL-terminated string)
print:
    cli
    cld
    .nextChar:
        lodsb   ; mov al, [ds:si] ; si += 1
        test al, al
        jz .finish
        call putchar
        jmp .nextChar
    .finish:
        sti
        ret

start:
    mov ah, 0x0E
    mov di, 8

    ; should print P
    mov al, byte [msg]
    call putchar

    ; should print A
    mov al, byte [msg + 1]
    call putchar

    ; should print O
    mov al, byte [msg + 2]
    call putchar

    ; should print !
    mov al, byte [msg + 3]
    call putchar

    ; should print X
    mov al, 'X'
    call putchar

    ; should print Y
    mov al, 'Y'
    call putchar

    cli
    hlt

msg: db 'PAO!', 0

; Fill the rest of the bytes up to byte 510 with 0s
times 510 - ($ - $$) db 0

; Header
db 0x55
db 0xAA

The print label and the instructions in it can be ignored: I haven't used it yet because of the problem I've been having printing a character stored in memory. I've assembled the code with both FASM and NASM and get the same problem, which means it's obviously my fault.

It prints something like:

Solution

The ORG Directive

When you specify an ORG directive such as ORG 0x0000 at the top of your assembler program and use BITS 16, you are telling NASM that the absolute offsets it generates when resolving code and data labels are based on the starting offset given in ORG (in 16-bit code an offset is limited to a WORD, i.e. 2 bytes).

If you have ORG 0x0000 at the start and place a label start: at the beginning of the code, start will have an absolute offset of 0x0000. If you use ORG 0x7C00 then the label start will have an absolute offset of 0x7c00. This will apply to any data labels and code labels.
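The rule in the last paragraph can be sketched as a one-line calculation. This is only an illustration in Python, not anything NASM exposes: `label_offset` is a hypothetical helper, and the file position 0x24 is borrowed from where msg ends up in the disassembly shown later in this answer.

```python
# Sketch of how a flat-binary assembler resolves a label:
# encoded offset = ORG value + the label's position in the output file.
# `label_offset` is a hypothetical helper for illustration, not a NASM feature.

def label_offset(org: int, file_position: int) -> int:
    """Return the 16-bit offset that gets encoded for a label."""
    return (org + file_position) & 0xFFFF

MSG_FILE_POSITION = 0x24  # assumed: where `msg` sits in boot.bin

for org in (0x0000, 0x7C00):
    offset = label_offset(org, MSG_FILE_POSITION)
    print(f"ORG {org:#06x}: msg resolves to offset {offset:#06x}")
```

The bytes in the file are identical in both cases apart from the encoded offsets; ORG changes only the numbers the assembler writes into instructions that refer to labels.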

We can simplify your example to see what is going on in the generated code when dealing with a data variable and a hard-coded character. Although this code doesn't perform exactly the same actions as your code, it is close enough to show what works and what doesn't.

Example using ORG 0x0000:

BITS 16
ORG 0x0000

start:
    push cs
    pop  ds      ; DS=CS
    push 0xb800
    pop es       ; ES = 0xB800 (video memory)
    mov ah, 0x0E ; AH = Attribute (yellow on black)

    mov al, byte [msg]
    mov [es:0x00], ax   ; This should print letter 'P'
    mov al, byte [msg+1]
    mov [es:0x02], ax   ; This should print letter 'A'
    mov al, 'O'
    mov [es:0x04], ax   ; This should print letter 'O'
    mov al, '!'
    mov [es:0x06], ax   ; This should print letter '!'

    cli
    hlt

msg: db "PA"

; Bootsector padding
times 510-($-$$) db 0
dw 0xAA55

If you were to run this on VirtualBox, the first 2 characters would be garbage while O! would display correctly. I will use this example throughout the rest of this answer.
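For readers wondering what the listing's mov [es:0x02], ax stores actually write, here is a rough Python sketch of the text-mode cell layout the example relies on: the attribute in the high byte (AH), the character in the low byte (AL), and two bytes per screen position. The helper names are made up for illustration.

```python
# Sketch of a colour text-mode screen cell as used by the example above.
# `text_cell` and `cell_offset` are hypothetical helper names.

def text_cell(char: str, attribute: int) -> int:
    """Value of AX: attribute in AH (high byte), character code in AL."""
    return ((attribute & 0xFF) << 8) | (ord(char) & 0xFF)

def cell_offset(position: int) -> int:
    """Each cell is one 16-bit word, so the offset is position * 2."""
    return position * 2

ax = text_cell("P", 0x0E)             # 0x0E = yellow on black
stored = ax.to_bytes(2, "little")     # x86 stores the low byte first
print(f"AX = {ax:#06x}, bytes in video memory: {stored.hex(' ')}")
print(f"second character goes to ES:{cell_offset(1):#06x}")
```

This also explains the mov di, 8 in the question: offset 8 is the fifth character cell on the first row.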


VirtualBox / CS:IP / Segment:Offset Pairs

In the case of VirtualBox, it effectively does the equivalent of a FAR JMP to 0x0000:0x7c00 after loading the boot sector at physical address 0x00007c00. A FAR JMP (or equivalent) not only jumps to a given address, it also sets CS and IP to the values specified. A FAR JMP to 0x0000:0x7c00 sets CS = 0x0000 and IP = 0x7c00.

If you are unfamiliar with the calculations behind 16-bit segment:offset pairs and how they map to a physical address, then this document (https://web.archive.org/web/20150908174936/http://thestarman.pcministry.com/asm/debug/Segments.html) is a reasonably good starting point for understanding the concept. The general equation to get a physical memory address from a 16-bit segment:offset pair is (segment<<4)+offset = 20-bit physical address.
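As a quick way to experiment with that equation, here is a minimal Python sketch. The function name is my own, and the final mask assumes 8086-style wrap-around at 20 bits (later CPUs with the A20 line enabled do not wrap).

```python
# Minimal sketch of the real-mode address calculation:
# physical = (segment << 4) + offset, kept to 20 bits as on an 8086.
# `physical_address` is a hypothetical helper name.

def physical_address(segment: int, offset: int) -> int:
    return (((segment & 0xFFFF) << 4) + (offset & 0xFFFF)) & 0xFFFFF

for segment, offset in ((0x0000, 0x7C00), (0x07C0, 0x0000)):
    phys = physical_address(segment, offset)
    print(f"{segment:#06x}:{offset:#06x} -> {phys:#07x}")
```

Both pairs come out as 0x07c00, which is the point the rest of this answer builds on.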

Since VirtualBox uses CS:IP of 0x0000:0x7c00 it would start executing code at a physical address of (0x0000<<4)+0x7c00 = 20-bit physical address 0x07c00 . Please be aware that this isn't guaranteed to be the case in all environments. Because of the nature of segment:offset pairs, there is more than one way to reference physical address 0x07c00. See the section at the end of this answer on ways to handle this properly.
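To make "more than one way" concrete, the following Python sketch enumerates every segment:offset pair that reaches physical address 0x07c00. It is an illustration only, and it assumes no wrap-around past 1 MiB, so only segments at or below 0x07c0 are considered.

```python
# Enumerate the segment:offset pairs that alias physical address 0x07c00.
# Assumption: addresses do not wrap at 1 MiB, so the segment base must not
# exceed the target address.

TARGET = 0x07C00

aliases = [
    (segment, TARGET - (segment << 4))
    for segment in range(0x10000)
    if 0 <= TARGET - (segment << 4) <= 0xFFFF
]

print(f"{len(aliases)} pairs reach {TARGET:#07x}")
print("first:", tuple(hex(v) for v in aliases[0]))
print("last: ", tuple(hex(v) for v in aliases[-1]))
```

The two ends of that list are exactly the two conventions discussed in this answer: 0x0000:0x7c00 and 0x07c0:0x0000.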


What is Going Wrong with Your Bootloader?

Assuming we are using VirtualBox and the information above in the previous section is considered correct, then CS = 0x0000 and IP = 0x7c00 upon entry to our bootloader. If we take the example code (Using ORG 0x0000) I wrote in the first section of this answer and look at the disassembled information (I'll use objdump output) we'd see this:

objdump -Mintel -mi8086 -D -b binary --adjust-vma=0x0000 boot.bin

00000000 <.data>:
   0:   0e                      push   cs
   1:   1f                      pop    ds
   2:   68 00 b8                push   0xb800
   5:   07                      pop    es
   6:   b4 0e                   mov    ah,0xe
   8:   a0 24 00                mov    al,ds:0x24
   b:   26 a3 00 00             mov    es:0x0,ax
   f:   a0 25 00                mov    al,ds:0x25
  12:   26 a3 02 00             mov    es:0x2,ax
  16:   b0 4f                   mov    al,0x4f
  18:   26 a3 04 00             mov    es:0x4,ax
  1c:   b0 21                   mov    al,0x21
  1e:   26 a3 06 00             mov    es:0x6,ax
  22:   fa                      cli
  23:   f4                      hlt
  24:   50                      push   ax          ; Letter 'P'
  25:   41                      inc    cx          ; Letter 'A'
        ...
 1fe:   55                      push   bp
 1ff:   aa                      stos   BYTE PTR es:[di],al

Since the ORG information is lost when assembling to a binary file, I use --adjust-vma=0x0000 so that the first column of values (the memory address) starts at 0x0000. I want to do this because I used ORG 0x0000 in the original assembler code. I have also added some comments to the output to show where our data section is (and where the letters P and A were placed after the code).

If you were to run this program in VirtualBox the first 2 characters will come out as gibberish. So why is that? First recall VirtualBox reached our code by setting CS to 0x0000 and IP to 0x7c00. This code then copied CS to DS:

   0:   0e                      push   cs
   1:   1f                      pop    ds

Since CS was zero, then DS is zero. Now let us look at this line:

   8:   a0 24 00                mov    al,ds:0x24

ds:0x24 is the encoded address of the msg variable in our data section. The byte at offset 0x24 has the value P in it (0x25 has A). You might already see where things go wrong. Our DS = 0x0000, so mov al,ds:0x24 is really the same as mov al,0x0000:0x24. This syntax isn't valid, but I'm replacing DS with 0x0000 to make a point. 0x0000:0x24 is where our code will attempt to read the letter P from when it executes. But wait! That is physical address (0x0000<<4)+0x24 = 0x00024. This address lies at the bottom of memory, inside the interrupt vector table (which occupies physical addresses 0x00000 to 0x003FF). Clearly this is not what we intended!
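To see exactly what the bad read hits, here is a small Python sketch that maps a physical address onto the real-mode interrupt vector table, which holds 256 entries of 4 bytes each starting at address 0. The helper name is hypothetical.

```python
# Sketch: which interrupt vector does a low physical address fall into?
# The real-mode IVT holds 256 four-byte entries (offset word, then segment
# word) starting at physical address 0. `ivt_slot` is a hypothetical helper.

IVT_ENTRY_SIZE = 4
IVT_ENTRIES = 256

def ivt_slot(physical: int):
    """Return (vector number, byte within the entry), or None if outside the IVT."""
    if 0 <= physical < IVT_ENTRY_SIZE * IVT_ENTRIES:
        return divmod(physical, IVT_ENTRY_SIZE)
    return None

for address in (0x00024, 0x00025, 0x07C24):
    print(f"{address:#07x} -> {ivt_slot(address)}")
```

So with DS = 0x0000 the two loads fetch the first two bytes of the vector for interrupt 9 rather than 'P' and 'A', which is why the screen shows whatever that handler's offset happens to be.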

There are a couple of ways to tackle this issue. The easiest (and preferred) method is to place the proper segment into DS ourselves, and not rely on what CS might be when our program runs. Since we set an ORG of 0x0000, we need a Data Segment (DS) of 0x07c0: the segment:offset pair 0x07c0:0x0000 is physical address 0x07c00, which is where our bootloader is loaded. So all we have to do is amend the code by replacing:

    push cs
    pop  ds      ; DS=CS

With:

    push 0x07c0
    pop  ds      ; DS=0x07c0 

This change should provide the correct output when run in VirtualBox . Now let us see why. This code didn't change:

   8:   a0 24 00                mov    al,ds:0x24

Now, when it executes, DS=0x07c0. The instruction behaves like mov al,0x07c0:0x24, which translates into a physical address of (0x07c0<<4)+0x24 = 0x07c24. This is what we want, since the BIOS physically placed our bootloader in memory starting at 0x07c00, so the instruction references our msg variable correctly.

Moral of the story? Whatever you use for ORG, there should be a matching value in the DS register when our program starts. We should set it explicitly, and not rely on what is in CS.
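That pairing between ORG and DS can be written down as a small calculation. The Python sketch below is my own illustration (the helper `ds_for_org` is hypothetical), assuming the boot sector is loaded at physical address 0x07c00 and that DS should make label offsets land on the loaded bytes.

```python
# Sketch: which DS value matches a given ORG for a boot sector loaded at
# physical address 0x07c00? We need (DS << 4) + ORG == 0x07c00.
# `ds_for_org` is a hypothetical helper, not part of any tool.

LOAD_ADDRESS = 0x07C00

def ds_for_org(org: int) -> int:
    delta = LOAD_ADDRESS - org
    if delta < 0 or delta % 16 != 0:
        raise ValueError(f"no segment value matches ORG {org:#06x}")
    return delta >> 4

for org in (0x0000, 0x7C00):
    print(f"ORG {org:#06x} -> DS = {ds_for_org(org):#06x}")
```

The two values it prints, 0x07c0 and 0x0000, are the two fixes used in the rest of this answer.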


Why Do Immediate Values Print?

With the original code, the first 2 characters printed as gibberish, but the last two didn't. The previous section explained why the first 2 characters wouldn't print, but what about the last 2 characters that did?

Let us examine the disassembly of the 3rd character O more carefully:

  16:   b0 4f                   mov    al,0x4f        ; 0x4f = 'O'

Since we used an immediate (constant) value and moved it into register AL, the character itself is encoded as part of the instruction. It doesn't rely on a memory access via the DS register. Because of this the last 2 characters displayed properly.
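The difference is visible in the raw bytes from the disassembly. As a rough illustration, this Python sketch decodes just the two encodings that matter here: opcode 0xB0 (mov al, imm8) and opcode 0xA0 (mov al, [moffs16]). It is not a general disassembler, and `describe` is a made-up name.

```python
# Sketch: decode the two `mov al, ...` encodings seen in the disassembly.
# Only opcodes 0xB0 and 0xA0 are handled; anything else is out of scope.

def describe(encoding: bytes) -> str:
    opcode = encoding[0]
    if opcode == 0xB0:
        # mov al, imm8: the character is the second byte of the instruction.
        value = encoding[1]
        return f"mov al, {value:#04x} (immediate {chr(value)!r}, no memory read)"
    if opcode == 0xA0:
        # mov al, [moffs16]: a little-endian offset, read through DS.
        offset = int.from_bytes(encoding[1:3], "little")
        return f"mov al, ds:{offset:#06x} (memory read, depends on DS)"
    raise ValueError("opcode not covered by this sketch")

print(describe(bytes.fromhex("b04f")))
print(describe(bytes.fromhex("a02400")))
```

Only the second form involves a segment register, so only the second form can be broken by a wrong DS.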


Ross Ridge's Suggestion and Why it Works in VirtualBox

Ross Ridge suggested we use ORG 0x7c00, and you observed that it worked. Why did that happen? And is that solution ideal?

Take my very first example, change ORG 0x0000 to ORG 0x7c00, and then assemble it. objdump would produce this disassembly:

objdump -Mintel -mi8086 -D -b binary  --adjust-vma=0x7c00 boot.bin

boot.bin:     file format binary   
Disassembly of section .data:

00007c00 <.data>:
    7c00:       0e                      push   cs
    7c01:       1f                      pop    ds
    7c02:       68 00 b8                push   0xb800
    7c05:       07                      pop    es
    7c06:       b4 0e                   mov    ah,0xe
    7c08:       a0 24 7c                mov    al,ds:0x7c24
    7c0b:       26 a3 00 00             mov    es:0x0,ax
    7c0f:       a0 25 7c                mov    al,ds:0x7c25
    7c12:       26 a3 02 00             mov    es:0x2,ax
    7c16:       b0 4f                   mov    al,0x4f
    7c18:       26 a3 04 00             mov    es:0x4,ax
    7c1c:       b0 21                   mov    al,0x21
    7c1e:       26 a3 06 00             mov    es:0x6,ax
    7c22:       fa                      cli
    7c23:       f4                      hlt
    7c24:       50                      push   ax          ; Letter 'P'
    7c25:       41                      inc    cx          ; Letter 'A'
        ...
    7dfe:       55                      push   bp
    7dff:       aa                      stos   BYTE PTR es:[di],al

VirtualBox set CS to 0x0000 when it jumped to our bootloader. Our original code then copied CS to DS, so DS = 0x0000. Now observe what the ORG 0x7c00 directive has done to our generated code:

    7c08:       a0 24 7c                mov    al,ds:0x7c24

Notice how we are now using an offset of 0x7c24! This would be like mov al,0x0000:0x7c24 which is physical address (0x0000<<4)+0x7c24 = 0x07c24. That is the right memory location where the bootloader was loaded, and is the proper position of our msg string. So it works.

Is using ORG 0x7c00 a bad idea? No. It is fine. But we have a subtle issue to contend with. What happens if another virtual PC environment or real hardware doesn't FAR JMP to our bootloader using a CS:IP of 0x0000:0x7c00? This is possible. There are many physical PCs with a BIOS that actually does the equivalent of a far jump to 0x07c0:0x0000. That too is physical address 0x07c00, as we have already seen. In that environment, CS = 0x07c0 when our code runs. If we use the original code that copies CS to DS, DS now holds 0x07c0 too. Now observe what would happen to this code in that situation:

    7c08:       a0 24 7c                mov    al,ds:0x7c24

DS=0x07c0 in this scenario. We now have something resembling mov al,0x07c0:0x7c24 when the program actually runs. Uh-oh, that looks bad. What does that translate to as a physical address? (0x07c0<<4)+0x7c24 = 0x0F824. That is somewhere above our bootloader, and it will contain whatever happens to be there after the computer boots. Likely zeros, but it should be assumed to be garbage. Clearly not where our msg string was loaded!

So how do we resolve this? To amend what Ross Ridge suggested, and to heed the advice I gave earlier about explicitly setting DS to the segment we really want (don't assume CS is correct and then blindly copy it to DS), we should place 0x0000 into DS when our bootloader starts if we use ORG 0x7c00. So we can change this code:

ORG 0x7c00

start:
    push cs
    pop  ds      ; DS=CS

to:

ORG 0x7c00

start:
    xor ax, ax   ; ax=0x0000
    mov ds, ax   ; DS=0x0000

Here we don't rely on an untrusted value in CS. We simply set DS to the segment value that makes sense given the ORG we used. You could have pushed 0x0000 and popped it into DS as you have been doing. I am more accustomed to zeroing out a register and moving that to DS.

With this approach, it doesn't matter what value of CS was used to reach our bootloader; the code still references the appropriate memory location for our data.
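The whole argument can be checked mechanically. The Python sketch below is an illustration under stated assumptions (boot sector loaded at 0x07c00, msg at file position 0x24, 8086-style 20-bit addresses); it compares the "copy CS into DS" approach with the "set DS explicitly" approach for both BIOS entry conventions and both ORG values.

```python
# Sketch: does `mov al, [msg]` reach the loaded msg byte?
# Assumptions: boot sector at physical 0x07c00, msg at file position 0x24.

LOAD_ADDRESS = 0x07C00
MSG_FILE_POSITION = 0x24

def physical_address(segment: int, offset: int) -> int:
    return ((segment << 4) + offset) & 0xFFFFF

def reads_msg(org: int, ds: int) -> bool:
    encoded_offset = org + MSG_FILE_POSITION      # what the assembler emits
    return physical_address(ds, encoded_offset) == LOAD_ADDRESS + MSG_FILE_POSITION

results = {}
for org in (0x0000, 0x7C00):
    explicit_ds = (LOAD_ADDRESS - org) >> 4       # 0x07c0 or 0x0000
    for bios_cs in (0x0000, 0x07C0):              # the two entry conventions
        copied_ok = reads_msg(org, bios_cs)       # DS = CS
        explicit_ok = reads_msg(org, explicit_ds) # DS set by us
        results[(org, bios_cs)] = (copied_ok, explicit_ok)
        print(f"ORG {org:#06x}, BIOS CS {bios_cs:#06x}: "
              f"copy CS -> {copied_ok}, explicit DS -> {explicit_ok}")
```

Copying CS works only when the BIOS convention happens to match the chosen ORG, while the explicit DS is correct in all four combinations.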


Don't Assume 1st Stage is Invoked by BIOS with CS:IP=0x0000:0x7c00

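In the General Bootloader Tips that I wrote in a previous Stack Overflow answer (http://stackoverflow.com/questions/32701854/boot-loader-doesnt-jump-to-kernel-code/32705076#32705076), tip #1 is very important: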

  • When the BIOS jumps to your code you can't rely on the CS, DS, ES, SS, SP registers having valid or expected values. They should be set up appropriately when your bootloader starts. You can only be guaranteed that your bootloader will be loaded and run from physical address 0x07c00 and that the boot drive number is loaded into the DL register.

The BIOS could have FAR JMP'ed (or equivalent) to our code with jmp 0x07c0:0x0000, and some emulators and real hardware do it this way. Others use jmp 0x0000:0x7c00 like VirtualBox does.

We should account for this by setting DS explicitly to what we need, and set it to what makes sense for the value we use in our ORG directive.


Summary

Don't assume CS holds the value we expect, and don't blindly copy CS to DS. Set DS explicitly.

Your code could be fixed to keep ORG 0x0000 as you originally had it, provided we set DS to 0x07c0 as previously discussed. That could look like:

ORG 0
BITS 16

push word 0xB800        ; Address of text screen video memory in real mode for colored monitors
push 0x07c0
pop ds                  ; DS=0x07c0 since we use ORG 0x0000
pop es

Alternatively we could have used ORG 0x7c00 like this:

ORG 0x7c00
BITS 16

push word 0xB800        ; Address of text screen video memory in real mode for colored monitors
push 0x0000
pop ds                  ; DS=0x0000 since we use ORG 0x7c00
pop es
