Segmentation registers use

Question

I am trying to understand how memory management works at a low level and have a couple of questions.

1) A book about assembly language by Kip R. Irvine says that in real mode the first three segment registers are loaded with the base addresses of the code, data, and stack segments when the program starts. This is a bit ambiguous to me. Are these values specified manually, or does the assembler generate instructions to write the values into the registers? If it happens automatically, how does it find out the size of these segments?

2) I know that Linux uses a flat linear model, i.e. it uses segmentation in a very limited way. Also, according to "Understanding the Linux Kernel" by Daniel P. Bovet and Marco Cesati, there are four main segments in the GDT: user data, user code, kernel data and kernel code. All four segments have the same size and base address. I do not understand why four of them are needed if they differ only in type and access rights (they all produce the same linear address, right?). Why not use just one of them and write its descriptor to all segment registers?

3) How do operating systems that do not use segmentation divide programs into logical segments? For example, how do they differentiate the stack from the code without segment descriptors? I read that paging can be used to handle such things, but I don't understand how.

Recommended answer

Expanding on Beenoit's answer to question 3.

The division of programs into logical parts such as code, constant data, modifiable data and stack is done by different agents at different points in time.

First, your compiler (and linker) creates executable files where this division is specified. If you look at a number of executable file formats (PE, ELF, etc.), you'll see that they support some kind of sections or segments, or whatever you want to call them. Besides addresses, sizes and locations within the file, those sections bear attributes telling the OS what they are for, e.g. this section contains code (and here's the entry point), this one contains initialized constant data, that one contains uninitialized data (typically not taking space in the file), here's something about the stack, over there is the list of dependencies (e.g. DLLs), etc.

Next, when the OS starts executing the program, it parses the file to see how much memory the program needs, where each section has to be placed, and what memory protection each section needs. The protection is commonly enforced via page tables. The code pages are marked as executable and read-only, the constant data pages are marked as not executable and read-only, and other data pages (including those of the stack) are marked as not executable and read-write. This is how it ought to be normally.
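As a rough illustration of how such attributes translate into page protections, here is a minimal Python sketch that decodes the permission bits of an ELF program header. The bit values PF_X, PF_W and PF_R come from the ELF format; the `describe_protection` helper and the example segment table are hypothetical and only mirror the typical layout described above.

```python
# ELF program-header permission bits (values defined by the ELF format).
PF_X = 0x1  # segment is executable
PF_W = 0x2  # segment is writable
PF_R = 0x4  # segment is readable


def describe_protection(p_flags: int) -> str:
    """Render ELF segment flags as an 'rwx'-style string (hypothetical helper)."""
    return "".join(
        letter if p_flags & bit else "-"
        for letter, bit in (("r", PF_R), ("w", PF_W), ("x", PF_X))
    )


# Hypothetical segments of a typical program, for illustration only.
segments = {
    "code": PF_R | PF_X,
    "constant data": PF_R,
    "writable data": PF_R | PF_W,
    "stack": PF_R | PF_W,
}

for name, flags in segments.items():
    print(f"{name:14} {describe_protection(flags)}")
```

A real loader would read these flags from the file and pass the equivalent protection to the page-table code; the sketch only shows the decoding step.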

Programs often need regions that are both read-write and executable, either for dynamically generated code or just to be able to modify the existing code. The combined RWX access can be either specified in the executable file or requested at run time.

There can be other special pages, such as guard pages for dynamic stack expansion, which are placed next to the stack pages. For example, your program starts with enough pages allocated for a 64KB stack. When the program tries to access memory beyond that point, the OS intercepts the access to the guard pages, allocates more pages for the stack (up to the maximum supported size) and moves the guard pages further. These pages don't need to be specified in the executable file; the OS can handle them on its own. The file should only specify the stack size(s) and perhaps the location.
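The guard-page mechanism can be modelled in a few lines. The following is a toy Python simulation, not how any particular OS implements it: the `GrowableStack` class, the 4096-byte page size, the single guard page and the example addresses are all assumptions made for illustration.

```python
PAGE_SIZE = 4096  # assumed page size


class GrowableStack:
    """Toy model of a downward-growing stack with one guard page below it."""

    def __init__(self, top: int, initial_pages: int, max_pages: int):
        self.top = top                        # highest stack address (exclusive)
        self.committed_pages = initial_pages  # pages currently backed by memory
        self.max_pages = max_pages            # maximum supported stack size

    @property
    def lowest_committed(self) -> int:
        return self.top - self.committed_pages * PAGE_SIZE

    @property
    def guard_page(self) -> int:
        # Start address of the guard page, directly below the committed pages.
        return self.lowest_committed - PAGE_SIZE

    def access(self, address: int) -> str:
        if self.lowest_committed <= address < self.top:
            return "ok"
        if self.guard_page <= address < self.lowest_committed:
            if self.committed_pages >= self.max_pages:
                return "stack overflow"
            # The old guard page becomes a stack page; the guard moves down.
            self.committed_pages += 1
            return "grown"
        return "fault"


# Hypothetical 64KB initial stack (16 pages) that may grow to 18 pages.
stack = GrowableStack(top=0x8000_0000, initial_pages=16, max_pages=18)
just_below = stack.lowest_committed - 8
print(stack.access(just_below))  # touches the guard page, so the stack grows
print(stack.access(just_below))  # the same address is now ordinary stack memory
```

Note that an access far below the guard page is reported as a plain fault in this model, which is why a single access that jumps over the guard page would not grow the stack.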

If there's no hardware or code in the OS to distinguish code memory from data memory or to enforce memory access rights, the division is very formal. 16-bit real-mode DOS programs (COM and EXE) didn't have code, data and stack segments marked in some special way. COM programs had everything in one common 64KB segment and started with IP=0x100 and SP=0xFFxx. The order of code and data inside could be arbitrary, and they could intertwine practically freely. DOS EXE files only specified the starting CS:IP and SS:SP locations, and beyond that the code, data and stack segments were indistinguishable to DOS. All it needed to do was load the file, perform relocation (for EXEs only), set up the PSP (Program Segment Prefix, containing the command line parameters and some other control info), and load SS:SP and CS:IP. It could not protect memory because memory protection isn't available in real address mode, and so the 16-bit DOS executable formats were very simple.
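To make the "only CS:IP and SS:SP" point concrete, here is a minimal Python sketch that reads the first 14 words of a DOS MZ (EXE) header and computes the initial register values. The header layout is that of the MZ format; the field names, the helper functions and the load segment 0x1000 are hypothetical choices for illustration, and the sketch ignores relocation entries and the PSP.

```python
import struct

# The first 14 little-endian 16-bit words of a DOS MZ (EXE) header.
MZ_HEADER = struct.Struct("<14H")
FIELDS = (
    "magic", "bytes_in_last_page", "pages", "relocations",
    "header_paragraphs", "min_alloc", "max_alloc",
    "ss", "sp", "checksum", "ip", "cs",
    "reloc_table_offset", "overlay",
)


def parse_mz_header(data: bytes) -> dict:
    """Unpack the MZ header fields into a dict (field names are our own)."""
    header = dict(zip(FIELDS, MZ_HEADER.unpack_from(data)))
    if header["magic"] != 0x5A4D:  # the bytes 'M', 'Z'
        raise ValueError("not an MZ executable")
    return header


def linear_address(segment: int, offset: int) -> int:
    """Real-mode address translation: segment * 16 + offset."""
    return (segment << 4) + offset


def initial_registers(header: dict, load_segment: int) -> dict:
    """CS and SS in the header are relative to where the image is loaded."""
    cs = (header["cs"] + load_segment) & 0xFFFF
    ss = (header["ss"] + load_segment) & 0xFFFF
    return {"cs:ip": (cs, header["ip"]), "ss:sp": (ss, header["sp"])}


# Synthetic header with made-up values, for illustration only.
raw = struct.pack("<14H", 0x5A4D, 0, 1, 0, 2, 0, 0xFFFF,
                  0x0010, 0x0100, 0, 0x0000, 0x0000, 0x1C, 0)
regs = initial_registers(parse_mz_header(raw), load_segment=0x1000)
print(regs)
print(hex(linear_address(*regs["ss:sp"])))
```

Nothing in the header says which bytes are code and which are data; the loader only learns where execution and the stack begin.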
