What does the brk() system call do?

Question

According to the Linux programmer's manual:

brk() and sbrk() change the location of the program break, which defines the end of the process's data segment.

What does "data segment" mean here? Is it just the data segment, or data, BSS, and heap combined?

According to the Wikipedia article on the data segment:

Sometimes the data, BSS, and heap areas are collectively referred to as the "data segment".

I see no reason for changing the size of just the data segment. If it is data, BSS and heap collectively then it makes sense as heap will get more space.

Which brings me to my second question. In all the articles I have read so far, the authors say that the heap grows upward and the stack grows downward. What they do not explain is what happens when the heap occupies all the space between heap and stack.

Recommended answer

In the diagram you posted, the "break"—the address manipulated by brk and sbrk—is the dotted line at the top of the heap.

The documentation you've read describes this as the end of the "data segment" because in traditional (pre-shared-libraries, pre-mmap) Unix the data segment was contiguous with the heap. Before program start, the kernel would load the "text" and "data" blocks into RAM starting at address zero (actually a little above address zero, so that the NULL pointer genuinely didn't point to anything) and set the break address to the end of the data segment. The first call to malloc would then use sbrk to move the break up and create the heap between the top of the data segment and the new, higher break address, as shown in the diagram, and subsequent calls to malloc would use sbrk to make the heap bigger as necessary.
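To see the break move, here is a minimal sketch, assuming Linux with glibc (where sbrk is still available). The helper name grow_break is made up for illustration. It reads the break with sbrk(0), asks for more space, and reports how far the break moved.

```c
#define _DEFAULT_SOURCE   /* asks glibc's <unistd.h> to declare sbrk() */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* grow_break is a hypothetical helper, not a libc function.
 * It raises the program break by `bytes` and returns how far the break
 * actually moved, or -1 if sbrk() failed. */
long grow_break(long bytes)
{
    assert(bytes >= 0);                  /* this sketch only grows the heap */

    void *before = sbrk(0);              /* sbrk(0) reports the current break */
    if (before == (void *)-1)
        return -1;

    /* Ask the kernel to move the break up by `bytes`. */
    if (sbrk((intptr_t)bytes) == (void *)-1)
        return -1;

    void *after = sbrk(0);
    printf("break before: %p\n", before);
    printf("break after:  %p\n", after);
    return (long)((char *)after - (char *)before);
}
```

Calling grow_break(4096) from main should print two addresses 4096 bytes apart. Real programs should leave the break to malloc; moving it by hand while malloc is also in use can confuse the allocator.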

Meanwhile, the stack starts at the top of memory and grows down. The stack doesn't need explicit system calls to make it bigger; either it starts off with as much RAM allocated to it as it can ever have (this was the traditional approach) or there is a region of reserved addresses below the stack, to which the kernel automatically allocates RAM when it notices an attempt to write there (this is the modern approach). Either way, there may or may not be a "guard" region at the bottom of the range of addresses that can be used for the stack. If this region exists (all modern systems have one) it is permanently unmapped; if either the stack or the heap tries to grow into it, you get a segmentation fault. Traditionally, though, the kernel made no attempt to enforce a boundary; the stack could grow into the heap, or the heap could grow into the stack, and either way they would scribble over each other's data and the program would crash. If you were very lucky it would crash immediately.

I'm not sure where the number 512GB in this diagram comes from. It implies a 64-bit virtual address space, which is inconsistent with the very simple memory map you have there. A real 64-bit address space looks more like this:

[Figure: sketch of a 64-bit virtual address space. Legend: t: text, d: data, b: BSS]

This is not remotely to scale, and it shouldn't be interpreted as exactly how any given OS does stuff (after I drew it I discovered that Linux actually puts the executable much closer to address zero than I thought it did, and the shared libraries at surprisingly high addresses). The black regions of this diagram are unmapped (any access causes an immediate segfault), and they are gigantic relative to the gray areas. The light-gray regions are the program and its shared libraries (there can be dozens of shared libraries); each has an independent text and data segment (and "bss" segment, which also contains global data but is initialized to all-bits-zero rather than taking up space in the executable or library on disk). The heap is no longer necessarily contiguous with the executable's data segment. I drew it that way, but it looks like Linux, at least, doesn't do that. The stack is no longer pegged to the top of the virtual address space, and the distance between the heap and the stack is so enormous that you don't have to worry about crossing it.
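One way to check this layout on your own machine is to record an address from each region and compare them. The sketch below assumes Linux with glibc. The names struct layout, snapshot_layout, and print_layout are invented for this example, and the exact addresses vary from run to run when address space randomization is enabled.

```c
#define _DEFAULT_SOURCE   /* asks glibc's <unistd.h> to declare sbrk() */
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int initialized_global = 42;   /* stored in the data segment */
int zeroed_global;             /* stored in BSS */

/* Hypothetical record of one sample address per region. */
struct layout {
    uintptr_t data, bss, heap, break_addr, stack;
};

struct layout snapshot_layout(void)
{
    int local = 0;                     /* lives on the stack */
    void *block = malloc(64);          /* small request, normally served from the heap */
    assert(block != NULL);

    struct layout m = {
        .data       = (uintptr_t)&initialized_global,
        .bss        = (uintptr_t)&zeroed_global,
        .heap       = (uintptr_t)block,
        .break_addr = (uintptr_t)sbrk(0),   /* current program break */
        .stack      = (uintptr_t)&local,
    };
    free(block);
    return m;
}

void print_layout(const struct layout *m)
{
    printf("data : 0x%" PRIxPTR "\n", m->data);
    printf("bss  : 0x%" PRIxPTR "\n", m->bss);
    printf("heap : 0x%" PRIxPTR "\n", m->heap);
    printf("break: 0x%" PRIxPTR "\n", m->break_addr);
    printf("stack: 0x%" PRIxPTR "\n", m->stack);
}
```

On a typical 64-bit Linux system the stack address comes out far above the break, which matches the point that the gap between heap and stack is too large to worry about.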

The break is still the upper limit of the heap. However, what I didn't show is that there could be dozens of independent allocations of memory off there in the black somewhere, made with mmap instead of brk. (The OS will try to keep these far away from the brk area so they don't collide.)
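The following sketch, again assuming Linux with glibc, makes an anonymous mapping with mmap and checks that the program break did not move as a result. The function name map_away_from_break and its return convention are made up for this illustration.

```c
#define _DEFAULT_SOURCE   /* exposes sbrk() and MAP_ANONYMOUS in glibc headers */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: maps `len` bytes of anonymous memory, reports where
 * the mapping and the break are, then unmaps the memory again.
 * Returns 0 if the break stayed put, 1 if it moved, -1 if mmap() failed. */
int map_away_from_break(size_t len, uintptr_t *map_addr, uintptr_t *break_addr)
{
    assert(len > 0 && map_addr != NULL && break_addr != NULL);

    void *break_before = sbrk(0);

    /* Anonymous private mapping: memory that is not backed by any file. */
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    void *break_after = sbrk(0);
    *map_addr = (uintptr_t)p;
    *break_addr = (uintptr_t)break_after;

    munmap(p, len);
    printf("mapping: %p\nbreak:   %p\n", p, break_after);
    return break_before == break_after ? 0 : 1;
}
```

The mapping address is chosen by the kernel, so the two printed values only show where it happened to land on that run. This is also how glibc's malloc usually serves large requests, which is why a big allocation need not move the break at all.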
