Why is it that we can write outside of bounds in C?


Question

I recently finished reading about virtual memory and I have a question about how malloc works within the Virtual address space and Physical Memory.

For example (code copied from another SO post)

#include <stdio.h>
#include <stdlib.h>

int main(void){
    int *p;
    p=malloc(sizeof(int));
    p[500]=999999;                 // out of bounds: undefined behaviour
    printf("p[500]=%d\n",p[500]);  // works just fine.
    return 0;
}

Why is this allowed to happen? Or like why is that address at p[500] even writable?

Here is my guess.

When malloc is called, perhaps the OS decides to give the process an entire page. I will just assume that each page is 4 KB. Is that entire page marked as writable? That would explain why you can go as far as 500*sizeof(int) into the page (assuming a 32-bit system where an int is 4 bytes).

I see that when I try to edit at a larger value...

   p[500000]=999999; // EXC_BAD_ACCESS according to XCode

Seg fault.

If so, then does that mean that there are pages dedicated to your code/instructions/text segments and marked as non-writable, completely separate from the pages where your stack/variables live (where things do change), which are marked as writable? Of course, the process thinks they're next to each other in the 4 GB address space on a 32-bit system.

Recommended answer

Consider the following code for Linux:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int staticvar;
const int constvar = 0;

int main(void)
{
        int stackvar;
        char buf[200];
        int *p;

        p = malloc(sizeof(int));
        sprintf(buf, "cat /proc/%d/maps", getpid());
        system(buf);

        printf("&staticvar=%p\n", (void *)&staticvar);
        printf("&constvar=%p\n", (const void *)&constvar);
        printf("&stackvar=%p\n", (void *)&stackvar);
        printf("p=%p\n", (void *)p);
        printf("undefined behaviour: &p[500]=%p\n", (void *)&p[500]);
        printf("undefined behaviour: &p[50000000]=%p\n", (void *)&p[50000000]);

        p[500] = 999999; //undefined behaviour
        printf("undefined behaviour: p[500]=%d\n", p[500]);
        return 0;
}

It prints the memory map of the process and the addresses of several different kinds of memory.

[osboxes@osboxes ~]$ gcc tmp.c -g -static -Wall -Wextra -m32
[osboxes@osboxes ~]$ ./a.out
08048000-080ef000 r-xp 00000000 fd:00 919429                /home/osboxes/a.out
080ef000-080f2000 rw-p 000a6000 fd:00 919429                /home/osboxes/a.out
080f2000-080f3000 rw-p 00000000 00:00 0
0824d000-0826f000 rw-p 00000000 00:00 0                     [heap]
f779c000-f779e000 r--p 00000000 00:00 0                     [vvar]
f779e000-f779f000 r-xp 00000000 00:00 0                     [vdso]
ffe4a000-ffe6b000 rw-p 00000000 00:00 0                     [stack]
&staticvar=0x80f23a0
&constvar=0x80c2fcc
&stackvar=0xffe69b88
p=0x824e2a0
undefined behaviour: &p[500]=0x824ea70
undefined behaviour: &p[50000000]=0x1410a4a0
undefined behaviour: p[500]=999999

Or like why is that address at p[500] even writable?

The heap is mapped at 0824d000-0826f000, and &p[500] happens to be 0x824ea70, which falls inside that range, so the memory is readable and writable. However, this region may hold real data (other allocations or the allocator's own bookkeeping), and the write would silently alter it. In the sample program that memory is most likely unused, so the write does not visibly break the process.

&p[50000000] happens to be 0x1410a4a0, which is not in any page the kernel has mapped into the process. It is therefore neither readable nor writable, hence the seg fault.

If you compile with -fsanitize=address, memory accesses are checked, and AddressSanitizer reports many, but not all, illegal accesses. The slowdown is roughly 2x compared to running without AddressSanitizer.

[osboxes@osboxes ~]$ gcc tmp.c -g -Wall -Wextra -m32 -fsanitize=address
[osboxes@osboxes ~]$ ./a.out
[...]
undefined behaviour: &p[500]=0xf5c00fc0
undefined behaviour: &p[50000000]=0x1abc9f0
=================================================================
==2845==ERROR: AddressSanitizer: heap-buffer-overflow on address 0xf5c00fc0 at pc 0x8048972 bp 0xfff44568 sp 0xfff44558
WRITE of size 4 at 0xf5c00fc0 thread T0
    #0 0x8048971 in main /home/osboxes/tmp.c:24
    #1 0xf70a4e7d in __libc_start_main (/lib/libc.so.6+0x17e7d)
    #2 0x80486f0 (/home/osboxes/a.out+0x80486f0)

AddressSanitizer can not describe address in more detail (wild memory access suspected).
SUMMARY: AddressSanitizer: heap-buffer-overflow /home/osboxes/tmp.c:24 main
[...]
==2845==ABORTING

If so, then does that mean that there are pages that are dedicated to your code/instructions/text segments and marked as unwrite-able completely separate from your pages where your stack/variables are in (where things do change) and marked as writable?

Yes, see the process's memory map above: r-xp means readable and executable, rw-p means readable and writable (the trailing p marks a private mapping).
