Can a string literal in C be modified?

Question

I recently ran into a question. I know that the constant array a pointer is initialized to point at, as in the code below, is in the .rodata region, and that this region is read-only. I also saw in the C11 standard that writing to this memory address is undefined behaviour. I was aware that Borland's Turbo C compiler lets you write where the pointer points. Is that because the processor operated in real mode on some systems of the time, such as MS-DOS? Or is it independent of the processor's operating mode? Is there any other compiler that lets you write through the pointer without causing a memory access fault while the processor is in protected mode?

#include <stdio.h>

int main(void) {
    char *st = "aaa";
    *st = 'b'; 
    return 0;
}

When this code is compiled with Turbo C under MS-DOS, the write to memory succeeds.

Answer

Is there any other compiler that lets you write through the pointer without causing a memory access fault while the processor is in protected mode?

How can some GCC compilers modify a constant char pointer?

GCC 3 and earlier used to support gcc -fwritable-strings to let you compile old K&R C where this was apparently legal, according to https://gcc.gnu.org/onlinedocs/gcc-3.3.6/gcc/Incompatibilities.html. (It's undefined behaviour in ISO C, and thus a bug in an ISO C program.) That option defines the behaviour of the assignment, which ISO C leaves undefined.

GCC 3.3.6 manual - C Dialect options

-fwritable-strings
Store string constants in the writable data segment and don't uniquize them. This is for compatibility with old programs which assume they can write into string constants.

Writing into string constants is a very bad idea; "constants" should be constant.

GCC 4.0 removed that option (release notes); the last release in the GCC 3 series was GCC 3.4.6 in March 2006, although the option had apparently already become buggy in that version.

gcc -fwritable-strings would treat string literals like non-const anonymous character arrays (see @gnasher's answer), so they go in the .data section instead of .rodata, and thus get linked into a segment of the executable that's mapped to read+write pages, not read-only ones. (Executable segments have basically nothing to do with x86 segmentation; a segment is just a start+range memory-mapping from the executable file to memory.)

And it would disable duplicate-string merging, so char *foo() { return "hello"; } and char *bar() { return "hello"; } would return different pointer values, instead of merging identical string literals.



Linker option: still Undefined Behaviour so probably not viable

On GNU/Linux, linking with ld -N (--omagic) will make the text (as well as data) section read+write. This may apply to .rodata even though modern GNU Binutils ld puts .rodata in its own section (normally with read but not exec permission) instead of making it part of .text. Having .text writeable could easily be a security problem: you never want a page with write+exec at the same time, otherwise some bugs like buffer overflows can turn into code-injection attacks.

To do this from gcc, use gcc -Wl,-N to pass on that option to ld when linking.

This doesn't do anything about it being Undefined Behaviour to write to string literals. For example, the compiler will still merge duplicate strings, so writing through one char *foo = "hello"; will affect all other uses of "hello" in the whole program, even across files.

If you want something writeable, use static char foo[] = "hello"; where the quoted string is just an array initializer for a non-const array. As a bonus, this is more efficient than static char *foo = "hello"; at global scope, because there's one fewer level of indirection to get to the data: it's just an array instead of a pointer stored in memory.
