Why do the string functions in C work on arrays with char instead of unsigned char?
Question
In the C standard library functions, the elements of the strings are chars. Is there a good reason why char was chosen instead of unsigned char?
Using unsigned char for 8-bit strings has some, albeit small, advantages:
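- It is more intuitive: we usually memorize ASCII codes as unsigned numbers, and when working with binary data we prefer the unsigned range 0x00 to 0xFF rather than dealing with negative numbers. So we have to cast.
- Working with unsigned integers might be faster or more efficient, or generate smaller code, on some processors.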
Recommended answer
C provides three different character types:
- char represents a character (which C also calls a "byte").
- unsigned char represents a byte-sized pattern of bits, or an unsigned integer.
- signed char represents a byte-sized signed integer.
It is implementation-defined whether char is a signed or an unsigned type, so I think the question amounts to either "why does char exist at all as this maybe-signed type?" or "why doesn't C require char to be unsigned?".
The first thing to know is that Ritchie added the "char" type to the B language in 1971, and C inherited it from there. Prior to that, B was word-oriented rather than byte-oriented (so says the man himself; see "The Problems of B").
With that done, the answer to both of my questions might be that early versions of C didn't have unsigned types.
Once char and the string-handling functions were established, changing them all to unsigned char would be a serious breaking change (i.e. almost all existing code would stop working), and one of the ways C has tried to cultivate its user base over the decades is by mostly avoiding catastrophic incompatible changes. So it would be surprising for C to make that change.
Given that char is going to be the character type, and that (as you observe) it makes a lot of sense for it to be unsigned, but that plenty of implementations already existed in which char was signed, I suppose that making the signedness of char implementation-defined was a workable compromise: existing code would continue working. Provided that it was using char only as a character and not for arithmetic or order comparisons, it would also be portable to implementations where char is unsigned.
Unlike some of C's age-old implementation-defined variations, implementers do still choose signed characters (Intel, for example). The C standard committee cannot help but observe that some people seem to stick with signed characters for some reason. Whatever those people's reasons are, current or historical, C has to allow it because existing C implementations rely on it being allowed. So forcing char to be unsigned is far lower on the list of achievable goals than forcing int to be 2's complement, and C hasn't even done that.
A supplementary question is "why does Intel still specify char to be signed in its ABIs?", to which I don't know the answer, but I'd guess that they've never had an opportunity to do otherwise without massive disruption. Maybe they even like signed characters.