Do C and C++ guarantee the ASCII values of [a-f] and [A-F] characters?

Published: 2020/9/7 19:33:41. Tags: c++, c, ascii

Question

I'm looking at the following code to test for a hexadecimal digit and convert it to an integer. The code is kind of clever in that it takes advantage of the fact that the difference between uppercase and lowercase letters is 32, which is bit 5. So the code performs one extra OR, but saves one JMP and two CMPs.

static const int BIT_FIVE = (1 << 5);
static const char str[] = "0123456789ABCDEFabcdef";

for (unsigned int i = 0; i < COUNTOF(str); i++)
{
    int digit, ch = str[i];

    if (ch >= '0' && ch <= '9')
        digit = ch - '0';
    else if ((ch |= BIT_FIVE) >= 'a' && ch <= 'f')
        digit = ch - 'a' + 10;
    ...
}

Do C and C++ guarantee ASCII, or the values of the [a-f] and [A-F] characters? Here, "guarantee" means the uppercase and lowercase letters will always differ by a constant value that can be represented by a single bit (for the trick above). If not, what does the standard say about them?

(Sorry for the C and C++ tags. I'm interested in both languages' positions on the subject.)
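One way to make the trick's assumptions explicit is to check them at compile time. The following is only a sketch, assuming a C11 compiler; the helper name hex_digit_value is hypothetical and not part of the code above. On an ASCII-compatible execution character set the assertions pass; on any other set the build fails rather than silently producing wrong digits.

```c
#include <assert.h>  /* static_assert (C11) */

/* Assumption 1: a-f occupy six consecutive values. */
static_assert('b' - 'a' == 1 && 'c' - 'a' == 2 && 'd' - 'a' == 3 &&
              'e' - 'a' == 4 && 'f' - 'a' == 5,
              "a-f must be contiguous");

/* Assumption 2: setting bit 5 (0x20) folds each of A-F onto a-f. */
static_assert(('A' | 0x20) == 'a' && ('B' | 0x20) == 'b' &&
              ('C' | 0x20) == 'c' && ('D' | 0x20) == 'd' &&
              ('E' | 0x20) == 'e' && ('F' | 0x20) == 'f',
              "bit 5 must map A-F onto a-f");

/* Hypothetical helper: returns 0..15 for a hexadecimal digit, -1 otherwise. */
int hex_digit_value(int ch)
{
    if (ch >= '0' && ch <= '9')
        return ch - '0';
    ch |= 0x20;                     /* fold uppercase onto lowercase */
    if (ch >= 'a' && ch <= 'f')
        return ch - 'a' + 10;
    return -1;
}
```

The digit branch needs no such check, since the decimal digits are required to be consecutive.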

Recommended answer

Justification

There are a lot of character encodings out in the world. If you care about portability, you can either make your program portable to different character sets, or you can choose one character set to use everywhere (e.g. Unicode). I'll go ahead and loosely categorize most existing character encodings for you:

1. Single-byte character encodings compatible with ISO/IEC 646. Digits 0-9 and letters A-Z and a-z always occupy the same positions.

2. Multibyte character encodings (Big5, Shift JIS, ISO 2022-based). In these encodings, your program is probably already broken, and you'll need to spend time fixing it if you care. However, parsing numbers will still work as expected.

3. Unicode encodings. Digits 0-9 and letters A-Z and a-z always occupy the same positions. You can work with either code points or code units and get the same result, as long as you are working with code points below 128 (which you are). (Are you working with UTF-7? No, you should only use that for email.)

4. EBCDIC. Digits and letters are assigned different values from their ASCII values; however, 0-9, A-F, and a-f are still contiguous. Even then, the chance that your code will run on an EBCDIC system is essentially zero.
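If EBCDIC or some other non-ASCII set does matter to you, the conversion can be written without any assumption about letter values. Note that in EBCDIC the two cases differ by 0x40 rather than 0x20, so the bit-5 trick would not carry over unchanged. The sketch below relies only on the decimal digits being consecutive, which the C standard requires, and looks the letters up in a string; portable_hex_value is a hypothetical name chosen for illustration.

```c
#include <assert.h>  /* static_assert (C11) */
#include <ctype.h>   /* tolower */
#include <limits.h>  /* UCHAR_MAX */
#include <string.h>  /* strchr */

/* The C standard requires '0'..'9' to be consecutive, so this always holds. */
static_assert('9' - '0' == 9, "decimal digits are contiguous");

/* Hypothetical helper: returns 0..15 for a hexadecimal digit, -1 otherwise.
   Makes no assumption about how the letters are encoded. */
int portable_hex_value(int ch)
{
    static const char letters[] = "abcdef";
    const char *p;

    if (ch >= '0' && ch <= '9')
        return ch - '0';
    if (ch <= 0 || ch > UCHAR_MAX)  /* rejects '\0', EOF, out-of-range values */
        return -1;
    p = strchr(letters, tolower(ch));
    return p ? 10 + (int)(p - letters) : -1;
}
```

This trades the single OR for a library call and a short search, so it is the slower option; whether that matters depends on how hot the parsing loop is.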

So the question here is: Do you think that a hypothetical fifth option will be invented in the future, somehow less compatible / more difficult to use than Unicode?

Do you care about EBCDIC?

We could dream up bizarre systems all day... suppose CHAR_BIT is 11, or sizeof(long) is 100, or suppose we use ones' complement arithmetic, or malloc() always returns NULL, or suppose the pixels on your monitor are arranged in a hexagonal grid. Suppose your floating-point numbers aren't IEEE 754, or all of your data pointers are different sizes. At the end of the day, this does not get us closer to our goal of writing working software on actual modern systems (with the occasional exception).
