Why is "\?" an escape sequence in C/C++?


Question

In C and C++, four special non-alphabetic characters have escape sequences: the single quote \', the double quote \", the backslash \\ and the question mark \?. Three of them have an obvious special meaning: ' delimits character constants, " delimits string literals, and \ introduces escape sequences. But why is ? one of them?

I was reading a table of escape sequences in a textbook today and realized that I have never escaped ? before and have never had a problem with it. Just to be sure, I tested it with GCC:

#include <stdio.h>
int main(void)
{
    printf("question mark ? and escaped \?\n");
    return 0;
}

and the C++ version:

#include <iostream>
int main(void)
{
    std::cout << "question mark ? and escaped \?" << std::endl;
    return 0;
}

Both programs output: question mark ? and escaped ?

So I have two questions:

  1. Why is \? one of the escape sequences?
  2. Why does an unescaped ? work fine, without even a warning?


I found the answer myself before posting this question. Since I couldn't find a duplicate on SO, I decided to post it in Q&A style.

Interestingly, the escaped \? also behaves the same as a plain ? in some other languages. I tested Lua and Ruby and found this to be true, although I couldn't find it documented.

Answer

Why is \? one of the escape sequences?

Because ? is special: the answer lies in trigraphs. In the first phase of translation, a C or C++ implementation replaces each of the following three-character sequences with the corresponding single character (C11 §5.2.1.1 and C++11 §2.3):

Trigraph:       ??(  ??)  ??<  ??>  ??=  ??/  ??'  ??!  ??-
Replacement:      [    ]    {    }    #    \    ^    |    ~

Trigraphs are nearly useless now and are mainly used for obfuscation; some examples can be seen in the IOCCC.

GCC doesn't enable trigraphs by default and warns if the code contains one, unless the -trigraphs option is given. With -trigraphs enabled, the escape \? becomes useful in the following example:

printf("\?\?!\n");  

The output would be | if the question marks were not escaped.

For more information on trigraphs, see Cryptic line "??!??!" in legacy code.


Why does an unescaped ? work fine, without even a warning?

Because the standard allows ? (and the double quote ") to be represented by themselves:

C11 §6.4.4.4 Character constants, paragraph 4

The double-quote " and question-mark ? are representable either by themselves or by the escape sequences \" and \?, respectively, but the single-quote ' and the backslash \ shall be represented, respectively, by the escape sequences \' and \\.

Similarly in C++:

C++11 §2.13.2 Character literals, paragraph 3

Certain nongraphic characters, the single quote ', the double quote ", the question mark ?, and the backslash \, can be represented according to Table 6. The double quote " and the question mark ?, can be represented as themselves or by the escape sequences \" and \? respectively, but the single quote ' and the backslash \ shall be represented by the escape sequences \' and \\ respectively. If the character following a backslash is not one of those specified, the behavior is undefined. An escape sequence specifies a single character.
