Why can some ASCII characters not be expressed in the form '\uXXXX' in Java source code?

Views: 280 Published: 2018/12/6 13:07:03 java


Question

I stumbled over this (again) today:

class Test {
    char ok = '\n';
    char okAsWell = '\u000B';
    char error = '\u000A';
}

It does not compile:

Invalid character constant in line 4.

The compiler seems to insist that I write '\n' instead. I see no reason for this, yet it's very annoying.

Is there a logical explanation why characters that have a special notation (like \t, \n, \r) must be expressed in that form in Java source?

Recommended answer

Unicode escapes are replaced by the characters they denote before the source is tokenized, so the compiler sees your line as:

char error = '
';

which is not a valid Java statement: a character literal cannot contain a raw line terminator.
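As a minimal sketch of the alternatives that do compile, the class below (the name LineFeedLiterals and its field names are made up for illustration) writes the line feed character three ways that avoid a Unicode escape inside the literal. The escape sequence and the octal escape are interpreted by the lexer after the Unicode translation step, so neither one puts a real line break into the source.

```java
class LineFeedLiterals {
    // Each field holds the line feed character (code point 10) without
    // placing a Unicode escape for it inside a character literal.
    static final char ESCAPE = '\n';        // escape sequence, resolved by the lexer
    static final char OCTAL = '\012';       // octal escape, also resolved by the lexer
    static final char CAST = (char) 0x000A; // numeric value cast to char

    public static void main(String[] args) {
        System.out.println(ESCAPE == OCTAL && OCTAL == CAST); // prints true
        System.out.println((int) ESCAPE);                     // prints 10
    }
}
```

Of the three, '\n' is the conventional choice; the cast is mainly useful when the value comes from a table of code points.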

This is dictated by the language specification:

A compiler for the Java programming language ("Java compiler") first recognizes Unicode escapes in its input, translating the ASCII characters \u followed by four hexadecimal digits to the UTF-16 code unit (§3.1) of the indicated hexadecimal value, and passing all other characters unchanged. Representing supplementary characters requires two consecutive Unicode escapes. This translation step results in a sequence of Unicode input characters.
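Because the translation runs before tokenizing, an escape can stand in for any character of the source, not only for characters inside literals. The sketch below illustrates this with a hypothetical class, EarlyTranslationDemo, in which an escape spells the name of a local variable. The comments deliberately avoid writing a backslash followed by u, since escapes are translated inside comments too.

```java
class EarlyTranslationDemo {
    static int answer() {
        // The escape on the next line is translated to the letter a before
        // tokenizing, so it declares an ordinary local variable named a.
        int \u0061 = 41;
        a++;
        return a;
    }

    static String greeting() {
        // Inside a string literal the escape simply becomes the letter A.
        return "\u0041BC";
    }

    public static void main(String[] args) {
        System.out.println(answer());   // prints 42
        System.out.println(greeting()); // prints ABC
    }
}
```

The same mechanism explains the original error: the escape for a line feed becomes a real line break, wherever in the source it appears.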

This can lead to surprising results. For example, the following is a valid Java program (it contains hidden Unicode characters), courtesy of Peter Lawrey:

public static void main(String[] args) {
    for (char c⁯‮h = 0; c⁯‮h < Character.MAX_VALUE; c⁯‮h++) {
        if (Character.isJavaIdentifierPart(c⁯‮h) && !Character.isJavaIdentifierStart(c⁯‮h)) {
            System.out.printf("%04x <%s>%n", (int) c⁯‮h, "" + c⁯‮h);
        }
    }
}
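Hidden characters like the ones in that listing are hard to spot in an editor. One way to make them visible, shown here as a rough sketch rather than an established tool, is to rewrite every character outside printable ASCII in escape notation. The class HiddenCharRevealer and its reveal method are hypothetical names, and the sample string uses U+202E (right-to-left override) purely as an illustration.

```java
class HiddenCharRevealer {
    // Returns a copy of the text in which every char outside printable
    // ASCII (0x20 to 0x7E) is written as a backslash, the letter u, and
    // four uppercase hexadecimal digits.
    static String reveal(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c >= 0x20 && c < 0x7F) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04X", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The literal below contains an invisible right-to-left override
        // between the two letters.
        String identifier = "c\u202Eh";
        System.out.println(reveal(identifier));
    }
}
```

Running reveal over each line of a source file would show where an identifier that looks like plain letters actually contains extra code units.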
