Is it a good idea to use Unicode symbols as Java identifiers?


Question

I have a snippet of code that looks like this:

double Δt = lastPollTime - pollTime;
double α = 1 - Math.exp(-Δt / τ);
average += α * (x - average);

Just how bad an idea is it to use Unicode characters in Java identifiers? Or is this perfectly acceptable?

Solution

It's a bad idea, for various reasons.

  • Many people's keyboards do not support these characters. If I were to maintain that code on a QWERTY keyboard (or any other layout without Greek letters), I'd have to copy and paste those characters all the time.

  • Some people's editors or terminals might not display these characters properly. For example, some editors (unfortunately) still default to some ISO-8859 (Latin) variant. The main reason why ASCII is still so prevalent is that it nearly always works.

  • Even if the characters can be rendered properly, they may cause confusion. Straight from Sun's Java Language Specification (emphasis mine):

    Identifiers that have the same external appearance may yet be different. For example, the identifiers consisting of the single letters LATIN CAPITAL LETTER A (A, \u0041), LATIN SMALL LETTER A (a, \u0061), GREEK CAPITAL LETTER ALPHA (A, \u0391), CYRILLIC SMALL LETTER A (a, \u0430) and MATHEMATICAL BOLD ITALIC SMALL A (a, \ud835\udc82) are all different.

    ...

    Unicode composite characters are different from the decomposed characters. For example, a LATIN CAPITAL LETTER A ACUTE (Á, \u00c1) could be considered to be the same as a LATIN CAPITAL LETTER A (A, \u0041) immediately followed by a NON-SPACING ACUTE (´, \u0301) when sorting, but these are different in identifiers.

    This is in no way an imaginary problem: α (U+03b1 GREEK SMALL LETTER ALPHA) and ⍺ (U+237a APL FUNCTIONAL SYMBOL ALPHA) are different characters!

  • There is no obvious way to tell which characters are valid. The characters from your code work, but when I use the APL FUNCTIONAL SYMBOL ALPHA, my Java compiler complains about "illegal character: \9082" (9082 is the decimal value of U+237a), even though the functional symbol would be more appropriate in this code. There seems to be no solid rule about which characters are acceptable, except asking Character.isJavaIdentifierPart().

  • Even though you may get it to compile, it seems doubtful that all Java virtual machine implementations have been rigorously tested with Unicode identifiers. If these characters are only used for variables in method scope, they should get compiled away, but if they are class members, they will end up in the .class file as well, possibly breaking your program on buggy JVM implementations.
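
The points about character validity and look-alike identifiers can be checked directly. The following is only a minimal sketch, not a definitive tool: the class name IdentifierCheck and the helpers isValidIdentifier and sameAfterNfc are hypothetical names made up for illustration, and the only library calls it relies on are the standard Character and java.text.Normalizer methods. It uses Unicode escapes and numeric code points so that the source file itself stays plain ASCII.

```java
import java.text.Normalizer;

class IdentifierCheck {

    /**
     * Hypothetical helper: true if every code point of s is allowed at its
     * position in a Java identifier. It does NOT reject keywords or the
     * literals true, false and null, so it is only a partial check.
     */
    static boolean isValidIdentifier(String s) {
        if (s.isEmpty()) {
            return false;
        }
        int first = s.codePointAt(0);
        if (!Character.isJavaIdentifierStart(first)) {
            return false;
        }
        // Walk by code point, not by char, so supplementary characters work.
        for (int i = Character.charCount(first); i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!Character.isJavaIdentifierPart(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    /** Hypothetical helper: true if a and b are equal after NFC normalization. */
    static boolean sameAfterNfc(String a, String b) {
        return Normalizer.normalize(a, Normalizer.Form.NFC)
                .equals(Normalizer.normalize(b, Normalizer.Form.NFC));
    }

    public static void main(String[] args) {
        String greekAlpha = "\u03b1"; // GREEK SMALL LETTER ALPHA
        String aplAlpha = "\u237a";   // APL FUNCTIONAL SYMBOL ALPHA

        System.out.println("U+03B1 valid identifier: " + isValidIdentifier(greekAlpha));
        System.out.println("U+237A valid identifier: " + isValidIdentifier(aplAlpha));

        String composed = "\u00c1";    // LATIN CAPITAL LETTER A WITH ACUTE
        String decomposed = "A\u0301"; // A followed by a combining acute accent

        // Both forms pass the identifier check, yet they are different strings,
        // so the compiler treats them as different identifiers.
        System.out.println("composed valid: " + isValidIdentifier(composed));
        System.out.println("decomposed valid: " + isValidIdentifier(decomposed));
        System.out.println("equal as written: " + composed.equals(decomposed));
        System.out.println("equal after NFC: " + sameAfterNfc(composed, decomposed));
    }
}
```

On a standard JDK this should report that U+03B1 is accepted while U+237A is rejected, matching the compiler error described above, and that the composed and decomposed forms of Á only compare equal once both are normalized. A check like this could run in a build step or code review hook, but it does nothing about the keyboard and rendering problems.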
