JavaScript Unicode normalization
Question
I'm under the impression that the JavaScript interpreter assumes that the source code it is interpreting has already been normalized. What, exactly, does the normalizing? It can't be the text editor, otherwise the plaintext representation of the source would change. Is there some "preprocessor" that does the normalization?
Recommended answer
No. As of ECMAScript 5, Unicode normalization is neither applied automatically to JavaScript nor even available to it. All characters remain unchanged as their original code points, potentially in a non-normalized form.
For example, try:
<script type="text/javascript">
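// Escapes are used so the two forms survive copy/paste and editor re-encoding.
var a = 'caf\u00E9';      // precomposed: U+00E9 (é)
var b = 'cafe\u0301';     // decomposed: 'e' followed by U+0301 (combining acute accent)
alert(a + ' ' + a.length); // café 4
alert(b + ' ' + b.length); // café 5
alert(a == b);             // false: the strings are compared code unit by code unit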
</script>
Update: ECMAScript 6 introduces Unicode normalization for JavaScript strings, via String.prototype.normalize.
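As a minimal sketch of how the ES6 method could be applied to the example above: `String.prototype.normalize` accepts a form name (`'NFC'`, `'NFD'`, `'NFKC'` or `'NFKD'`, defaulting to `'NFC'`) and returns a new string. The helper `equalsNormalized` below is a hypothetical name introduced only for illustration; it is not part of the language or of any library.

```javascript
// Same visible text, two different code point sequences.
const composed = 'caf\u00E9';     // U+00E9, precomposed é
const decomposed = 'cafe\u0301';  // 'e' followed by U+0301, combining acute accent

// Hypothetical helper: compare two strings after normalizing both to the same form.
function equalsNormalized(x, y, form = 'NFC') {
  return x.normalize(form) === y.normalize(form);
}

console.log(composed === decomposed);                // false
console.log(equalsNormalized(composed, decomposed)); // true
console.log(decomposed.normalize('NFC').length);     // 4
console.log(composed.normalize('NFD').length);       // 5
```

Normalization still has to be requested explicitly; the engine does not apply it to source text or to string comparisons on its own, so both operands need to be normalized to the same form before comparing.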