什么是“代理对"?在爪哇? [英] What is a "surrogate pair" in Java?

查看:48
本文介绍了什么是“代理对"?在爪哇?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在阅读 StringBuffer 的文档,特别是 reverse() 方法.该文档提到了一些关于代理对的内容.在这种情况下什么是代理对?什么是 lowhigh 代理?

I was reading the documentation for StringBuffer, in particular the reverse() method. That documentation mentions something about surrogate pairs. What is a surrogate pair in this context? And what are low and high surrogates?

推荐答案

术语代理对"是指在 UTF-16 编码方案中对具有高代码点的 Unicode 字符进行编码的方法.

The term "surrogate pair" refers to a means of encoding Unicode characters with high code-points in the UTF-16 encoding scheme.

在 Unicode 字符编码中,字符被映射到 0x0 到 0x10FFFF 之间的值.

In the Unicode character encoding, characters are mapped to values between 0x0 and 0x10FFFF.

在内部,Java 使用 UTF-16 编码方案来存储 Unicode 文本字符串.在 UTF-16 中,使用 16 位(两字节)代码单元.由于 16 位只能包含从 0x0 到 0xFFFF 的字符范围,因此使用一些额外的复杂性来存储高于此范围(0x10000 到 0x10FFFF)的值.这是使用称为代理的成对代码单元完成的.

Internally, Java uses the UTF-16 encoding scheme to store strings of Unicode text. In UTF-16, 16-bit (two-byte) code units are used. Since 16 bits can only contain the range of characters from 0x0 to 0xFFFF, some additional complexity is used to store values above this range (0x10000 to 0x10FFFF). This is done using pairs of code units known as surrogates.

代理代码单元分为两个范围,称为高代理"和低代理",具体取决于它们是否允许出现在两代码单元序列的开头或结尾.

The surrogate code units are in two ranges known as "high surrogates" and "low surrogates", depending on whether they are allowed at the start or end of the two-code-unit sequence.

这篇关于什么是“代理对"?在爪哇?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆