How to convert surrogate pair to Unicode scalar in Swift

Question

The following example is taken from the Strings and Characters documentation:

The values 55357 (U+D83D in hex) and 56374 (U+DC36 in hex) are the surrogate pair that forms the Unicode scalar U+1F436, which is the DOG FACE character. Is there any way to go the other direction? That is, can I convert a surrogate pair into a scalar?

I tried

let myChar: Character = "\u{D83D}\u{DC36}"

but I got an "Invalid Unicode scalar" error.

This Objective-C answer and this project seem to be custom solutions, but is there anything built into Swift (especially Swift 2.0+) that does this?

Solution

There are formulas to calculate the original code point based on a surrogate pair and vice versa. From https://mathiasbynens.be/notes/javascript-encoding#surrogate-formulae:

Section 3.7 of The Unicode Standard 3.0 defines the algorithms for converting to and from surrogate pairs.

A code point C greater than 0xFFFF corresponds to a surrogate pair <H, L> as per the following formula:

H = Math.floor((C - 0x10000) / 0x400) + 0xD800
L = (C - 0x10000) % 0x400 + 0xDC00
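
The quoted formula is written with JavaScript's Math.floor, but it carries over to Swift fairly directly. The following is only a minimal sketch, assuming Swift 5 or later; the name surrogatePair(for:) is made up for this example and is not a standard library API. Unsigned integer division stands in for Math.floor, which should be safe here because C - 0x10000 is never negative once the range check has passed.

```swift
/// Splits a code point above 0xFFFF into a UTF-16 surrogate pair <H, L>.
/// Hypothetical helper for illustration; not part of the Swift standard library.
func surrogatePair(for codePoint: UInt32) -> (high: UInt16, low: UInt16)? {
    // Only supplementary-plane code points (U+10000...U+10FFFF) need a pair.
    guard (0x10000...0x10FFFF).contains(codePoint) else { return nil }
    let offset = codePoint - 0x10000
    // H = floor((C - 0x10000) / 0x400) + 0xD800
    let high = UInt16(offset / 0x400) + 0xD800
    // L = (C - 0x10000) % 0x400 + 0xDC00
    let low = UInt16(offset % 0x400) + 0xDC00
    return (high, low)
}

if let pair = surrogatePair(for: 0x1F436) {
    // prints "d83d dc36"
    print(String(pair.high, radix: 16), String(pair.low, radix: 16))
}
```

Returning nil for code points at or below 0xFFFF is a design choice of this sketch: those values are encoded as a single UTF-16 code unit, so no pair exists for them.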

The reverse mapping, i.e. from a surrogate pair <H, L> to a Unicode code point C, is given by:

C = (H - 0xD800) * 0x400 + L - 0xDC00 + 0x10000
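
As a sketch of the reverse mapping in Swift (assuming Swift 5 or later; scalar(high:low:) is a hypothetical helper name, not a built-in), the formula can be wrapped in a function that checks both code units before combining them. For comparison, the last lines decode the same pair with String(decoding:as:), which recent versions of the standard library provide. Note that this initializer repairs invalid input by substituting U+FFFD instead of failing, so it may not be a drop-in replacement when unpaired surrogates need to be detected.

```swift
/// Combines a UTF-16 surrogate pair <H, L> into a Unicode scalar.
/// Hypothetical helper for illustration; not part of the Swift standard library.
func scalar(high: UInt16, low: UInt16) -> Unicode.Scalar? {
    // H must be a lead surrogate (0xD800...0xDBFF), L a trail surrogate (0xDC00...0xDFFF).
    guard UTF16.isLeadSurrogate(high), UTF16.isTrailSurrogate(low) else { return nil }
    // C = (H - 0xD800) * 0x400 + L - 0xDC00 + 0x10000
    let codePoint = (UInt32(high) - 0xD800) * 0x400 + (UInt32(low) - 0xDC00) + 0x10000
    return Unicode.Scalar(codePoint)
}

if let dog = scalar(high: 0xD83D, low: 0xDC36) {
    // prints "1f436 🐶"
    print(String(dog.value, radix: 16), Character(dog))
}

// Built-in alternative: decode the UTF-16 code units directly into a String.
let decoded = String(decoding: [0xD83D, 0xDC36] as [UInt16], as: UTF16.self)
print(decoded)  // prints "🐶"
```

This also suggests why the literal "\u{D83D}\u{DC36}" is rejected: the \u{...} escape expects a Unicode scalar value, and surrogate code points are excluded from the scalar range, so the pair has to be combined (or decoded as UTF-16) first.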
