How can I get the Unicode code point(s) of a Character?
Problem description
How can I extract the Unicode code point(s) of a given Character without first converting it to a String? I know that I can use the following:
let ch: Character = "A"
let s = String(ch).unicodeScalars
s[s.startIndex].value // returns 65
but it seems like there should be a more direct way to accomplish this using just Swift's standard library. The Language Guide sections "Working with Characters" and "Unicode" only discuss iterating through the characters in a String, not working directly with Characters.
Recommended answer
From what I can gather from the documentation, they want you to get Character values from a String because the String gives context: is this Character encoded as UTF-8, UTF-16, or 21-bit code points (scalars)?
If you look at how Character is defined in the Swift 1.0 standard library, it is actually an enum. This is probably because of the various representations available from String.utf8, String.utf16, and String.unicodeScalars.
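To make the three representations concrete, here is a minimal sketch that reads the same one-character String through each view. It assumes a reasonably recent Swift toolchain; the variable names are only illustrative, and the emoji was picked because it needs more than one code unit in UTF-8 and UTF-16.

```swift
// One user-visible character: U+1F600 (grinning face).
let text = "\u{1F600}"

// The same text seen through the three views mentioned above.
let utf8Units = Array(text.utf8)                         // [UInt8], UTF-8 code units
let utf16Units = Array(text.utf16)                       // [UInt16], UTF-16 code units
let scalarValues = text.unicodeScalars.map { $0.value }  // [UInt32], Unicode scalar values

print(utf8Units.count)     // number of UTF-8 code units
print(utf16Units.count)    // number of UTF-16 code units (a surrogate pair here)
print(scalarValues.count)  // number of Unicode scalars
```

The String still contains a single Character; only the chosen view determines how many units you see.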
It seems they do not expect you to work with Character values directly but with Strings, leaving you as the programmer to decide how to get these values from the String itself, which allows the encoding to be preserved.
That said, if you need to get the code points in a concise manner, I would recommend an extension like this:
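extension Character {
    // Returns the value of the first Unicode scalar only. A Character built
    // from several scalars (for example a base letter plus a combining mark)
    // has more than one code point.
    func unicodeScalarCodePoint() -> UInt32 {
        let characterString = String(self)
        let scalars = characterString.unicodeScalars
        return scalars[scalars.startIndex].value
    }
}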
Then you can use it like so:
let char: Character = "A"
char.unicodeScalarCodePoint() // returns 65
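Since the question asks for code point(s), plural, here is a hedged sketch of a variant that returns every scalar in the Character. It assumes a later toolchain in which Character exposes its own unicodeScalars view (as far as I know this arrived with Swift 4, so it does not apply to the 1.0 release discussed here). The property name codePoints is hypothetical, not part of the standard library.

```swift
extension Character {
    // Hypothetical helper: the values of all Unicode scalars in this Character.
    // Relies on Character.unicodeScalars, which is an assumption about the
    // toolchain version (not available in Swift 1.0).
    var codePoints: [UInt32] {
        return unicodeScalars.map { $0.value }
    }
}

let plain: Character = "A"
let combined: Character = "\u{65}\u{301}"  // "e" followed by a combining acute accent

print(plain.codePoints)     // one scalar
print(combined.codePoints)  // two scalars forming a single Character
```

Returning an array avoids silently dropping the trailing scalars of characters such as accented letters built from combining marks.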
In summary, string and character encoding is tricky once you factor in all the possibilities. To allow each possibility to be represented, they went with this scheme.
Also remember that this is a 1.0 release; I'm sure they will expand Swift's syntactic sugar soon.