How can I get the Unicode code point(s) of a Character?

Problem description

How can I extract the Unicode code point(s) of a given Character without first converting it to a String? I know that I can use the following:

let ch: Character = "A"
let s = String(ch).unicodeScalars
s[s.startIndex].value // returns 65

But it seems like there should be a more direct way to do this using only Swift's standard library. The Language Guide sections "Working with Characters" and "Unicode" only discuss iterating through the characters in a String, not working directly with Character values.

Recommended answer

From what I can gather from the documentation, you are expected to get Character values from a String because the String supplies the context: is this Character to be viewed as UTF-8 code units, UTF-16 code units, or 21-bit code points (Unicode scalars)?
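To illustrate why the context matters, here is a minimal sketch showing one character read through each of the three String views. The emoji U+1F600 is just an illustrative choice, and the snippet assumes a reasonably recent Swift toolchain rather than the 1.0 release discussed here.

```swift
// One user-visible character: the grinning-face emoji, U+1F600.
let face = "\u{1F600}"

// The same text seen through three different encodings.
let utf8Units = Array(face.utf8)                         // [UInt8]
let utf16Units = Array(face.utf16)                       // [UInt16]
let scalarValues = face.unicodeScalars.map { $0.value }  // [UInt32]

print(utf8Units)     // [240, 159, 152, 128]
print(utf16Units)    // [55357, 56832]
print(scalarValues)  // [128512]
```

The three views disagree on both the number of units and their values, which is why a bare "code point of a Character" needs an encoding to be well defined.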

If you look at how Character is defined in the Swift 1.0 standard library, it is actually an enum. This is probably because of the various representations available through String.utf8, String.utf16, and String.unicodeScalars.

It seems you are not expected to work with Character values directly but with Strings, and you as the programmer decide which representation to read from the String itself, so that the encoding is preserved.

That said, if you need to get the code points in a concise manner, I would recommend an extension like this:

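extension Character
{
    // Returns the value of the first Unicode scalar in this character.
    // A character built from several scalars has more than one code point;
    // only the first is returned here.
    func unicodeScalarCodePoint() -> UInt32
    {
        let characterString = String(self)
        let scalars = characterString.unicodeScalars

        return scalars[scalars.startIndex].value
    }
}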

Then you can use it like so:

let char: Character = "A"
char.unicodeScalarCodePoint() // returns 65
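Since the question asks about code point(s) in the plural, a variant that returns every scalar may be useful when a Character is composed of more than one. The following is only a sketch: the method name unicodeScalarCodePoints is hypothetical, and it still goes through String as the extension above does.

```swift
extension Character
{
    // Hypothetical helper for this sketch: returns the values of all
    // Unicode scalars that make up this character, in order.
    func unicodeScalarCodePoints() -> [UInt32]
    {
        return String(self).unicodeScalars.map { $0.value }
    }
}

let plain: Character = "A"
// "e" followed by U+0301 (combining acute accent) forms a single Character.
let accented: Character = "e\u{301}"

print(plain.unicodeScalarCodePoints())     // [65]
print(accented.unicodeScalarCodePoints())  // [101, 769]
```

Returning an array avoids silently dropping the combining scalar, at the cost of callers having to handle more than one value.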

In summary, string and character encoding gets tricky once you factor in all the possibilities. This scheme was chosen so that each possibility can be represented.

Also remember that this is a 1.0 release; I'm sure Swift's syntactic sugar will be expanded soon.
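For readers on a newer toolchain: to the best of my knowledge, Swift 4 and later expose a unicodeScalars view on Character itself, and Swift 5 adds an asciiValue property. Neither exists in the 1.0 release this answer targets, so treat the sketch below as an assumption about your compiler version.

```swift
let ch: Character = "A"

// Read the scalars straight from the Character, with no String conversion.
let codePoints = ch.unicodeScalars.map { $0.value }
print(codePoints)  // [65]

// asciiValue is an optional UInt8; it is nil for non-ASCII characters.
let asciiOfA = ch.asciiValue
let euro: Character = "\u{20AC}"
let asciiOfEuro = euro.asciiValue

print(asciiOfA == 65)      // true
print(asciiOfEuro == nil)  // true
```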
