What characters are in whitespaceAndNewlineCharacterSet()?

Question

I'm parsing some nasty files: you know, a mix of comma, space, and tab delimiters in a single line, which was then run through a text editor that word-wraps at column 65 with CRLF. Ugh.

As part of my efforts to parse this in Cocoa, I use Apple's whitespaceAndNewlineCharacterSet. But what, exactly, is in that set? The documentation says "Unicode General Category Z*, U000A ~ U000D, and U0085". I was able to find the last three (85 is interesting), but what does the ~ mean, and what is General Category Z*?

Any Unicode gurus out there?

Recommended answer

NSCharacterSet is an opaque class that does not expose its content easily. You have to see it more as a "membership" rule service than a list of characters.
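
As a minimal sketch of that "membership" view, assuming a Swift 3 or later toolchain where the same set is spelled CharacterSet.whitespacesAndNewlines (a newer name than the one in the question), you ask the set about one scalar at a time instead of reading out a list. The helper name isWhitespaceOrNewline is made up for illustration and is not part of Foundation:

```swift
import Foundation

// Hypothetical helper (not a Foundation API): asks the set whether a single
// Unicode scalar belongs to it.
func isWhitespaceOrNewline(_ scalar: Unicode.Scalar) -> Bool {
    return CharacterSet.whitespacesAndNewlines.contains(scalar)
}

print(isWhitespaceOrNewline("\n"))        // U+000A, inside the documented U+000A ~ U+000D range
print(isWhitespaceOrNewline("\u{0085}"))  // U+0085, named explicitly in the quoted documentation
print(isWhitespaceOrNewline(" "))         // U+0020 SPACE
print(isWhitespaceOrNewline("A"))         // an ordinary letter
```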

This may be a somewhat brutal approach, but you can get the list of members in an NSCharacterSet by going through all 16-bit unichar values and checking each one for membership in the set:

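 import Foundation

 // Swift 2-era spelling, as in the question. In Swift 3 and later the same set
 // is exposed as CharacterSet.whitespacesAndNewlines.
 let charSet = NSCharacterSet.whitespaceAndNewlineCharacterSet()
 // Brute force: test every 16-bit value for membership.
 for i in 0..<65536
 {
    let u: UInt16 = UInt16(i)
    if charSet.characterIsMember(u)
    { print("\(u): \(Character(UnicodeScalar(u)))") }
 }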

The printed output can look surprising, because most members of this set are non-displayable characters, but it should answer your question.
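
Because the members are mostly invisible when printed, it can help to print their code points instead. The following is a sketch of the same brute-force loop, assuming Swift 3 or later and the CharacterSet.whitespacesAndNewlines spelling; the names whitespaceSet and members are illustrative only:

```swift
import Foundation

// Assumption: Swift 3 or later, where the set is CharacterSet.whitespacesAndNewlines.
let whitespaceSet = CharacterSet.whitespacesAndNewlines

// Walk the same 16-bit range as the loop above and collect every member.
var members: [Unicode.Scalar] = []
for value in 0..<0x10000 {
    // Unicode.Scalar(UInt32) returns nil for surrogate values (0xD800...0xDFFF),
    // which are not valid scalars, so those are skipped.
    guard let scalar = Unicode.Scalar(UInt32(value)),
          whitespaceSet.contains(scalar) else { continue }
    members.append(scalar)
}

// Print each member as a hex code point rather than as the character itself.
for scalar in members {
    print("U+" + String(format: "%04X", scalar.value))
}
```

Each printed code point can then be looked up in a Unicode chart to see its name and general category.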
