Convert Unicode string to NSString


Question

I have a Unicode string like this:

{\rtf1\ansi\ansicpg1252\cocoartf1265
{\fonttbl\f0\fswiss\fcharset0 Helvetica;\f1\fnil\fcharset0 LucidaGrande;}
{\colortbl;\red255\green255\blue255;}
{\*\listtable{\list\listtemplateid1\listhybrid{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelfollow0\levelstartat1\levelspace360\levelindent0{\*\levelmarker \{check\}}{\leveltext\leveltemplateid1\'01\uc0\u10003 ;}{\levelnumbers;}\fi-360\li720\lin720 }{\listname ;}\listid1}}
{\*\listoverridetable{\listoverride\listid1\listoverridecount0\ls1}}
\paperw11900\paperh16840\margl1440\margr1440\vieww22880\viewh16200\viewkind0
\pard\li720\fi-720\pardirnatural
\ls1\ilvl0
\f0\fs24 \cf0 {\listtext    
\f1 \uc0\u10003 
\f0     }One\
{\listtext  
\f1 \uc0\u10003 
\f0     }Two\
}

Here I have the Unicode data \u10003, which is equivalent to the "✓" character. I used [NSString stringWithCharacters:"\u10003" length:NSUTF16StringEncoding], which throws a compilation error. Please let me know how to convert these Unicode characters to "✓".

Regards, Boom

Recommended answer

I'm assuming that:

  • You are reading this RTF data from a file or other external source.
  • You are parsing it yourself (not using, say, AppKit's built-in RTF parser).
  • You have a reason why you're parsing it yourself, and that reason isn't "wait, AppKit has this built in?".
  • You have come upon \u… in the input you're parsing and need to convert that to a character for further handling and/or inclusion in the output text.
  • You have ruled out \uc, which is a different thing (it specifies the number of non-Unicode bytes that follow the \u… sequence, if I understood the RTF spec correctly).

\u is followed by a decimal number, not hexadecimal digits: the RTF spec defines it as a signed 16-bit value, so values above 32767 appear as negative numbers. You need to parse those digits to a number; that number is a UTF-16 code unit, which for most characters is simply the Unicode code point of the character the sequence represents. You then need to create an NSString containing that character.

If you're using NSScanner to parse the input, then (assuming you have already scanned past the \u itself) you can simply ask the scanner to scanInt:. Pass a pointer to an int variable.

If you're not using NSScanner, do whatever makes sense for however you're parsing it. For example, if you've converted the RTF data to a C string and are reading through it yourself, you'll want to use strtol to parse the number. It'll interpret the number in whatever base you specify (in this case, 10) and set its end pointer to the first character after the digits, so you know where to resume parsing.

Your int or long variable will then contain the value for the specified character (add 65536 if it is negative). In the example from your question, that will be 10003, which is 0x2713, or U+2713 CHECK MARK.

For a value like this one, you can simply assign it to a unichar variable and create an NSString from that. That only works up to 0xFFFF, though: a code point higher than that is outside the Basic Multilingual Plane and needs two unichars, a surrogate pair. (As far as I understand the spec, RTF itself writes such a character as two consecutive \u sequences, one per surrogate.) If you ever end up holding a full code point rather than a 16-bit value, you have to handle both cases.

Fortunately, CFString has a function to help you:

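unsigned int codePoint = /*…*/;  // value parsed from the input

unichar characters[2];
NSUInteger numCharacters = 0;
// Returns true and fills both slots when the code point needs a surrogate pair.
if (CFStringGetSurrogatePairForLongCharacter(codePoint, characters)) {
    numCharacters = 2;
} else {
    // The code point fits in a single 16-bit unichar.
    characters[0] = (unichar)codePoint;
    numCharacters = 1;
}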

You can then use stringWithCharacters:length: to create an NSString from this array of 16-bit characters.
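If you are walking the RTF as a plain C string, the parsing and conversion steps might look roughly like the sketch below. It rests on one assumption worth stating loudly: that the number after \u is a signed 16-bit decimal value, as the RTF spec describes it, so \u10003 means 10003 = 0x2713. The helper names parse_rtf_u and utf16_encode are hypothetical, made up for this illustration; utf16_encode only mirrors the arithmetic behind a surrogate-pair split and is not the CFString implementation.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse an RTF "\uN" control word starting at p.
 * Assumes N is a signed 16-bit decimal number (per the RTF spec).
 * On success, stores the UTF-16 code unit in *unit, sets *next to the
 * first character after the digits, and returns 1. Returns 0 otherwise,
 * which also rejects look-alikes such as "\uc0" or "\ul". */
static int parse_rtf_u(const char *p, uint16_t *unit, const char **next) {
    if (strncmp(p, "\\u", 2) != 0) return 0;
    /* Require a digit or a minus sign directly after "\u". */
    if (p[2] != '-' && !isdigit((unsigned char)p[2])) return 0;

    char *end;
    long value = strtol(p + 2, &end, 10);  /* base 10, not 16 */
    if (end == p + 2) return 0;            /* nothing was parsed */
    if (value < -32768 || value > 65535) return 0;

    if (value < 0) value += 65536;         /* negative means above 32767 */
    *unit = (uint16_t)value;
    *next = end;
    return 1;
}

/* Hypothetical helper: split a Unicode code point into UTF-16 units.
 * Returns the number of units written (1 or 2), or 0 if cp is not a
 * valid scalar value. */
static size_t utf16_encode(uint32_t cp, uint16_t out[2]) {
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return 0;
    if (cp <= 0xFFFF) {                    /* inside the BMP: one unit */
        out[0] = (uint16_t)cp;
        return 1;
    }
    cp -= 0x10000;                         /* 20 bits left to split */
    out[0] = (uint16_t)(0xD800 + (cp >> 10));    /* high surrogate */
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF));  /* low surrogate */
    return 2;
}
```

Calling parse_rtf_u on the text "\u10003 " from the question yields the unit 0x2713, which can be stored in a unichar array and passed to stringWithCharacters:length: with a length of 1. Note that the length argument is a count of 16-bit units, not an encoding constant such as NSUTF16StringEncoding.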
