Converting escaped UTF8 characters back to their original form


Problem description




I'm trying to read strings from an array that's coming from a plist and print those strings.

The strings in the array contain escaped UTF8 characters - for example "Nuša Florjančič" becomes "Nu\u0161a Florjan\u010di\u010d" when read from the plist. There is no way to change the content of the plist, but my program needs to display the names properly.

The strange thing is that Objective-C seems to do this automatically when I'm hardcoding the string. However, if I get the string from the plist nothing happens at all.

To give you an example, here's some code:

NSString *name1 = @"Nu\u0161a Florjan\u010di\u010d";
NSString *name2 = [list objectAtIndex:0];       
NSLog(@"name 1: %@", name1);
NSLog(@"name 2: %@", name2);

[list objectAtIndex:0] contains @"Nu\u0161a Florjan\u010di\u010d" - the only difference is that it has been set via the plist editor.

The console output is:

2011-10-22 18:00:02.595 Test[13410:11c03] name 1: Nuša Florjančič
2011-10-22 18:00:02.595 Test[13410:11c03] name 2: Nu\u0161a Florjan\u010di\u010d

I've tried all sorts of things, including transforming the string into a C-string and then creating an NSString object with a UTF-8 encoding but nothing worked at all.

I'd really appreciate any pointers from you that might help me solve this seemingly mundane problem.

Solution

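It sounds like the string in the plist contains the six literal characters "\u0161" (a backslash, a "u", and four hex digits) rather than the single Unicode character U+0161 (š). In a string literal in your source code the compiler translates \u escapes at compile time, which is why the hardcoded string prints correctly; a string read from the plist at run time gets no such treatment. So you need to decode the \u escapes in the string you've extracted from the plist yourself. NSString can do that for you using NSNonLossyASCIIStringEncoding: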

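#import <Foundation/Foundation.h>
int main (int argc, const char * argv[])
{
    @autoreleasepool {
        // The doubled backslashes put a literal backslash followed by "u0161"
        // into the string, mimicking what comes out of the plist.
        NSString *name2escaped = @"Nu\\u0161a Florjan\\u010di\\u010d";
        // Reinterpret the bytes as non-lossy ASCII, which decodes the \u escapes.
        NSString *name2 = [NSString
            stringWithCString:[name2escaped cStringUsingEncoding:NSUTF8StringEncoding]
            encoding:NSNonLossyASCIIStringEncoding];
        NSLog(@"name2 = %@", name2);
    }
    return 0;
}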
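
If you need this for every entry in the array, the same idea can be wrapped in a small helper. The following is only a sketch under some assumptions: DecodeUnicodeEscapes is a hypothetical name (not a Foundation API), the code assumes ARC, and the two-element array stands in for the one loaded from the plist. It goes through NSData instead of a C string, and it falls back to the original string when the decode returns nil, which initWithData:encoding: does if the bytes are not valid for the requested encoding.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of Foundation): decodes \uXXXX escapes in
// `escaped`, returning the input unchanged if decoding fails.
static NSString *DecodeUnicodeEscapes(NSString *escaped)
{
    // Escaped text is normally pure ASCII, so its UTF-8 bytes are also the
    // bytes that NSNonLossyASCIIStringEncoding expects.
    NSData *bytes = [escaped dataUsingEncoding:NSUTF8StringEncoding];
    if (bytes == nil) {
        return escaped;
    }
    NSString *decoded = [[NSString alloc] initWithData:bytes
                                              encoding:NSNonLossyASCIIStringEncoding];
    // initWithData:encoding: returns nil when the bytes are not valid for
    // the encoding; keep the original text in that case.
    return decoded != nil ? decoded : escaped;
}

int main(int argc, const char * argv[])
{
    @autoreleasepool {
        // Stand-in for the array read from the plist (assumption).
        NSArray *list = [NSArray arrayWithObjects:
                            @"Nu\\u0161a Florjan\\u010di\\u010d",
                            @"Plain ASCII",
                            nil];
        for (NSString *raw in list) {
            NSLog(@"%@ -> %@", raw, DecodeUnicodeEscapes(raw));
        }
    }
    return 0;
}
```

The fallback matters because a nil result would otherwise silently replace a name with nothing in the UI; whether that is the right behaviour for your app is a judgment call.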
