Objective-C: reading a file with the wrong encoding


Question

Hi all, I have a problem: I download a file from the internet and need to mine some data from it. I open it and try to read it into a buffer, but I get wrong characters because the file is in Czech. My code:

- (void) sync {

    NSString * path = @"/Users/syky/Documents/stats.csv";
    NSFileHandle * fileHandle = [NSFileHandle fileHandleForReadingAtPath:path];
    NSData * buffer = nil;
    while ((buffer = [fileHandle readDataOfLength:1024])) {
        //do something with the buffer

        // encoding:nil is the same as passing 0, which is not one of the declared NSStringEncoding constants
        NSString * s = [[NSString alloc] initWithData:buffer encoding:nil];
        NSLog(@"%@", s);
        break;
    }
}

No matter which encoding I choose I always get broken chars such as

"Poø.";"Jméno"

I need to get:

"Příjmení";"Jméno"

This file was originally generated by Microsoft Excel as a *.csv export. When I open it in any Mac OS X text editor I get broken characters as well, but when I open it on a Windows machine with Microsoft Excel it works just fine.

Thank you for your help

Solution:

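- (void) sync {

    NSString * path = @"/Users/syky/Documents/stats.csv";
    NSFileHandle * fileHandle = [NSFileHandle fileHandleForReadingAtPath:path];
    NSData * buffer = nil;
    // readDataOfLength: returns an empty NSData (not nil) at end of file, so test the length
    while ((buffer = [fileHandle readDataOfLength:1024]).length > 0) {

        // Windows Latin 2 (code page 1250) is the encoding that worked for this Excel export
        NSStringEncoding encoding = CFStringConvertEncodingToNSStringEncoding(kCFStringEncodingWindowsLatin2);
        NSString *string = [[NSString alloc] initWithData:buffer encoding:encoding];

        // Pass the string as an argument rather than as the format string
        NSLog(@"%@", string);

        break;
    }
}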
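
As a side note, Foundation also declares NSWindowsCP1250StringEncoding, which should name the same code page as the converted kCFStringEncodingWindowsLatin2 value, so the Core Foundation conversion step can probably be skipped. The sketch below is not from the original post: the helper names DecodeWindows1250 and ReadWindows1250File are hypothetical, and it assumes the export really was written in Windows-1250.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of the original post): decodes a buffer
// as Windows-1250. Returns nil if the data cannot be decoded.
static NSString *DecodeWindows1250(NSData *data) {
    return [[NSString alloc] initWithData:data
                                 encoding:NSWindowsCP1250StringEncoding];
}

// Hypothetical helper: reads the whole file in one call instead of looping
// over 1024-byte chunks. On failure it returns nil and fills in *error.
static NSString *ReadWindows1250File(NSString *path, NSError **error) {
    return [NSString stringWithContentsOfFile:path
                                     encoding:NSWindowsCP1250StringEncoding
                                        error:error];
}
```

Reading the file in one call avoids the chunking loop entirely, which is reasonable for a small CSV export; for very large files the NSFileHandle loop may still be the better fit.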

Answer

First, I'm not a Czech speaker. Second, I think "use UTF-8" is akin to saying "throw a barrel at it." It's heavy-handed in the same way.

From what I've researched, you could use ISO Latin 2 or Apple's Central European Roman encoding. You'll find the former represented among NSStringEncodings, but not the latter, so look to Core Foundation's support:

NSStringEncoding encoding = CFStringConvertEncodingToNSStringEncoding(kCFStringEncodingMacCentralEurRoman);
NSString *string = [[NSString alloc] initWithData:buffer encoding:encoding];

Otherwise, you could (and probably already have, from what you've said) use:

NSString *string = [[NSString alloc] initWithData:buffer encoding:NSISOLatin2StringEncoding];

I'm really curious to see if using CFStringEncoding encodings improves your situation.

EDIT:

If your source was generated by Microsoft Excel, perhaps kCFStringEncodingWindowsLatin2 will work instead of kCFStringEncodingMacCentralEurRoman. Like before, you'll need to convert it using CFStringConvertEncodingToNSStringEncoding.

There's one other approach you might want to try. Since CFStringRef is "toll-free bridged" to NSString (and so is CFDataRef to NSData), working entirely in Core Foundation might work:

CFStringRef stringRef = CFStringCreateFromExternalRepresentation(kCFAllocatorDefault, (CFDataRef)buffer, kCFStringEncodingMacCentralEurRoman);
NSString *string = (NSString *)stringRef;

In this case, don't forget that stringRef has to be released.
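
If the project is compiled with ARC, the plain casts above need to be bridged casts, and ownership of the created string can be handed to ARC instead of being released manually. The following is a minimal sketch under that assumption; the helper name DecodeWithCoreFoundation is hypothetical and not from the original answer.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (assumes ARC): decodes data with a Core Foundation
// encoding such as kCFStringEncodingWindowsLatin2.
static NSString *DecodeWithCoreFoundation(NSData *data, CFStringEncoding encoding) {
    // __bridge casts NSData to CFDataRef without changing ownership.
    CFStringRef stringRef = CFStringCreateFromExternalRepresentation(
        kCFAllocatorDefault, (__bridge CFDataRef)data, encoding);
    // CFBridgingRelease transfers ownership of the created string to ARC,
    // so no explicit CFRelease is needed. A NULL result becomes nil.
    return CFBridgingRelease(stringRef);
}
```

Passing the encoding as a parameter makes it easy to compare kCFStringEncodingWindowsLatin2 and kCFStringEncodingMacCentralEurRoman on the same buffer.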

Good luck to you in your endeavors.
