Objective-C: reading a file with the wrong encoding
Hi all. I have a problem with a file that I download from the internet and need to extract some data from. I open it and read it into a buffer, but I get the wrong characters because the file is in Czech. My code:
- (void)sync {
    NSString *path = @"/Users/syky/Documents/stats.csv";
    NSFileHandle *fileHandle = [NSFileHandle fileHandleForReadingAtPath:path];
    NSData *buffer = nil;
    while ((buffer = [fileHandle readDataOfLength:1024])) {
        // do something with the buffer
        // encoding:nil is a placeholder; every encoding I tried gave the same result
        NSString *s = [[NSString alloc] initWithData:buffer encoding:nil];
        NSLog(@"%@", s);
        break;
    }
}
No matter which encoding I choose I always get broken chars such as
"Poø.";"Jméno"
I need to get:
"Příjmení";"Jméno"
This file was originally generated by Microsoft Excel as a *.csv export. When I open it in any Mac OS X text editor I get broken characters as well, but when I open it on a Windows machine with Microsoft Excel it displays just fine.
Thank you for your help
Solution:
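- (void)sync {
    NSString *path = @"/Users/syky/Documents/stats.csv";
    NSFileHandle *fileHandle = [NSFileHandle fileHandleForReadingAtPath:path];
    NSData *buffer = nil;
    // readDataOfLength: returns empty data (not nil) at end of file, so test the length
    while ((buffer = [fileHandle readDataOfLength:1024]).length > 0) {
        // Excel on Windows exported the file as Windows Latin 2 (code page 1250)
        NSStringEncoding encoding = CFStringConvertEncodingToNSStringEncoding(kCFStringEncodingWindowsLatin2);
        NSString *string = [[NSString alloc] initWithData:buffer encoding:encoding];
        NSLog(@"%@", string);
        break; // only the first chunk is inspected here
    }
}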
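A possibly simpler variant, sketched under a few assumptions: Windows Latin 2 (code page 1250) is a single-byte encoding, so the 1024-byte chunks above decode cleanly, but the same loop would split characters in a multi-byte encoding such as UTF-8. Decoding the whole file in one call sidesteps that. The helper name `ReadWindowsLatin2File` is hypothetical, and the sketch assumes the file is small enough to load into memory.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: loads an entire file and decodes it as Windows Latin 2
// (code page 1250). Returns nil and fills *error if the file cannot be read
// or its bytes cannot be decoded.
static NSString *ReadWindowsLatin2File(NSString *path, NSError **error) {
    return [NSString stringWithContentsOfFile:path
                                     encoding:NSWindowsCP1250StringEncoding
                                        error:error];
}
```

It could be called as `ReadWindowsLatin2File(@"/Users/syky/Documents/stats.csv", &error)`. `NSWindowsCP1250StringEncoding` is a built-in `NSStringEncoding` constant, so no Core Foundation conversion is needed for this particular code page.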
First, I'm not a Czech speaker. Second, I think "use UTF-8" is akin to saying "throw a barrel at it." It's heavy-handed in the same way.
From what I've researched, you could use ISO Latin 2 or Apple's Central European Roman encoding. You'll find the former among the NSStringEncoding constants, but not the latter, so look to Core Foundation's support:
NSStringEncoding encoding = CFStringConvertEncodingToNSStringEncoding(kCFStringEncodingMacCentralEurRoman);
NSString *string = [[NSString alloc] initWithData:buffer encoding:encoding];
Otherwise, you could (and probably already have, from what you've said) use:
NSString *string = [[NSString alloc] initWithData:buffer encoding:NSISOLatin2StringEncoding];
I'm really curious to see if using the CFStringEncoding encodings improves your situation.
EDIT:
If your source was generated by Microsoft Excel, perhaps kCFStringEncodingWindowsLatin2 will work instead of kCFStringEncodingMacCentralEurRoman. As before, you'll need to convert it using CFStringConvertEncodingToNSStringEncoding.
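When it is unclear which of these encodings a file uses, one option is to decode the same bytes with each candidate and compare the results by eye. The following is a minimal sketch of that idea, not a detection algorithm; the helper name `DecodeWithCandidates` and the dictionary keys are hypothetical, and ARC is assumed.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: decodes the same bytes with each candidate Central
// European encoding and returns the results keyed by a display name.
static NSDictionary<NSString *, NSString *> *DecodeWithCandidates(NSData *data) {
    struct { const char *name; CFStringEncoding cfEncoding; } candidates[] = {
        { "ISO Latin 2",          kCFStringEncodingISOLatin2 },
        { "Windows Latin 2",      kCFStringEncodingWindowsLatin2 },
        { "Mac Central European", kCFStringEncodingMacCentralEurRoman },
    };
    NSMutableDictionary<NSString *, NSString *> *results = [NSMutableDictionary dictionary];
    for (size_t i = 0; i < sizeof candidates / sizeof candidates[0]; i++) {
        NSStringEncoding encoding =
            CFStringConvertEncodingToNSStringEncoding(candidates[i].cfEncoding);
        NSString *decoded = [[NSString alloc] initWithData:data encoding:encoding];
        // initWithData:encoding: returns nil when the bytes are invalid for the encoding.
        results[@(candidates[i].name)] = decoded ?: @"(could not decode)";
    }
    return results;
}
```

Logging the returned dictionary for the first chunk of the file should make it fairly obvious which candidate yields readable Czech. Note that ISO Latin 2 and Windows Latin 2 agree on many letters (ř and í, for example) and differ on others such as š and ž, so a sample containing those letters is the more telling test.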
There's one other approach you might want to try. Since CFStringRef is toll-free bridged to NSString (and CFDataRef to NSData), working entirely in Core Foundation might work:
CFStringRef stringRef = CFStringCreateFromExternalRepresentation(kCFAllocatorDefault, (CFDataRef)buffer, kCFStringEncodingMacCentralEurRoman);
NSString *string = (NSString *)stringRef;
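In this case, don't forget that stringRef has to be released, because it comes from a Create function: call CFRelease on it, or under ARC transfer ownership with CFBridgingRelease.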
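For code compiled with ARC, the Core Foundation route might look like the sketch below. The helper name `DecodeViaCoreFoundation` is hypothetical, and the encoding is passed in by the caller so that either kCFStringEncodingMacCentralEurRoman or kCFStringEncodingWindowsLatin2 can be tried.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: decodes data through Core Foundation and hands the
// result to ARC. Returns nil if the bytes cannot be converted.
static NSString *DecodeViaCoreFoundation(NSData *data, CFStringEncoding encoding) {
    CFStringRef stringRef = CFStringCreateFromExternalRepresentation(
        kCFAllocatorDefault, (__bridge CFDataRef)data, encoding);
    // CFBridgingRelease transfers the +1 reference from the Create call to ARC,
    // so no explicit CFRelease is needed afterwards.
    return CFBridgingRelease(stringRef);
}
```

Without ARC, the plain cast shown in the answer works, and the caller is responsible for calling CFRelease (or sending release) exactly once.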
Good luck to you in your endeavors.