NSXMLParser chokes on ampersand &
Question
I'm parsing some HTML with NSXMLParser and it hits a parser error anytime it encounters an ampersand. I could filter out ampersands before I parse it, but I'd rather parse everything that's there.
It's giving me error 68, NSXMLParserNAMERequiredError: Name is required.
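For context on the error itself: in XML, an ampersand must begin a reference such as &amp; or &#38;, so a parser that sees a bare & followed by something that is not a name reports that a name is required. The sketch below is only a diagnostic aid for locating such ampersands, not a fix. The function names are hypothetical, and the name rules are simplified to ASCII rather than the full XML grammar.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns the length of a well-formed reference starting at s[0] == '&',
 * or 0 if the ampersand does not begin one. ASCII-only simplification. */
static size_t reference_length(const char *s)
{
    size_t i = 1;                          /* skip the '&' */
    if (s[i] == '#') {                     /* character reference: &#38; or &#x26; */
        i++;
        int hex = (s[i] == 'x');
        if (hex) i++;
        size_t first_digit = i;
        while (hex ? isxdigit((unsigned char)s[i]) : isdigit((unsigned char)s[i]))
            i++;
        if (i == first_digit) return 0;    /* no digits at all */
    } else {                               /* entity reference: &amp; */
        if (!(isalpha((unsigned char)s[i]) || s[i] == '_' || s[i] == ':'))
            return 0;                      /* no name after '&' */
        while (isalnum((unsigned char)s[i]) || s[i] == '_' || s[i] == ':' ||
               s[i] == '-' || s[i] == '.')
            i++;
    }
    return s[i] == ';' ? i + 1 : 0;        /* a reference must end with ';' */
}

/* Returns the offset of the first '&' that does not start a reference,
 * or -1 if every ampersand in text is part of one. */
long find_bare_ampersand(const char *text)
{
    for (size_t i = 0; text[i] != '\0'; i++) {
        if (text[i] == '&') {
            size_t len = reference_length(text + i);
            if (len == 0) return (long)i;
            i += len - 1;                  /* jump to the closing ';' */
        }
    }
    return -1;
}
```

Running find_bare_ampersand over the HTML before parsing shows which ampersands an XML parser would reject. It says nothing about the character set, which is a separate matter from this error.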
My best guess is that it's a character set issue. I'm a little fuzzy on the world of character sets, so I'm thinking my ignorance is biting me in the ass. The source HTML uses charset iso-8859-1, so I'm using this code to initialize the Parser:
NSString *dataString = [[[NSString alloc] initWithData:data encoding:NSISOLatin1StringEncoding] autorelease];
// dataUsingEncoding:allowLossyConversion: already returns an autoreleased object, so it must not be autoreleased again
NSData *dataEncoded = [dataString dataUsingEncoding:NSUTF8StringEncoding allowLossyConversion:YES];
NSXMLParser *theParser = [[NSXMLParser alloc] initWithData:dataEncoded];
Any ideas?
Recommended answer
To the other posters: of course the XML is invalid... it's HTML!
You probably shouldn't be trying to use NSXMLParser for HTML; use libxml2 instead, which includes a parser meant for HTML.
For a closer look at why, check out this article.