Getting Wikipedia Article Summary using NSScanner Problem
Question
I am trying to get the summary of an article and download it as a string. This works well for some articles, but Wikipedia's markup is inconsistent, so my NSScanner approach often fails on others.
Here's my NSScanner implementation:
// `string` holds the downloaded HTML of the article.
NSString *separatorString = @"<table id=\"toc\" class=\"toc\">";
NSString *muString = @"</table>";
NSString *container = nil;
// A new scanner already starts at location 0.
NSScanner *aScanner = [NSScanner scannerWithString:string];
// Skip everything up to and including the first closing </table> tag.
[aScanner scanUpToString:muString intoString:nil];
[aScanner scanString:muString intoString:nil];
// Capture the text between that tag and the table of contents.
[aScanner scanUpToString:separatorString intoString:&container];
How could this be improved? Or is there another way of getting this?
To show which part of the article I want, here's an example:
http://en.wikipedia.org/wiki/Indigo
From this I'd want everything from "Indigo is the color on the electromagnetic spectrum" to "in English was in 1289".
Thanks!
Recommended answer
You could use WebKit's DOM API to walk the actual structure, rather than trying to parse the text blindly.
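A minimal sketch of that idea, assuming the legacy WebKit `WebView` on macOS and assuming the lead paragraphs are `<p>` elements that sit as siblings directly before the `<table id="toc">` element the question already targets. `LeadSummaryFromDocument` is a hypothetical helper name, and the sibling layout is an assumption about Wikipedia's markup that may not hold for every article:

```objective-c
#import <WebKit/WebKit.h>

// Hypothetical helper: returns the text of every <p> element that precedes
// the table of contents among its siblings, or nil if no element with
// id "toc" exists in the document.
static NSString *LeadSummaryFromDocument(DOMDocument *document) {
    DOMElement *toc = [document getElementById:@"toc"];
    if (toc == nil) {
        return nil;
    }

    NSMutableArray *paragraphs = [NSMutableArray array];
    // Walk backwards from the TOC so that infoboxes, images and other
    // non-paragraph siblings are skipped rather than scanned as text.
    for (DOMNode *node = [toc previousSibling]; node != nil; node = [node previousSibling]) {
        if ([[node nodeName] caseInsensitiveCompare:@"p"] == NSOrderedSame) {
            // Insert at the front to restore document order.
            [paragraphs insertObject:[node textContent] atIndex:0];
        }
    }
    return [paragraphs componentsJoinedByString:@"\n\n"];
}
```

One way to use it would be to call `LeadSummaryFromDocument([frame DOMDocument])` from the frame load delegate's `webView:didFinishLoadForFrame:` once the article has loaded. Because `textContent` returns text without tags, the result needs no further HTML stripping.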