Getting Wikipedia Article Summary using NSScanner Problem


Problem description

I am trying to get the summary of an article and download it as a string. This works great with some articles, but the Wikipedia markup is inconsistent from page to page, so my NSScanner code fails fairly often even though it works fine for other articles.

Here's my NSScanner implementation:

// `string` holds the article's HTML source (defined elsewhere).
NSString *separatorString = @"<table id=\"toc\" class=\"toc\">"; // start of the table of contents
NSString *muString = @"</table>";                                 // end of the first table on the page
NSString *container = nil;

NSScanner *aScanner = [NSScanner scannerWithString:string];
// Skip everything up to and including the first closing </table> tag.
[aScanner scanUpToString:muString intoString:NULL];
[aScanner scanString:muString intoString:NULL];

// Capture everything between that point and the table of contents.
[aScanner scanUpToString:separatorString intoString:&container];

How could this be improved? Or is there another way of getting this?

To visualize which bit of the article I want, here's an example:

http://en.wikipedia.org/wiki/Indigo

From this page I'd want everything from "Indigo is the color on the electromagnetic spectrum" to "in English was in 1289".

Thanks!

Recommended answer

You could use WebKit's DOM API to walk the actual structure, rather than trying to parse the text blindly.
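
As a rough illustration of that suggestion, the sketch below collects the text of the paragraphs that precede the table of contents. It is only a sketch under several assumptions that are not stated in the answer: the page has been loaded into a legacy macOS `WebView`, the table of contents carries the id `toc` (as in the question's code), and the lead paragraphs are sibling `<p>` elements of that table. The helper name `LeadSummaryFromDocument` is hypothetical.

```objc
#import <WebKit/WebKit.h>

// Hypothetical helper: returns the text of the <p> elements that precede the
// table of contents, or nil if no table of contents is found.
// Assumes `document` comes from a fully loaded WebView, for example
// [[webView mainFrame] DOMDocument] inside webView:didFinishLoadForFrame:.
static NSString *LeadSummaryFromDocument(DOMDocument *document) {
    // Assumption: the table of contents has the id "toc", as in the question.
    DOMElement *toc = [document getElementById:@"toc"];
    if (toc == nil) {
        return nil; // e.g. a short article without a table of contents
    }

    NSMutableArray *paragraphs = [NSMutableArray array];
    // Walk backwards through the siblings that come before the table of contents.
    for (DOMNode *node = [toc previousSibling]; node != nil; node = [node previousSibling]) {
        // Compare case-insensitively, since HTML documents report upper-case names.
        if ([[node nodeName] caseInsensitiveCompare:@"p"] == NSOrderedSame) {
            NSString *text = [node textContent];
            if ([text length] > 0) {
                [paragraphs insertObject:text atIndex:0]; // preserve document order
            }
        }
    }
    return [paragraphs componentsJoinedByString:@"\n\n"];
}
```

Because this reads element structure instead of matching literal tag strings, it should not depend on the exact attributes of the `<table>` tags, although it still relies on the assumed page layout and will need adjusting if the markup differs.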
