HTML parsing: how to get link tag from remote site
I have a site (for example, apple.com) that contains a link tag such as:
<link rel="alternate" type="application/rss+xml" title="RSS" href="http://images.apple.com/main/rss/hotnews/hotnews.rss" />
How can I get the title ("RSS") and the href from it?
Update 1: I've tried to load the page into a string using:
NSData *data = [NSURLConnection sendSynchronousRequest:[NSURLRequest requestWithURL:[NSURL URLWithString:@"http://apple.com/"]] returningResponse:NULL error:NULL];
NSString *HTMLWithFeeds = [[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding];
But I don't know what to do next.
Update 2:
It wasn't clear from my post, but in addition I need to find the link on this site whose type is "application/rss+xml".
You might try using regular expressions:
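NSError *error = nil;
// Quotes inside the Objective-C string literal must be escaped with a backslash.
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"<link.*?href=\"(.*?)\".*?>"
                                                                       options:NSRegularExpressionCaseInsensitive
                                                                         error:&error];
// `string` is the HTML to search, e.g. HTMLWithFeeds from the question.
NSArray *matches = [regex matchesInString:string
                                  options:0
                                    range:NSMakeRange(0, [string length])];
for (NSTextCheckingResult *match in matches) {
    NSRange matchRange = [match range];          // the whole <link ...> tag
    NSRange hrefRange = [match rangeAtIndex:1];  // capture group 1: the href value
    NSString *href = [string substringWithRange:hrefRange];
    NSLog(@"%@", href);
}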
Apple's documentation has some examples of how to further use and access the matches.
For example, a regex like the following should work for the hrefs:
<link.*?href="(.*?)".*?>
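The question also asks for the title and for only those links whose type is "application/rss+xml". Building on the regex approach above, here is a minimal sketch of one way to do that: match each link tag first, then read the attributes from it individually so their order inside the tag does not matter. The helper names AttributeValue and RSSLinksInHTML are hypothetical and not part of any Apple API, and the sketch assumes attribute values are quoted. Regular expressions are fragile on real-world HTML, so treat it as a starting point rather than a robust parser.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the quoted value of attribute `name`
// inside a single tag string, or nil if the attribute is absent.
static NSString *AttributeValue(NSString *tag, NSString *name) {
    NSString *pattern =
        [NSString stringWithFormat:@"\\b%@\\s*=\\s*[\"']([^\"']*)[\"']",
                                   [NSRegularExpression escapedPatternForString:name]];
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:NULL];
    NSTextCheckingResult *match = [regex firstMatchInString:tag
                                                    options:0
                                                      range:NSMakeRange(0, tag.length)];
    if (match == nil) {
        return nil;
    }
    return [tag substringWithRange:[match rangeAtIndex:1]];
}

// Hypothetical helper: collects the title and href of every <link> tag
// whose type attribute is application/rss+xml.
static NSArray<NSDictionary<NSString *, NSString *> *> *RSSLinksInHTML(NSString *html) {
    // [^>]* keeps each match inside one tag and also spans line breaks.
    NSRegularExpression *linkRegex =
        [NSRegularExpression regularExpressionWithPattern:@"<link\\b[^>]*>"
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:NULL];
    NSMutableArray<NSDictionary<NSString *, NSString *> *> *feeds = [NSMutableArray array];
    NSArray<NSTextCheckingResult *> *matches =
        [linkRegex matchesInString:html options:0 range:NSMakeRange(0, html.length)];
    for (NSTextCheckingResult *match in matches) {
        NSString *tag = [html substringWithRange:match.range];
        NSString *type = AttributeValue(tag, @"type");
        // Skip tags without a type, or with a type other than RSS.
        if (type == nil ||
            [type caseInsensitiveCompare:@"application/rss+xml"] != NSOrderedSame) {
            continue;
        }
        NSString *href = AttributeValue(tag, @"href");
        if (href == nil) {
            continue;
        }
        // A missing title is stored as an empty string.
        NSString *title = AttributeValue(tag, @"title") ?: @"";
        [feeds addObject:@{@"title": title, @"href": href}];
    }
    return feeds;
}
```

Calling RSSLinksInHTML(HTMLWithFeeds) with the string from the question returns an array of dictionaries, each with a "title" and an "href" key. Matching the whole tag first and then reading each attribute separately avoids having to encode every possible attribute order in a single pattern.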