HTML parsing: how to get link tag from remote site


Problem description

I have a site (for example apple.com) which contains a link tag, for example:

<link rel="alternate" type="application/rss+xml" title="RSS" href="http://images.apple.com/main/rss/hotnews/hotnews.rss" />

So how can I get the title "RSS" and the href from it?

Update 1: I've tried to convert the site into a string using

NSData *data = [NSURLConnection sendSynchronousRequest:[NSURLRequest requestWithURL:[NSURL URLWithString:@"http://apple.com/"]] returningResponse:NULL error:NULL];
NSString *HTMLWithFeeds = [[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding];

But I don't know what to do now...

Update 2:

It is not clear from my post, but in addition it should find the link on this site with type="application/rss+xml".

Solution

You might try using regular expressions:

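NSError *error = nil;
// Quotes inside the pattern must be escaped in an Objective-C string literal.
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"<link.*?href=\"(.*?)\".*?>"
                                                                           options:NSRegularExpressionCaseInsensitive
                                                                             error:&error];

// `string` is the HTML to search, e.g. HTMLWithFeeds from the question.
NSArray *matches = [regex matchesInString:string
                                  options:0
                                    range:NSMakeRange(0, [string length])];
for (NSTextCheckingResult *match in matches) {
     NSRange matchRange = [match range];          // the whole <link ...> tag
     // The pattern has only one capture group, so index 1 is the last valid index.
     NSRange hrefRange = [match rangeAtIndex:1];  // the href value
     NSString *href = [string substringWithRange:hrefRange];
}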

Apple's documentation has some examples of how to further use and access the matches:

https://developer.apple.com/library/ios/#documentation/Foundation/Reference/NSRegularExpression_Class/Reference/Reference.html

For example, something like the following regex should work for the hrefs:

<link.*?href="(.*?)".*?>
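
The question also asks for the title and for restricting the search to type="application/rss+xml", which the pattern above does not cover. One possible way to handle that, as a rough sketch rather than a definitive implementation: match each link tag first, then read its attributes one by one so their order does not matter. The helper names AttributeValue and RSSLinks are hypothetical (they are not part of Foundation or of the original answer), and the sketch assumes attribute values are double-quoted and that a tag contains no ">" inside a value.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the value of attribute `name` inside one tag,
// or nil if the attribute is absent. Assumes double-quoted values.
static NSString *AttributeValue(NSString *tag, NSString *name) {
    NSString *pattern =
        [NSString stringWithFormat:@"\\s%@\\s*=\\s*\"([^\"]*)\"",
                                   [NSRegularExpression escapedPatternForString:name]];
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:NULL];
    NSTextCheckingResult *match =
        [regex firstMatchInString:tag options:0 range:NSMakeRange(0, tag.length)];
    return match ? [tag substringWithRange:[match rangeAtIndex:1]] : nil;
}

// Hypothetical helper: collects title/href pairs from every <link> tag whose
// type is application/rss+xml.
static NSArray *RSSLinks(NSString *html) {
    NSRegularExpression *linkRegex =
        [NSRegularExpression regularExpressionWithPattern:@"<link\\b[^>]*>"
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:NULL];
    NSMutableArray *result = [NSMutableArray array];
    NSArray *matches = [linkRegex matchesInString:html
                                          options:0
                                            range:NSMakeRange(0, html.length)];
    for (NSTextCheckingResult *match in matches) {
        NSString *tag = [html substringWithRange:match.range];

        // Skip tags without a type, or with a type other than RSS.
        NSString *type = AttributeValue(tag, @"type");
        if (type == nil ||
            [type caseInsensitiveCompare:@"application/rss+xml"] != NSOrderedSame) {
            continue;
        }

        NSString *href = AttributeValue(tag, @"href");
        if (href == nil) continue;

        // A missing title is stored as an empty string.
        NSString *title = AttributeValue(tag, @"title") ?: @"";
        [result addObject:@{ @"title": title, @"href": href }];
    }
    return result;
}

int main(void) {
    @autoreleasepool {
        // Sample input: the tag from the question.
        NSString *html = @"<link rel=\"alternate\" type=\"application/rss+xml\" "
                         @"title=\"RSS\" href=\"http://images.apple.com/main/rss/hotnews/hotnews.rss\" />";
        for (NSDictionary *link in RSSLinks(html)) {
            NSLog(@"%@ -> %@", link[@"title"], link[@"href"]);
        }
    }
    return 0;
}
```

In the question's setup, HTMLWithFeeds would be passed to RSSLinks in place of the sample string. Regular expressions are fragile on real-world HTML (single-quoted or unquoted attributes would be missed here), so treat this as a starting point only.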
