How to get all <img src> of a web page in iOS UIWebView?


Question

Hi everyone,

I'm trying to get all the image URLs of the current page in a UIWebView.

Here is my code.

- (void)webViewDidFinishLoad:(UIWebView*)webView {
    NSString *firstImageUrl = [self.webView stringByEvaluatingJavaScriptFromString:@"var images = document.getElementsByTagName('img');images[0].src.toString();"];
    NSString *imageUrls = [self.webView stringByEvaluatingJavaScriptFromString:@"var images= document.getElementsByTagName('img');var imageUrls = "";for(var i = 0; i < images.length; i++){var image = images[i];imageUrls += image.src;imageUrls += \\’,\\’;}imageUrls.toString();"];
    NSLog(@"firstUrl : %@", firstImageUrl);
    NSLog(@"images : %@",imageUrls);
}

The first NSLog prints the correct image src, but the second NSLog prints nothing.

2013-01-25 00:51:23.253 WebDemo[3416:907] firstUrl: https://www.paypalobjects.com/en_US/i/scr/pixel.gif
2013-01-25 00:51:23.254 WebDemo[3416:907] images :

I don't know why. Please help me...

Thanks.

Recommended answer

Perrohunter pointed out one NSRegularExpression solution, which is great. If you don't want to enumerate the array of matches, you can use the block-based enumerateMatchesInString method, too:

NSError *error = nil;
// Capture group 2 of the pattern holds the quoted value of each src attribute.
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"(<img\\s[\\s\\S]*?src\\s*?=\\s*?['\"](.*?)['\"][\\s\\S]*?>)+?"
                                                                       options:NSRegularExpressionCaseInsensitive
                                                                         error:&error];
if (!regex) {
    NSLog(@"Could not compile the regex: %@", error);
}

// Walk the matches one at a time instead of building an array of all of them.
[regex enumerateMatchesInString:yourHTMLSourceCodeString
                        options:0
                          range:NSMakeRange(0, [yourHTMLSourceCodeString length])
                     usingBlock:^(NSTextCheckingResult *result, NSMatchingFlags flags, BOOL *stop) {
                         NSString *img = [yourHTMLSourceCodeString substringWithRange:[result rangeAtIndex:2]];
                         NSLog(@"img src %@", img);
                     }];

I've also updated the regex pattern to deal with the following issues:


  • there can be attributes between the start img tag and the src attribute;
  • there can be attributes after the src attribute and before the >;
  • there can be newline characters in the middle of an img tag (a plain . matches everything except a newline character, so the pattern uses [\s\S] instead);
  • the src attribute value can be quoted with ' as well as "; and
  • there can be spaces between src and the = as well as between the = and the subsequent value.
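
To see the cases listed above in action, the same pattern can be tried outside the app. The sketch below is only an illustration: it rewrites the pattern as a JavaScript regex literal (single backslashes, because there is no Objective-C string escaping), and the helper name extractImageSources and the sample tags are made up for this example rather than taken from the answer.

```javascript
// Same pattern as the NSRegularExpression above, as a JavaScript literal.
// The g flag finds every match; the i flag mirrors NSRegularExpressionCaseInsensitive.
const IMG_SRC_PATTERN = /(<img\s[\s\S]*?src\s*?=\s*?['"](.*?)['"][\s\S]*?>)+?/gi;

// Hypothetical helper: returns every captured src value in document order.
function extractImageSources(html) {
  const sources = [];
  for (const match of html.matchAll(IMG_SRC_PATTERN)) {
    sources.push(match[2]); // group 2 is the quoted src value
  }
  return sources;
}

// Made-up sample tags, one per case from the list above.
const sampleHtml = [
  '<img src="a.png">',                        // simplest form
  "<img alt='x' src = 'b.jpg' width=\"10\">", // attributes around src, spaces around =, single quotes
  '<img\n  class="c"\n  SRC="c.gif"\n/>',     // newlines inside the tag, upper-case attribute name
].join('\n');

console.log(extractImageSources(sampleHtml)); // [ 'a.png', 'b.jpg', 'c.gif' ]
```

The pattern is still not an HTML parser. For example, a tag such as <img data-src="lazy.png" src="real.png"> yields lazy.png, because the lazy scan stops at the first occurrence of src followed by = and a quote.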

I freely recognize that reading regex patterns is painful for the uninitiated, and perhaps other solutions might make more sense (the JSON suggestion by Joris, using scanners, etc.). But if you wanted to use regex, the above pattern might cover a few more permutations of the img tag, and enumerateMatchesInString might be ever so slightly more efficient than matchesInString.
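
As one possible reading of the JSON suggestion mentioned above, the page script could collect the src values into an array and return it as a JSON string, which avoids hand-built separators altogether. This is a sketch under assumptions, not the code from Joris's answer: the function name collectImageSources and the stub document are invented here. It may also explain the empty second log line in the question: as posted, that script contains an unescaped "" inside the Objective-C string literal and typographic quotes (’) around the comma, and either would likely produce invalid JavaScript.

```javascript
// Sketch of a script that could be passed to
// stringByEvaluatingJavaScriptFromString:. `doc` stands in for the page's
// `document`, so the function can also be tried outside a web view.
function collectImageSources(doc) {
  var images = doc.getElementsByTagName('img');
  var urls = [];
  for (var i = 0; i < images.length; i++) {
    urls.push(images[i].src);
  }
  // A JSON string survives the trip back to native code without ad hoc separators.
  return JSON.stringify(urls);
}

// Inside the web view, the script would end with the expression:
//   collectImageSources(document);
// Outside a browser, a stub object (hypothetical data) is enough to try it.
var stubDocument = {
  getElementsByTagName: function (tagName) {
    return tagName === 'img'
      ? [{ src: 'https://example.com/a.png' }, { src: 'https://example.com/b.png' }]
      : [];
  }
};

console.log(collectImageSources(stubDocument));
```

On the native side, the returned string could then be decoded with NSJSONSerialization into an NSArray of URL strings. Because the script contains only single-quoted JavaScript strings, it can be embedded in an Objective-C string literal without escaping any double quotes.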
