Image problems with regular expressions
Question
When I run the following script, the image is not rendered properly. What is the problem? Here is the code:
<?php
header('Content-Type: text/html; charset=utf-8');
$url = "http://www.asaphshop.nl/epages/asaphnl.sf/nl_NL/ObjectPath=/Shops/asaphnl/Products/80203122";
$htmlcode = file_get_contents($url);
$pattern = "/class=\"noscript\"\>(.*)\<\/div\>/isU";
preg_match_all($pattern, $htmlcode, $matches);
//print_r ($matches);
$image = ($matches[0][0]);
print_r ($image);
?>
This is the part of the page that contains the link I need to copy (the data-src-l attribute):
<div id="ProductImages" class="noscript">
<ul>
<li>
<a href="/WebRoot/products/8020/80203122/bilder/80203122.jpg">
<img itemprop="image" alt="Jesus Remember Me - Taize Songs (2CD)"
src="/WebRoot/AsaphNL/Shops/asaphnl/5422/8F43/62EE/D698/EF8E/4DEB/AED5/3B0E/80203122_xs.jpg"
data-src-xs="/WebRoot/AsaphNL/Shops/asaphnl/5422/8F43/62EE/D698/EF8E/4DEB/AED5/3B0E/80203122_xs.jpg"
data-src-s="/WebRoot/products/8020/80203122/bilder/80203122_s.jpg"
data-src-m="/WebRoot/products/8020/80203122/bilder/80203122_m.jpg"
data-src-l="/WebRoot/products/8020/80203122/bilder/80203122.jpg"
/>
</a>
</li>
</ul>
</div>
Recommended answer
$pattern = "#class=\"noscript\">.*data-src-l=([\"'])(?<url>.*)\\1.*</div>#isU";
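A minimal sketch of how the pattern above might be applied. The sample markup is abbreviated from the question, and the variable name $imageUrl is only illustrative. The named group url holds the captured path, so there is no need to print the whole match as the original script does with $matches[0][0].

```php
<?php
// Sample markup, abbreviated from the HTML shown in the question.
$htmlcode = <<<'HTML'
<div id="ProductImages" class="noscript">
<ul><li>
<a href="/WebRoot/products/8020/80203122/bilder/80203122.jpg">
<img itemprop="image" alt="Jesus Remember Me - Taize Songs (2CD)"
data-src-m="/WebRoot/products/8020/80203122/bilder/80203122_m.jpg"
data-src-l="/WebRoot/products/8020/80203122/bilder/80203122.jpg"
/>
</a>
</li></ul>
</div>
HTML;

// Pattern from the answer: i = case-insensitive, s = dot matches newlines,
// U = quantifiers are ungreedy by default.
$pattern = "#class=\"noscript\">.*data-src-l=([\"'])(?<url>.*)\\1.*</div>#isU";

$imageUrl = null;
if (preg_match($pattern, $htmlcode, $matches)) {
    // The named group contains only the attribute value, without quotes.
    $imageUrl = $matches['url'];
}

echo $imageUrl, PHP_EOL;
```

The captured value is a site-relative path. To display the image from a page on another host, the shop's origin (http://www.asaphshop.nl in the question) would presumably have to be prepended.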
But it is better to work with the page as a DOM structure than as a string. In the pattern, \\1 is a backreference to ([\"']), so the closing quote must be the same character as the opening one. This is not strictly necessary for URLs, since they should not contain unescaped quotes, but it is useful in general.
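For comparison, the DOM approach mentioned above could look roughly like this. It is a sketch that assumes PHP's bundled DOM extension is available; the XPath expression and the variable names are illustrative choices, not part of the original answer.

```php
<?php
// Sample markup, abbreviated from the HTML shown in the question.
$htmlcode = <<<'HTML'
<div id="ProductImages" class="noscript">
<ul><li>
<a href="/WebRoot/products/8020/80203122/bilder/80203122.jpg">
<img itemprop="image" alt="Jesus Remember Me - Taize Songs (2CD)"
data-src-m="/WebRoot/products/8020/80203122/bilder/80203122_m.jpg"
data-src-l="/WebRoot/products/8020/80203122/bilder/80203122.jpg"
/>
</a>
</li></ul>
</div>
HTML;

$doc = new DOMDocument();

// Real pages are rarely valid HTML, so collect parser warnings
// internally instead of printing them, then restore the old setting.
$previous = libxml_use_internal_errors(true);
$doc->loadHTML($htmlcode);
libxml_clear_errors();
libxml_use_internal_errors($previous);

// Find <img> elements inside the ProductImages container
// that carry a data-src-l attribute.
$xpath = new DOMXPath($doc);
$imgs = $xpath->query('//div[@id="ProductImages"]//img[@data-src-l]');

$imageUrl = null;
if ($imgs !== false && $imgs->length > 0) {
    $imageUrl = $imgs->item(0)->getAttribute('data-src-l');
}

echo $imageUrl, PHP_EOL;
```

Unlike the regular expression, this does not depend on attribute order, quoting style, or where the closing </div> happens to be.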
P.S. If you need everything from <img to /> (inclusive), use:
$pattern = '#class="noscript">.*(<img.*>).*</div>#isU';
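A short sketch of this variant, again on abbreviated sample markup; $imgTag is an illustrative name. Because of the U (ungreedy) flag, <img.*> stops at the first >, which assumes that no attribute value contains a literal > character.

```php
<?php
// Sample markup, abbreviated from the HTML shown in the question.
$htmlcode = <<<'HTML'
<div id="ProductImages" class="noscript">
<ul><li>
<a href="/WebRoot/products/8020/80203122/bilder/80203122.jpg">
<img itemprop="image" alt="Jesus Remember Me - Taize Songs (2CD)"
data-src-l="/WebRoot/products/8020/80203122/bilder/80203122.jpg"
/>
</a>
</li></ul>
</div>
HTML;

$pattern = '#class="noscript">.*(<img.*>).*</div>#isU';

$imgTag = null;
if (preg_match($pattern, $htmlcode, $matches)) {
    // Group 1 is the whole tag, from "<img" up to and including "/>".
    $imgTag = $matches[1];
}

echo $imgTag, PHP_EOL;
```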