How can I parse <img src> with a regex?


Question

I need a clever regex to match the ... in each of these:

<img src="..."
<img src='...'
<img src=...

I want to match the inner content of src, but only if it is surrounded by a matching pair of ", a matching pair of ', or no quotes at all. This means that <img src=..." and <img src='... must not be accepted.

Any ideas on how to match these three cases with a single regex?

So far I use something like ("|'|[\s\S])(.*?)\1. The part I want to get rid of is the hacky [\s\S], which I use to match the "missing symbol" at the beginning and the end of the ....

Recommended answer

Wow, this is the second one of these I'm answering today.

Don't parse HTML with regex. Use an HTML/XML parser and your life will be much easier. Tidy will clean up your HTML code for you, so you can run the HTML through Tidy first and then through a parser. Some Tidy-based libraries perform parsing in addition to sanitizing, so you may not even have to run the result through another parser.

Java, for example, has JTidy, and PHP has PHP Tidy.
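As a rough illustration of the parser approach, here is a minimal sketch in Python using the standard library's html.parser module (a swapped-in example, not JTidy or PHP Tidy, and it does no Tidy-style cleanup). The names ImgSrcCollector and extract_img_sources are hypothetical, chosen only for this example.

```python
from html.parser import HTMLParser


class ImgSrcCollector(HTMLParser):
    """Hypothetical helper: collects the src value of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names and hands over
        # attribute values with any surrounding quotes already removed.
        if tag != "img":
            return
        for name, value in attrs:
            if name == "src" and value is not None:
                self.sources.append(value)


def extract_img_sources(html):
    """Return the src values of all <img> tags in the given HTML string."""
    parser = ImgSrcCollector()
    parser.feed(html)
    parser.close()
    return parser.sources
```

Because the parser deals with the quoting itself, the double-quoted, single-quoted, and unquoted forms from the question all come back as plain strings, and src does not need to be the first attribute.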

Update

Against my better judgement, I'm giving you this:

/]+)>/

This works only for your specific case. Even so, it does not account for escaped " or ' characters in your image-source names, or for a > character inside them. There are probably a bunch of other limitations as well. The capturing group gives you your image names (for names surrounded by single or double quotes, it gives you the quotes as well, but you can strip those out).
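For the three cases in the question, one possible sketch in Python is shown below. It is not the regex from this answer: it is an alternative that makes the opening quote optional and requires the same quote (or nothing) to close the value. The names IMG_SRC and img_src are hypothetical, and the pattern assumes src is the first attribute and that the value contains no whitespace, quotes, or > characters.

```python
import re

# Assumption: src directly follows "<img", as in the question's examples.
IMG_SRC = re.compile(
    r"""<img\s+src\s*=\s*   # tag name and attribute
        (["']?)             # group 1: optional opening quote
        ([^"'\s>]+)         # group 2: the value, without quotes
        \1                  # the same quote again, or nothing if unquoted
        (?=[\s>]|$)         # then whitespace, a closing bracket, or end of input
    """,
    re.VERBOSE | re.IGNORECASE,
)


def img_src(tag):
    """Return the src value, or None if the quoting is missing or mismatched."""
    match = IMG_SRC.search(tag)
    return match.group(2) if match else None
```

With this pattern, img_src('<img src="a.png">'), img_src("<img src='a.png'>"), and img_src("<img src=a.png>") each return 'a.png', while the mismatched forms <img src=a.png"> and <img src='a.png> return None. Unlike the capture described above, group 2 here excludes the quotes, so nothing needs stripping afterwards. The other caveats about escaped characters still apply.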
