Strip all HTML tags except links


Question

I am trying to write a regular expression to strip all HTML except links (the <a href and </a> tags, respectively). It does not have to be 100% secure: I am not worried about injection attacks, because I am parsing content that has already been approved and published into a SWF movie.

The original "strip tags" regular expression I was using is <(.|\n)+?>, and I tried to modify it to <([^a]|\n)+?>, but that of course spares any tag that contains an a anywhere, rather than only tags that begin with an a followed by a space.
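To make the failure concrete, here is a minimal sketch of both patterns in Python (not ActionScript, though the regex syntax used here is the same). It assumes the alternation in each pattern is `(.|\n)`, and the sample markup is hypothetical.

```python
import re

# Patterns from the question, assuming the alternation is (.|\n).
STRIP_ALL = re.compile(r"<(.|\n)+?>")
ATTEMPT = re.compile(r"<([^a]|\n)+?>")

# Hypothetical sample markup, not taken from the original post.
sample = '<p>Hi <span>there</span> <a href="x">link</a></p>'

stripped_all = STRIP_ALL.sub("", sample)
stripped_attempt = ATTEMPT.sub("", sample)

print(stripped_all)      # every tag is removed, including the link
print(stripped_attempt)  # <span> survives because its name contains an "a"
```

The second pattern fails at any tag that has an a before its closing >, so <span> is kept alongside the link.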

Not that it should really matter, but in case anyone cares to know, I am writing this in ActionScript 3.0 for a Flash movie.

Recommended answer

<(?!\/?a(?=>|\s.*>))\/?.*?>

Try this. I had something similar for p tags, and it worked for them, so I don't see why it wouldn't work here. It uses a negative lookahead to check that the tag is not an a (optionally prefixed with a / character) where, via a nested positive lookahead, that a is followed either by > or by whitespace, more characters and then >. If the check passes, the pattern matches up to the next > character. Put this in a substitution with:

s/<(?!\/?a(?=>|\s.*>))\/?.*?>//g;

This should leave only the opening and closing a tags.
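As a quick check of the answer's pattern, here is a small Python sketch (again not ActionScript). It assumes the whitespace class `\s` inside the lookahead, as the explanation's "a space" suggests, and drops the `\/` escapes because Python patterns have no `/` delimiters. The function name and sample markup are hypothetical.

```python
import re

# The answer's pattern, assuming \s (whitespace) inside the nested lookahead.
NON_LINK_TAG = re.compile(r"<(?!/?a(?=>|\s.*>))/?.*?>")


def strip_except_links(html: str) -> str:
    """Remove every tag except opening and closing <a> tags."""
    return NON_LINK_TAG.sub("", html)


# Hypothetical sample markup; <abbr> checks that names merely starting with "a" are stripped.
sample = '<p>Hi <span>there</span> <a href="x">link</a> <abbr>etc</abbr></p>'
result = strip_except_links(sample)
print(result)
```

Because `.` does not match a newline by default, a tag that spans several lines would not be matched by this sketch as written.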
