Remove all HTML tags from an HTML body except <a>, <br>, <b> and <img>
When reading the HTML body of an email, I often find lots of HTML tags that I no longer want.
How can I remove, in JavaScript, all HTML tags from a string, such as:
<anything ...>
or
</anything>
except these few cases: <x ...>, </x>, and <x ... />, for x being:
- a
- br
- b
- img
I thought about something like:
s.replace(/<[^a].*>/g, '');
but I'm not sure how to do it.
Example:
<div id="hello">Hello</div><a href="test">Youhou</a>
should become
Hello<a href="test">Youhou</a>
Note: I'm looking for a few-lines-of-code solution that works about 90% of the time (the email bodies come from my own emails, so they contain nothing malicious), not a full solution that requires a third-party tool or library.
Try replacing
<\/?(?!(a|br|b|img)\b)\w+[^>]*>
with nothing.
<\/? matches the start <, optionally followed by a /.
(?!(a|br|b|img)\b) is a negative look-ahead ensuring we don't match a, br, b or img tags.
\w+[^>]*> matches the rest of the tag.
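As a minimal sketch of how this pattern might be applied, the snippet below wraps it in a small function. The name stripTagsExcept is hypothetical, and the i flag is an addition not present in the pattern above (without it, an uppercase <A> would not be recognized as an allowed tag and would be removed).

```javascript
// Hypothetical helper; the name is illustrative, not part of the answer.
function stripTagsExcept(html) {
  // g: replace every match, not just the first.
  // i: assumption added here so uppercase tags such as <A> or <BR> are also kept.
  const pattern = /<\/?(?!(a|br|b|img)\b)\w+[^>]*>/gi;
  // Every tag whose name is not exactly a, br, b or img is replaced with ''.
  return html.replace(pattern, '');
}

const input = '<div id="hello">Hello</div><a href="test">Youhou</a>';
console.log(stripTagsExcept(input));
// prints: Hello<a href="test">Youhou</a>
```

The \b in the look-ahead is what keeps <a> while still removing tags that merely start with an allowed name, such as <abbr> or <body>. Because [^>]* stops at the first >, a tag with a literal > inside an attribute value would be cut short, which is consistent with the "works most of the time" goal rather than a full HTML parser.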