Regex to allow only set of HTML Tags and Attributes



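One more example, to make it clearer:-

Input

  1. <abc id="test">New tag and known attribute</abc>
  2. Known tag, attributes and one unknown attribute

Output

  1. New tag and known attribute
  2. <a id="test" href="http://www.google.com/">Known tag, attributes and one unknown attribute</a>

Thanks for your help.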

Finally, I did it in two steps:-

// List of allowed HTML tags
<(?!/?(p|body|b|u|em|strong|ul|ol|li|h1|h2|h3|h4|h5|h6|hr|a|br|img|tr|td|table|tbody|label|div|sup|sub|caption)(>|\s))[^<]+?>

// List of allowed HTML attributes
\s(?!(alt|href|tcmuri|title|height|width|align|valign|rowspan|colspan|src|summary|class|id|name|title|target|nowrap|scope|axis|cellpadding|cellspacing|dir|lang|rel))\w+(\s*=\s*["|']?[/.,#?\w\s:;-]+["|']?)

Using the two regexes above, I filtered my entire HTML.
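The two-step approach can be sketched in Python roughly as follows. This is a minimal sketch, assuming Python's `re` module accepts both patterns unchanged; the function name `filter_html` and the `new="true"` attribute in the usage lines are illustrative assumptions, not part of the original post.

```python
import re

TAGS = ("p|body|b|u|em|strong|ul|ol|li|h1|h2|h3|h4|h5|h6|hr|a|br|img|tr|td|"
        "table|tbody|label|div|sup|sub|caption")
ATTRS = ("alt|href|tcmuri|title|height|width|align|valign|rowspan|colspan|src|"
         "summary|class|id|name|title|target|nowrap|scope|axis|cellpadding|"
         "cellspacing|dir|lang|rel")

# Step 1: matches any opening or closing tag whose name is not in TAGS.
DISALLOWED_TAG = re.compile(r"<(?!/?(" + TAGS + r")(>|\s))[^<]+?>")

# Step 2: matches any name=value pair whose name does not start with an
# entry of ATTRS (there is no word boundary in this version of the pattern).
DISALLOWED_ATTR = re.compile(
    r"\s(?!(" + ATTRS + r"""))\w+(\s*=\s*["|']?[/.,#?\w\s:;-]+["|']?)"""
)


def filter_html(html: str) -> str:
    """Remove disallowed tags first, then disallowed attributes."""
    return DISALLOWED_ATTR.sub("", DISALLOWED_TAG.sub("", html))


print(filter_html('<abc id="test">New tag</abc>'))
# New tag
print(filter_html('<a id="test" href="http://www.google.com/" new="true">Link</a>'))
# <a id="test" href="http://www.google.com/">Link</a>
```

Because the attribute pattern runs over the whole string, plain text that happens to look like ` word=value` would also be stripped, so this is only suitable for input where that cannot occur.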

Now I have simplified it into a single regex, which filters the HTML down to the required tags and attributes:

(<(?!/?(p|body|b|u|em|strong|ul|ol|li|h1|h2|h3|h4|h5|h6|hr|a|br|img|tr|td|table|tbody|label|div|sup|sub|caption)(>|\s))[^<]+?>)|(\s(?!(alt|href|tcmuri|title|height|width|align|valign|rowspan|colspan|src|summary|class|id|name|title|target|nowrap|scope|axis|cellpadding|cellspacing|dir|lang|rel)\b)[\w:]+(\s*=\s*["|']?[/.,#?\w\s:;-]+["|']?))
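As a rough illustration, the single-regex version could be applied in one pass with Python's `re.sub`. This sketch assumes the intended grouping keeps the whole attribute list inside one lookahead, `(?!(alt|...|rel)\b)`; the function name `strip_disallowed` and the shortened sample markup are made up for the example.

```python
import re

TAGS = ("p|body|b|u|em|strong|ul|ol|li|h1|h2|h3|h4|h5|h6|hr|a|br|img|tr|td|"
        "table|tbody|label|div|sup|sub|caption")
ATTRS = ("alt|href|tcmuri|title|height|width|align|valign|rowspan|colspan|src|"
         "summary|class|id|name|title|target|nowrap|scope|axis|cellpadding|"
         "cellspacing|dir|lang|rel")

# First alternative: a whole tag whose name is not allowed.
# Second alternative: a name=value pair whose name is not allowed.
COMBINED = re.compile(
    r"(<(?!/?(" + TAGS + r")(>|\s))[^<]+?>)"
    r"|(\s(?!(" + ATTRS + r""")\b)[\w:]+(\s*=\s*["|']?[/.,#?\w\s:;-]+["|']?))"""
)


def strip_disallowed(html: str) -> str:
    """Delete every match of the combined pattern, leaving the text content."""
    return COMBINED.sub("", html)


sample = ('<h2 class="callout" cufid="2"><cufon style="width: 88px; height: 18px" '
          'class="cufon cufon-vml" alt="Lorem"><cufontext>Lorem </cufontext>'
          '</cufon></h2>')
print(strip_disallowed(sample))
# <h2 class="callout">Lorem </h2>
```

Note that `class="callout"` survives here because `class` is on the allowed list, whereas the expected output in the question shows a bare `<h2>`.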

How can I allow only a specific set of HTML tags and a specific set of attributes using a general regex?

Allowed HTML Tags:

p|body|b|u|em|strong|ul|ol|li|h1|h2|h3|h4|h5|h6|hr|a|br|img|tr|td|table|tbody|label|div|sup|sub|caption

Allowed HTML Attributes:

alt|href|tcmuri|title|height|width|align|valign|rowspan|colspan|src|summary|class|id|name|title|target|nowrap|scope|axis|cellpadding|cellspacing|dir|lang|rel

To test this regular expression, I am using the RegExr site.

The regex below targets attributes:

((alt|href|tcmuri|title|height|width|align|valign|rowspan|colspan|src|summary|class|id|name|title|target|nowrap|scope|axis|cellpadding|cellspacing|dir|lang|rel)\s*=\s*["|']?[/.?=&#\w\s:;-]+["|']?)
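To see what this attribute pattern actually picks up, it can be run with `re.finditer` in Python. This is only a sketch under the assumption that the pattern behaves the same in Python's `re` as on RegExr; the sample tags, including the `new` and `data-id` attributes, are invented for illustration.

```python
import re

ATTRS = ("alt|href|tcmuri|title|height|width|align|valign|rowspan|colspan|src|"
         "summary|class|id|name|title|target|nowrap|scope|axis|cellpadding|"
         "cellspacing|dir|lang|rel")

# Group 1 is the whole name=value pair, group 2 is the attribute name.
ALLOWED_ATTR = re.compile(
    r"((" + ATTRS + r""")\s*=\s*["|']?[/.?=&#\w\s:;-]+["|']?)"""
)

tag = '<a id="test" href="http://www.google.com/" new="true">'
for m in ALLOWED_ATTR.finditer(tag):
    print(m.group(2), "->", m.group(1))
# id -> id="test"
# href -> href="http://www.google.com/"

# No word boundary precedes the name, so a longer name can match by its tail.
print(ALLOWED_ATTR.search('<p data-id="5">').group(1))
# id="5"
```

The second call shows one weakness of the pattern as written: it finds `id="5"` inside `data-id="5"`, so an anchor such as `\b` or a leading `\s` before the name may be needed.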

The regex below targets HTML tags:

<(?>/?)(?:[^p|body|b|u|em|strong|ul|ol|li|h1|h2|h3|h4|h5|h6|hr|a|br|img|tr|td|table|tbody|label|div|sup|sub|caption|P]|[p|cufontext|cufoncanvas|P][^\s>/])[^>]*>
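One likely reason this tag pattern misbehaves is that `[^p|body|...]` is a negated character class, not a list of names: it matches a single character that is none of the listed letters (or `|`). The short Python sketch below contrasts it with a negative lookahead over whole names; both patterns are cut down to two tag names and are illustrative only.

```python
import re

# Inside [...], the pipe is a literal and every letter is a separate member,
# so this matches "<" followed by ONE character that is not p, |, b, o, d or y.
char_class = re.compile(r"<[^p|body]")

# A negative lookahead rejects whole tag names instead.
lookahead = re.compile(r"<(?!/?(?:p|body)(?:>|\s))[^<>]+>")

print(bool(char_class.match("<div>")))   # False: "d" is in the set, so <div> slips through
print(bool(char_class.match("<span>")))  # True
print(bool(lookahead.match("<div>")))    # True: div is not an allowed name
print(bool(lookahead.match("<p>")))      # False: p is allowed
print(bool(lookahead.match("<pre>")))    # True: pre only starts with p
```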

I tried to merge the two like this, but it is not filtering properly:-

<(?>/?)(?:[^p|body|b|u|em|strong|ul|ol|li|h1|h2|h3|h4|h5|h6|hr|a|br|img|tr|td|table|tbody|label|div|sup|sub|caption|P]|[p|cufontext|cufoncanvas|P]|((alt|href|tcmuri|title|height|width|align|valign|rowspan|colspan|src|summary|class|id|name|title|target|nowrap|scope|axis|cellpadding|cellspacing|dir|lang|rel)\s*=\s*["|']?[/.?=&#\w\s:;-]+["|']?)[^\s>/])[^>]*>

My intention is to allow only this set of attributes and HTML tags.

All other tags and attributes should be removed, while their content should be kept.
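For comparison, the same requirement can also be expressed without a regex. The sketch below swaps in Python's standard `html.parser.HTMLParser` as a different technique from the one asked about; the class name `WhitelistFilter` and the helper `whitelist` are hypothetical, and this is a rough illustration rather than a hardened sanitizer.

```python
from html import escape
from html.parser import HTMLParser

ALLOWED_TAGS = set(
    "p body b u em strong ul ol li h1 h2 h3 h4 h5 h6 hr a br img tr td "
    "table tbody label div sup sub caption".split()
)
ALLOWED_ATTRS = set(
    "alt href tcmuri title height width align valign rowspan colspan src "
    "summary class id name target nowrap scope axis cellpadding cellspacing "
    "dir lang rel".split()
)


class WhitelistFilter(HTMLParser):
    """Re-emit allowed tags and attributes; keep the text of everything else."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED_TAGS:
            return  # drop the tag itself; its text still reaches handle_data
        kept = "".join(
            f' {name}="{escape(value or "", quote=True)}"'
            for name, value in attrs
            if name in ALLOWED_ATTRS
        )
        self.out.append(f"<{tag}{kept}>")

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <br/>; emit only the opening tag.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")  # pass named entities through unchanged

    def handle_charref(self, name):
        self.out.append(f"&#{name};")  # pass numeric references through unchanged


def whitelist(html: str) -> str:
    parser = WhitelistFilter()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


print(whitelist('<abc id="test">New tag</abc>'))
# New tag
```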

EXAMPLE:

INPUT HTML:

<h2 class="callout" cufid="2"><cufon style="width: 88px; height: 18px" class="cufon cufon-vml" alt="Lorem  "><cufoncanvas style="height: 29px; top: -5px; left: -2px"><cvml:shape style="width: 107px; height: 29px" path=" m39,-257 l75,-257,75,0,39,0,39,-257 x e m-41,-394 l2097,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-41,-394" coordsize="2138,577"></cvml:shape><cvml:shape style="width: 107px; height: 29px" path=" m61,-157 c67,-174,93,-193,115,-192,142,-192,167,-170,166,-142 l166,0,131,0,131,-137 c134,-180,68,-170,61,-142 l61,0,27,0,26,-189,61,-189,61,-157 x e m-144,-394 l1994,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-144,-394" coordsize="2138,577"></cvml:shape><cvml:shape style="width: 107px; height: 29px" path=" m68,-172 l68,0,33,0,33,-172,3,-172 c32,-188,54,-208,68,-232 l68,-189,108,-189,108,-168 c100,-173,82,-172,68,-172 x e m-326,-394 l1812,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-326,-394" coordsize="2138,577"></cvml:shape><cvml:shape style="width: 107px; height: 29px" path=" m63,-154 c69,-182,103,-204,127,-183 l127,-152 c107,-185,64,-157,63,-128 l63,0,29,0,29,-189,63,-189,63,-154 x e m-427,-394 l1711,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-427,-394" coordsize="2138,577"></cvml:shape><cvml:shape style="width: 107px; height: 29px" path=" m11,-94 c11,-144,47,-192,94,-192,146,-192,177,-149,177,-94,177,-44,141,2,94,2,41,1,11,-39,11,-94 x m93,-178 c29,-172,34,-21,93,-14,155,-20,157,-172,93,-178 x e m-548,-394 l1590,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-548,-394" coordsize="2138,577"></cvml:shape><cvml:shape style="width: 107px; height: 29px" path=" m15,-86 c15,-142,46,-192,95,-192,114,-192,127,-185,133,-174 l133,-257,168,-257,168,-34 c154,-10,130,2,95,2,52,2,15,-42,15,-86 x m134,-153 c128,-167,117,-177,98,-178,68,-178,54,-147,54,-86,54,-24,94,2,133,-24 x e m-728,-394 l1410,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-728,-394" coordsize="2138,577"></cvml:shape><cvml:shape 
style="width: 107px; height: 29px" path=" m62,-51 c61,-9,125,-16,132,-47 l132,-189,166,-189,166,0,132,0,132,-28 c125,-10,106,1,81,2,-2,5,36,-116,28,-189 l62,-189,62,-51 x e m-909,-394 l1229,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-909,-394" coordsize="2138,577"></cvml:shape><cvml:shape style="width: 107px; height: 29px" path=" m150,-6 c86,20,16,-20,16,-89,16,-163,78,-215,150,-182 l150,-158 c112,-211,48,-154,55,-94,49,-36,110,12,149,-31 x e m-1093,-394 l1045,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-1093,-394" coordsize="2138,577"></cvml:shape><cvml:shape style="width: 107px; height: 29px" path=" m67,0 l32,0,32,-190,67,-190,67,0 x m69,-241 c69,-229,60,-221,49,-221,38,-221,29,-230,29,-241,29,-252,38,-261,49,-261,60,-261,69,-253,69,-241 x e m-1248,-394 l890,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-1248,-394" coordsize="2138,577"></cvml:shape><cvml:shape style="width: 107px; height: 29px" path=" m61,-157 c67,-174,93,-193,115,-192,142,-192,167,-170,166,-142 l166,0,131,0,131,-137 c134,-180,68,-170,61,-142 l61,0,27,0,26,-189,61,-189,61,-157 x e m-1336,-394 l802,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-1336,-394" coordsize="2138,577"></cvml:shape><cvml:shape style="width: 107px; height: 29px" path=" m15,-88 c6,-164,80,-221,134,-175 l134,-189,168,-189 c159,-82,207,86,87,80,64,79,45,75,31,66 l31,39 c68,87,150,59,134,-18,94,37,6,-25,15,-88 x m96,-178 c35,-178,36,-9,106,-14,119,-15,128,-21,133,-31 l133,-156 c121,-171,108,-178,96,-178 x e m-1518,-394 l620,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-1518,-394" coordsize="2138,577"></cvml:shape><cvml:shape style="width: 107px; height: 29px" path=" m-1693,-394 l445,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-1693,-394" coordsize="2138,577"></cvml:shape></cufoncanvas><cufontext>Lorem </cufontext></cufon><cufon style="width: 45px; height: 18px" class="cufon cufon-vml" alt="Lorem "><cufoncanvas style="height: 29px; top: -5px; left: -2px"><cvml:shape 
style="width: 65px; height: 29px" path=" m60,-58 c71,-14,125,3,152,-36 l151,-13 c94,27,15,-19,15,-91,15,-143,44,-192,94,-192,124,-192,152,-174,152,-147,152,-99,82,-86,60,-58 x m120,-149 c121,-167,109,-179,94,-179,62,-179,47,-115,55,-75,78,-95,120,-108,120,-149 x e m-41,-394 l1256,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-41,-394" coordsize="1297,577"></cvml:shape><cvml:shape style="width: 65px; height: 29px" path=" m44,-175 c74,-204,147,-194,147,-143 l147,0,112,0,112,-23 c96,12,7,13,14,-45,18,-86,44,-97,87,-114,126,-130,124,-173,84,-175,66,-175,53,-166,44,-149 l44,-175 x m112,-116 c94,-97,41,-84,47,-43,52,-8,96,-9,112,-31 l112,-116 x e m-201,-394 l1096,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-201,-394" coordsize="1297,577"></cvml:shape><cvml:shape style="width: 65px; height: 29px" path=" m13,-36 c28,-4,92,-11,88,-53,84,-91,9,-100,12,-147,15,-193,77,-204,110,-176 l110,-152 c100,-176,50,-189,45,-156,50,-111,121,-110,121,-59,121,-6,50,20,13,-12 l13,-36 x e m-360,-394 l937,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-360,-394" coordsize="1297,577"></cvml:shape><cvml:shape style="width: 65px; height: 29px" path=" m67,0 l32,0,32,-190,67,-190,67,0 x m69,-241 c69,-229,60,-221,49,-221,38,-221,29,-230,29,-241,29,-252,38,-261,49,-261,60,-261,69,-253,69,-241 x e m-483,-394 l814,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-483,-394" coordsize="1297,577"></cvml:shape><cvml:shape style="width: 65px; height: 29px" path=" m60,-58 c71,-14,125,3,152,-36 l151,-13 c94,27,15,-19,15,-91,15,-143,44,-192,94,-192,124,-192,152,-174,152,-147,152,-99,82,-86,60,-58 x m120,-149 c121,-167,109,-179,94,-179,62,-179,47,-115,55,-75,78,-95,120,-108,120,-149 x e m-571,-394 l726,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-571,-394" coordsize="1297,577"></cvml:shape><cvml:shape style="width: 65px; height: 29px" path=" m63,-154 c69,-182,103,-204,127,-183 l127,-152 c107,-185,64,-157,63,-128 l63,0,29,0,29,-189,63,-189,63,-154 x e m-731,-394 
l566,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-731,-394" coordsize="1297,577"></cvml:shape><cvml:shape style="width: 65px; height: 29px" path=" m-852,-394 l445,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-852,-394" coordsize="1297,577"></cvml:shape></cufoncanvas><cufontext>Lorem </cufontext></cufon><cufon style="width: 45px; height: 18px" class="cufon cufon-vml" alt="Lorem "><cufoncanvas style="height: 29px; top: -5px; left: -2px"><cvml:shape style="width: 65px; height: 29px" path=" m85,-32 c85,-74,123,-134,113,-189 l148,-189,187,-33 c191,-84,214,-142,226,-189 l254,-189 c238,-128,211,-64,202,0 l160,0,131,-118 c121,-77,104,-37,100,0 l59,0,7,-189,42,-189 x e m-41,-394 l1251,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-41,-394" coordsize="1292,577"></cvml:shape><cvml:shape style="width: 65px; height: 29px" path=" m11,-94 c11,-144,47,-192,94,-192,146,-192,177,-149,177,-94,177,-44,141,2,94,2,41,1,11,-39,11,-94 x m93,-178 c29,-172,34,-21,93,-14,155,-20,157,-172,93,-178 x e m-292,-394 l1000,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-292,-394" coordsize="1292,577"></cvml:shape><cvml:shape style="width: 65px; height: 29px" path=" m63,-154 c69,-182,103,-204,127,-183 l127,-152 c107,-185,64,-157,63,-128 l63,0,29,0,29,-189,63,-189,63,-154 x e m-472,-394 l820,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-472,-394" coordsize="1292,577"></cvml:shape><cvml:shape style="width: 65px; height: 29px" path=" m60,0 l27,0,27,-257,60,-257,60,0 x e m-593,-394 l699,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-593,-394" coordsize="1292,577"></cvml:shape><cvml:shape style="width: 65px; height: 29px" path=" m15,-86 c15,-142,46,-192,95,-192,114,-192,127,-185,133,-174 l133,-257,168,-257,168,-34 c154,-10,130,2,95,2,52,2,15,-42,15,-86 x m134,-153 c128,-167,117,-177,98,-178,68,-178,54,-147,54,-86,54,-24,94,2,133,-24 x e m-666,-394 l626,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-666,-394" 
coordsize="1292,577"></cvml:shape><cvml:shape style="width: 65px; height: 29px" path=" m-847,-394 l445,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-847,-394" coordsize="1292,577"></cvml:shape></cufoncanvas><cufontext>Lorem </cufontext></cufon><cufon style="width: 44px; height: 18px" class="cufon cufon-vml" alt="Lorem "><cufoncanvas style="height: 29px; top: -5px; left: -2px"><cvml:shape style="width: 64px; height: 29px" path=" m68,-172 l68,0,33,0,33,-172,3,-172 c32,-188,54,-208,68,-232 l68,-189,108,-189,108,-168 c100,-173,82,-172,68,-172 x e m-41,-394 l1226,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-41,-394" coordsize="1267,577"></cvml:shape><cvml:shape style="width: 64px; height: 29px" path=" m63,-154 c69,-182,103,-204,127,-183 l127,-152 c107,-185,64,-157,63,-128 l63,0,29,0,29,-189,63,-189,63,-154 x e m-142,-394 l1125,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-142,-394" coordsize="1267,577"></cvml:shape><cvml:shape style="width: 64px; height: 29px" path=" m44,-175 c74,-204,147,-194,147,-143 l147,0,112,0,112,-23 c96,12,7,13,14,-45,18,-86,44,-97,87,-114,126,-130,124,-173,84,-175,66,-175,53,-166,44,-149 l44,-175 x m112,-116 c94,-97,41,-84,47,-43,52,-8,96,-9,112,-31 l112,-116 x e m-263,-394 l1004,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-263,-394" coordsize="1267,577"></cvml:shape><cvml:shape style="width: 64px; height: 29px" path=" m95,-27 c108,-97,122,-119,145,-189 l174,-189 c155,-128,125,-66,113,0 l70,0,7,-189,41,-189 x e m-422,-394 l845,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-422,-394" coordsize="1267,577"></cvml:shape><cvml:shape style="width: 64px; height: 29px" path=" m60,-58 c71,-14,125,3,152,-36 l151,-13 c94,27,15,-19,15,-91,15,-143,44,-192,94,-192,124,-192,152,-174,152,-147,152,-99,82,-86,60,-58 x m120,-149 c121,-167,109,-179,94,-179,62,-179,47,-115,55,-75,78,-95,120,-108,120,-149 x e m-589,-394 l678,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-589,-394" 
coordsize="1267,577"></cvml:shape><cvml:shape style="width: 64px; height: 29px" path=" m60,0 l27,0,27,-257,60,-257,60,0 x e m-749,-394 l518,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-749,-394" coordsize="1267,577"></cvml:shape><cvml:shape style="width: 64px; height: 29px" path=" m-822,-394 l445,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-822,-394" coordsize="1267,577"></cvml:shape></cufoncanvas><cufontext>Lorem </cufontext></cufon><cufon style="width: 36px; height: 18px" class="cufon cufon-vml" alt="Lorem "><cufoncanvas style="height: 29px; top: -5px; left: -2px"><cvml:shape style="width: 55px; height: 29px" path=" m85,-32 c85,-74,123,-134,113,-189 l148,-189,187,-33 c191,-84,214,-142,226,-189 l254,-189 c238,-128,211,-64,202,0 l160,0,131,-118 c121,-77,104,-37,100,0 l59,0,7,-189,42,-189 x e m-41,-394 l1063,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-41,-394" coordsize="1104,577"></cvml:shape><cvml:shape style="width: 55px; height: 29px" path=" m67,0 l32,0,32,-190,67,-190,67,0 x m69,-241 c69,-229,60,-221,49,-221,38,-221,29,-230,29,-241,29,-252,38,-261,49,-261,60,-261,69,-253,69,-241 x e m-292,-394 l812,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-292,-394" coordsize="1104,577"></cvml:shape><cvml:shape style="width: 55px; height: 29px" path=" m68,-172 l68,0,33,0,33,-172,3,-172 c32,-188,54,-208,68,-232 l68,-189,108,-189,108,-168 c100,-173,82,-172,68,-172 x e m-376,-394 l728,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-376,-394" coordsize="1104,577"></cvml:shape><cvml:shape style="width: 55px; height: 29px" path=" m61,-160 c90,-207,171,-202,171,-141 l171,2,136,2,136,-140 c133,-179,79,-176,61,-145 l61,2,26,2,26,-257,61,-257,61,-160 x e m-477,-394 l627,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-477,-394" coordsize="1104,577"></cvml:shape><cvml:shape style="width: 55px; height: 29px" path=" m-659,-394 l445,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-659,-394" 
coordsize="1104,577"></cvml:shape></cufoncanvas><cufontext>Lorem </cufontext></cufon><cufon style="width: 60px; height: 18px" class="cufon cufon-vml" alt="Lorem ipsum"><cufoncanvas style="height: 29px; top: -5px; left: -2px"><cvml:shape style="width: 78px; height: 29px" path=" m185,-35 c185,-12,172,0,146,0 l27,0,27,-257 c84,-255,173,-268,179,-221,169,-240,103,-239,63,-238 l63,-163 c102,-164,138,-164,148,-136,141,-144,90,-143,63,-143 l63,-20 c103,-22,170,-12,185,-35 x e m-41,-394 l1514,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-41,-394" coordsize="1555,577"></cvml:shape><cvml:shape style="width: 78px; height: 29px" path=" m160,-158 c176,-210,259,-200,259,-141 l259,0,225,0,225,-138 c224,-175,174,-179,161,-142 l161,0,126,0,126,-139 c127,-180,62,-176,62,-141 l62,0,27,0,27,-189,62,-189,62,-158 c73,-198,152,-205,160,-158 x e m-217,-394 l1338,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-217,-394" coordsize="1555,577"></cvml:shape><cvml:shape style="width: 78px; height: 29px" path=" m67,0 l32,0,32,-190,67,-190,67,0 x m69,-241 c69,-229,60,-221,49,-221,38,-221,29,-230,29,-241,29,-252,38,-261,49,-261,60,-261,69,-253,69,-241 x e m-492,-394 l1063,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-492,-394" coordsize="1555,577"></cvml:shape><cvml:shape style="width: 78px; height: 29px" path=" m63,-154 c69,-182,103,-204,127,-183 l127,-152 c107,-185,64,-157,63,-128 l63,0,29,0,29,-189,63,-189,63,-154 x e m-569,-394 l986,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-569,-394" coordsize="1555,577"></cvml:shape><cvml:shape style="width: 78px; height: 29px" path=" m44,-175 c74,-204,147,-194,147,-143 l147,0,112,0,112,-23 c96,12,7,13,14,-45,18,-86,44,-97,87,-114,126,-130,124,-173,84,-175,66,-175,53,-166,44,-149 l44,-175 x m112,-116 c94,-97,41,-84,47,-43,52,-8,96,-9,112,-31 l112,-116 x e m-690,-394 l865,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-690,-394" coordsize="1555,577"></cvml:shape><cvml:shape style="width: 78px; height: 29px" 
path=" m68,-172 l68,0,33,0,33,-172,3,-172 c32,-188,54,-208,68,-232 l68,-189,108,-189,108,-168 c100,-173,82,-172,68,-172 x e m-849,-394 l706,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-849,-394" coordsize="1555,577"></cvml:shape><cvml:shape style="width: 78px; height: 29px" path=" m60,-58 c71,-14,125,3,152,-36 l151,-13 c94,27,15,-19,15,-91,15,-143,44,-192,94,-192,124,-192,152,-174,152,-147,152,-99,82,-86,60,-58 x m120,-149 c121,-167,109,-179,94,-179,62,-179,47,-115,55,-75,78,-95,120,-108,120,-149 x e m-950,-394 l605,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-950,-394" coordsize="1555,577"></cvml:shape><cvml:shape style="width: 78px; height: 29px" path=" m13,-36 c28,-4,92,-11,88,-53,84,-91,9,-100,12,-147,15,-193,77,-204,110,-176 l110,-152 c100,-176,50,-189,45,-156,50,-111,121,-110,121,-59,121,-6,50,20,13,-12 l13,-36 x e m-1110,-394 l445,183 ns e" stroked="f" fillcolor="#c0bbaf" coordorigin="-1110,-394" coordsize="1555,577"></cvml:shape></cufoncanvas><cufontext>Lorem ipsum</cufontext><cvml:shape coordsize="1000,1000"></cvml:shape></cufon></h2>

<div class="contentContainer">
<p>Lorem ipsum dolor sit amet, Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet</p>
<p>Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet</p>
</div>

Expected filtered output:

<h2>Lorem Lorem Lorem Lorem Lorem Lorem ipsum</h2>

<div>
<p>Lorem ipsum dolor sit amet, Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet</p>
<p>Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet</p>
</div>

One more example, to make this clearer:

Input

  1. <abc id="test">new tag and known attribute</abc>
  2. <a id="test" href="http://www.google.com/" xyz="testattr">known tag, attribute and one unknown attr</a>

Output

  1. new tag and known attribute
  2. <a id="test" href="http://www.google.com/">known tag, attribute and one unknown attr</a>

Any help is appreciated.

Solution

I eventually solved this in two steps. Each step uses a regex with a negative lookahead, so it matches whatever is NOT on the allowed list:

// Step 1: matches every tag that is not in the allowed list of HTML tags

<(?!/?(p|body|b|u|em|strong|ul|ol|li|h1|h2|h3|h4|h5|h6|hr|a|br|img|tr|td|table|tbody|label|div|sup|sub|caption)(>|\s))[^<]+?>

// Step 2: matches every attribute that is not in the allowed list of HTML attributes

\s(?!(alt|href|tcmuri|title|height|width|align|valign|rowspan|colspan|src|summary|class|id|name|title|target|nowrap|scope|axis|cellpadding|cellspacing|dir|lang|rel))\w+(\s*=\s*["|']?[/.,#?\w\s:;-]+["|']?)

Using the two regexes above to remove every match, I filtered my whole HTML.
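The two steps can be sketched in Python as follows. This is only an illustration: the answer does not say which language or regex engine was used, and the names `filter_html`, `DISALLOWED_TAG`, and `DISALLOWED_ATTR` are hypothetical. The patterns themselves are copied from the two regexes above, and each match is replaced with an empty string.

```python
import re

# Allow-lists copied from the question.
ALLOWED_TAGS = (
    "p|body|b|u|em|strong|ul|ol|li|h1|h2|h3|h4|h5|h6|hr|a|br|img|tr|td|"
    "table|tbody|label|div|sup|sub|caption"
)
ALLOWED_ATTRS = (
    "alt|href|tcmuri|title|height|width|align|valign|rowspan|colspan|src|"
    "summary|class|id|name|title|target|nowrap|scope|axis|cellpadding|"
    "cellspacing|dir|lang|rel"
)

# Step 1: matches an opening or closing tag whose name is NOT allowed.
DISALLOWED_TAG = re.compile(r"<(?!/?(" + ALLOWED_TAGS + r")(>|\s))[^<]+?>")

# Step 2: matches a name=value attribute whose name does NOT start with
# an allowed name (this version has no \b after the lookahead).
DISALLOWED_ATTR = re.compile(
    r"\s(?!(" + ALLOWED_ATTRS + r"))\w+"
    r"""(\s*=\s*["|']?[/.,#?\w\s:;-]+["|']?)"""
)


def filter_html(html: str) -> str:
    """Remove disallowed tags first, then disallowed attributes."""
    without_tags = DISALLOWED_TAG.sub("", html)
    return DISALLOWED_ATTR.sub("", without_tags)


print(filter_html('<abc id="test">new tag and known attribute</abc>'))
print(filter_html(
    '<a id="test" href="http://www.google.com/" xyz="testattr">'
    'known tag, attribute and one unknown attr</a>'
))
```

Two caveats apply to this sketch. Because the step 2 lookahead has no word boundary, an attribute such as `idx="1"` is kept, since its name starts with the allowed name `id`. The attribute pattern is also applied to the text between tags, so plain text of the form ` word=value` would be removed as well.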

EDIT:

I have now reduced this to a single regex that matches every disallowed HTML tag and attribute, so one pass filters both:

(<(?!/?(p|body|b|u|em|strong|ul|ol|li|h1|h2|h3|h4|h5|h6|hr|a|br|img|tr|td|table|tbody|label|div|sup|sub|caption)(>|\s))[^<]+?>)|(\s(?!(alt|href|tcmuri|title|height|width|align|valign|rowspan|colspan|src|summary|class|id|name|title|target|nowrap|scope|axis|cellpadding|cellspacing|dir|lang|rel)\b)[\w:]+(\s*=\s*["|']?[/.,#?\w\s:;-]+["|']?))
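A minimal Python sketch of the single-pattern version follows, assuming each match is replaced with an empty string. The name `strip_disallowed` and the shortened cufon sample are made up for illustration; the pattern is the combined regex above, split across several string literals only for readability.

```python
import re

# Allow-lists copied from the question.
ALLOWED_TAGS = (
    "p|body|b|u|em|strong|ul|ol|li|h1|h2|h3|h4|h5|h6|hr|a|br|img|tr|td|"
    "table|tbody|label|div|sup|sub|caption"
)
ALLOWED_ATTRS = (
    "alt|href|tcmuri|title|height|width|align|valign|rowspan|colspan|src|"
    "summary|class|id|name|title|target|nowrap|scope|axis|cellpadding|"
    "cellspacing|dir|lang|rel"
)

# First alternative: a whole tag whose name is not allowed.
# Second alternative: a name=value attribute whose name is not allowed.
DISALLOWED = re.compile(
    r"(<(?!/?(" + ALLOWED_TAGS + r")(>|\s))[^<]+?>)"
    r"|(\s(?!(" + ALLOWED_ATTRS + r")\b)[\w:]+"
    r"""(\s*=\s*["|']?[/.,#?\w\s:;-]+["|']?))"""
)


def strip_disallowed(html: str) -> str:
    """Delete every disallowed tag or attribute in a single pass."""
    return DISALLOWED.sub("", html)


# Shortened, hypothetical variant of the cufon markup from the question.
sample = (
    '<h2 class="callout" cufid="2"><cufon class="cufon cufon-vml" '
    'alt="Lorem"><cufontext>Lorem </cufontext></cufon></h2>'
)
print(strip_disallowed(sample))
```

Since `class` is on the allow-list, this pattern keeps `class="callout"` on the `h2` and removes only `cufid="2"` and the `cufon` and `cufontext` tags. Regex-based filtering like this works on well-behaved markup such as the samples here; for untrusted input, a real HTML parser or sanitizer is the safer choice.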
