Replacing specific HTML tags using Regex
Alright, an easy one for you guys. We are using ActiveReport's RichTextBox to display some random bits of HTML code.
The HTML tags supported by ActiveReport can be found here: http://www.datadynamics.com/Help/ARNET3/ar3conSupportedHtmlTagsInRichText.html
An example of what I want to do is replace any match of <div style="text-align:*</div>
with <p style="text-align:*</p>
in order to use a supported tag for text alignment.
I have found the following regular expression, which finds the correct matches in my HTML input:
<div style=\"text-align:(.*?)</div>
However, I can't find a way to keep the text that was contained in the tags after my replacement. Any clue? Is it just me, or are regexes generally a PITA? :)
private static readonly IDictionary<string, string> _replaceMap =
    new Dictionary<string, string>
    {
        // Key: pattern to search for. Value: replacement string.
        // Problem: the "(.*?)" in the replacement is emitted literally instead of the matched text.
        {"<div style=\"text-align:(.*?)</div>", "<p style=\"text-align:(.*?)</p>"}
    };
public static string FormatHtml(string html)
{
    foreach (var pair in _replaceMap)
    {
        html = Regex.Replace(html, pair.Key, pair.Value);
    }
    return html;
}
Thanks!
Use $1 in the replacement string to insert the text captured by the first group:
{"<div style=\"text-align:(.*?)</div>", "<p style=\"text-align:$1</p>"}
Note that you could simplify this to:
{"<div (style=\"text-align:(?:.*?))</div>", "<p $1</p>"}
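To see the $1 substitution end to end, here is a minimal sketch that wraps the first pattern in a small helper. The class name HtmlTagReplaceDemo is made up for illustration; Regex.Replace and the $1 substitution syntax are standard .NET.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class HtmlTagReplaceDemo
{
    // Group 1 lazily captures everything after "text-align:" up to the first </div>.
    private const string Pattern = "<div style=\"text-align:(.*?)</div>";

    // $1 is replaced by whatever group 1 captured in each match.
    private const string Replacement = "<p style=\"text-align:$1</p>";

    public static string FormatHtml(string html)
    {
        return Regex.Replace(html, Pattern, Replacement);
    }
}
```

Calling HtmlTagReplaceDemo.FormatHtml on <div style="text-align:center">foo</div> should return <p style="text-align:center">foo</p>. One caveat: by default "." does not match a newline, so a div whose content spans several lines would be left untouched unless RegexOptions.Singleline is passed to Regex.Replace.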
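Also, it is generally a better idea to use an HTML parser like HtmlAgilityPack than to try to parse HTML with regular expressions. Here's how you could do it (note that this renames every div element, not only those with a text-align style):
// Requires the HtmlAgilityPack library (using HtmlAgilityPack;)
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(html);
// Rename each <div> element to <p>; attributes and content are kept.
foreach (var e in doc.DocumentNode.Descendants("div"))
    e.Name = "p";
doc.Save(Console.Out);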
Result:
<p style="text-align:center">foo</p><p style="text-align:center">bar</p>