How to remove href tag from CDATA
Question
I have the following CDATA inside an XML document:
<![CDATA[ <p xmlns="">Refer to the below: <br/>
</p>
<table xmlns:abc="http://google.com pic.xsd" cellspacing="1" class="c" type="custom" width="100%">
<tbody>
<tr xmlns="">
<th style="text-align: left">Basic offers...</th>
</tr>
<tr xmlns="">
<td style="text-align: left">Faster network</td>
<td style="text-align: left">
<ul>
<li>Session</li>
</ul>
</td>
</tr>
<tr xmlns="">
<td style="text-align: left">capabilities</td>
<td style="text-align: left">
<ul>
<li>Navigation,</li>
<li>message, and</li>
<li>contacts</li>
</ul>
</td>
</tr>
<tr xmlns="">
<td style="text-align: left">Data</td>
<td style="text-align: left">
<p>Here visit google for more info <a href="http://www.google.com" target="_blank"><font color="#0033cc">www.google.com</font></a>.</p>
<p>Remove this href tag <a href="/abc/def/{T}/t/1" target="_blank">Information</a> remove the tag.</p>
</td>
</tr>
</tbody>
</table>
<p xmlns=""><br/>
</p>
]]>
I want to scan for href="/abc/def and remove every a tag whose href starts with /abc/def. In the example above, that means removing the tag and leaving just the text "Information" in its place. The CDATA can contain more than one a tag with an "/abc/def..." href. I am using C# for this application. Can someone tell me how this can be done? Should I use a regex, or is there a way to do it with XML itself?
This is the regex I am trying:
"<a href=\"/abc/def/.*></a>"
I want to keep the inner text of the a tag and remove only the tag itself, but the regex above is not working.
Recommended answer
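// Requires the HtmlAgilityPack NuGet package and "using System.Linq;".
// html is assumed to hold the text content of the CDATA section.
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html);

// Collect every <a> whose href starts with "/abc/def". ToArray() takes a
// snapshot so the tree can be modified safely inside the loop below.
var nodes = doc.DocumentNode
    .Descendants("a")
    .Where(n => n.Attributes.Any(a => a.Name == "href" && a.Value.StartsWith("/abc/def")))
    .ToArray();

foreach (var node in nodes)
{
    // keepGrandChildren = true removes the <a> element but keeps its children (the link text).
    node.ParentNode.RemoveChild(node, true);
}

// The cleaned markup, ready to be written back into the CDATA section.
var newHtml = doc.DocumentNode.InnerHtml;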
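
For completeness, here is a rough sketch of the regex route the question asks about, combined with System.Xml.Linq to reach the CDATA section. The pattern in the question most likely fails because it requires ">" to sit directly before "</a>" (so it cannot match a link that has inner text) and because it has no capture group to keep that text. The names CDataLinkStripper, StripLinks and StripLinksInCData and the <root> wrapper element are made up for illustration; the pattern also assumes double-quoted href values and no nested a tags. Regex on HTML is fragile, so the HtmlAgilityPack approach above is the more robust choice for real input.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml.Linq;

public static class CDataLinkStripper
{
    // Matches <a ... href="/abc/def..." ...>inner</a> and captures the inner content.
    // [^>]* cannot cross a tag boundary, so links with other href values are not matched.
    private static readonly Regex AbcDefLink = new Regex(
        @"<a\b[^>]*\bhref\s*=\s*""/abc/def[^""]*""[^>]*>(.*?)</a>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Replaces each matching link with its inner content.
    public static string StripLinks(string html)
    {
        return AbcDefLink.Replace(html, "$1");
    }

    // Applies StripLinks to every CDATA node in the document, in place.
    public static void StripLinksInCData(XDocument xml)
    {
        foreach (XCData cdata in xml.DescendantNodes().OfType<XCData>().ToList())
        {
            cdata.Value = StripLinks(cdata.Value);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // Hypothetical input: a shortened version of the CDATA from the question.
        var xml = XDocument.Parse(
            "<root><![CDATA[<p>See <a href=\"/abc/def/{T}/t/1\" target=\"_blank\">Information</a> " +
            "and <a href=\"http://www.google.com\">www.google.com</a>.</p>]]></root>");

        CDataLinkStripper.StripLinksInCData(xml);

        // The /abc/def link is reduced to its text; the google link is left untouched.
        Console.WriteLine(xml.Root.Value);
    }
}
```

The same StripLinksInCData loop would also work with the HtmlAgilityPack code in place of the regex: pass cdata.Value in as html and assign newHtml back to cdata.Value.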