String manipulation in .NET
Question
Hi all,
In a .cs file I have a string variable like this:
// Verbatim string literal: the value may span lines, and each "" stands for one " character.
string body = @" <p>
Maximize the quality, efficiency and value of your applications with enterprise
application modernization, testing and management software from Micro Focus.</p>
<p>
Products A - Z: <a title=""Products A - Z (A)"" href=""/products/productsa-z/index.aspx"">
</a> J K L <a title=""Products A - Z (M)"" href=""/products/productsa-z/productsa-zM.aspx"">
M</a> <a title=""Products A - Z (N)"" href=""/products/productsa-z/productsa-zN.aspx"">
</a> U <a title=""Products A - Z (V)"" href=""/products/productsa-z/productsa-zV.aspx"">
V</a> <a title=""Products A - Z (W)"" href=""/products/productsa-z/productsa-zW.aspx""> W</a> <a title=""Products A - Z (X)"" href=""/products/productsa-z/productsa-zx.aspx"">
X</a>
Y Z</p>
<div style=""border-removed #cecece 1px; border-removed #cecece 1px solid; border-removed #cecece 1px solid; padding-removed 10px"">
<p>
<a href=""/products/isight/index.aspx"">Application Portfolio Management and Analysis</a><br />
<img style=""width: 11px; height: 11px; vertical-align: middle"" title=""arrow_badge_icon.gif""
alt=""arrow_badge_icon.gif"" src=""/assets/arrow_badge_icon.gif"" width=""11"" height=""11"" /><a href=""/products/isight/index.aspx"">i.Sight</a></p>
<p>
Tools applications.</p>
</div>
<div style=""border-removed #cecece 1px solid;padding-removed 10px"">
<p>
<a href=""/products/micro-focus-developer/index.aspx"">COBOL and Software Developer Tools</a><br />
<img style=""width: 11px; height: 11px; vertical-align: middle""
title=""arrow_badge_icon.gif"" alt=""arrow_badge_icon.gif"" src=""/assets/arrow_badge_icon.gif"" width=""11"" height=""11"" /><a href=""/products/micro-focus-developer/index.aspx"">Micro Focus Developer</a></p>
<p>
Industry applications.</p>
</div> <br />
<br />
<p>
</p>";
The string above is an extract; the full string contains many <a> tags. I want to manipulate this string by modifying the anchor tags in it: for each anchor tag I have to read its title and add an onclick="..." event. Each anchor tag has a different title, and that title must be passed to the onclick handler as a dynamic value. Can anyone help me with this? Which approach should I use?
Recommended answer
Please check the following links, which show how to use regular expressions to search for HTML tags:
HTML Meta Tag Parser
Convert HTML to Plain Text
http://haacked.com/archive/2004/10/25/usingregularexpressionstomatchhtml.aspx
http://stackoverflow.com/questions/787932/using-c-regular-expressions-to-remove-html-tags
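As a rough illustration of the regex approach suggested above, here is a minimal sketch that finds each opening <a> tag with a double-quoted title attribute and appends an onclick attribute built from that title. The handler name trackClick and the class name AnchorRegexSketch are hypothetical, not from the question. The pattern assumes attribute values are double-quoted and contain no ">" character, so it will not cope with arbitrary HTML.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchorRegexSketch
{
    // Matches an opening <a ...> tag that has a double-quoted title attribute.
    // "attrs" captures everything between "<a" and ">", "title" the title text.
    private static readonly Regex AnchorWithTitle = new Regex(
        @"<a\b(?<attrs>[^>]*?\stitle\s*=\s*""(?<title>[^""]*)""[^>]*)>",
        RegexOptions.IgnoreCase);

    public static string AddOnClick(string html)
    {
        return AnchorWithTitle.Replace(html, m =>
        {
            string attrs = m.Groups["attrs"].Value;

            // Leave anchors that already carry an onclick attribute untouched.
            if (Regex.IsMatch(attrs, @"\sonclick\s*=", RegexOptions.IgnoreCase))
                return m.Value;

            // Escape backslashes and single quotes for the JavaScript string literal.
            string title = m.Groups["title"].Value
                .Replace("\\", "\\\\")
                .Replace("'", "\\'");

            // trackClick is a hypothetical JavaScript function name.
            return "<a" + attrs + " onclick=\"trackClick('" + title + "')\">";
        });
    }
}
```

Calling AnchorRegexSketch.AddOnClick(body) returns a new string; anchors without a title attribute are left unchanged.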
One possible way to address this problem:
Your string looks like well-formed XML, and it is not too big, so you can parse it as XML. You can use the class System.Xml.XmlDocument to get a DOM structure, or the class System.Xml.Linq.XDocument to get a document tree structure. In both cases you get a structured document with access to your HTML nodes. Modify the nodes as you need and serialize the document back to XML text.
See:
http://msdn.microsoft.com/en-us/library/system.xml.xmldocument.aspx,
http://msdn.microsoft.com/en-us/library/bb387063.aspx,
http://msdn.microsoft.com/en-us/library/system.xml.linq.xdocument.aspx.
If the HTML may not be well-formed XML, you can use an appropriate HTML parser instead. Try this one: http://www.majestic12.co.uk/projects/html_parser.php.
—SA
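The XML approach described above could look roughly like the following sketch, which uses LINQ to XML. It rests on several assumptions: the fragment has more than one top-level element, so it is wrapped in a temporary root element before parsing; the markup really is well-formed XML (an HTML-only entity such as &nbsp; or an unclosed tag would make XElement.Parse throw); and trackClick and AnchorXmlSketch are hypothetical names chosen for illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class AnchorXmlSketch
{
    public static string AddOnClick(string body)
    {
        // Wrap the fragment in a temporary root so that it forms a single
        // XML document; PreserveWhitespace keeps the original line breaks.
        XElement root = XElement.Parse(
            "<root>" + body + "</root>", LoadOptions.PreserveWhitespace);

        foreach (XElement anchor in root.Descendants("a"))
        {
            XAttribute title = anchor.Attribute("title");
            if (title == null)
                continue; // anchors without a title are left as they are

            // Escape backslashes and single quotes for the JavaScript string literal.
            string escaped = title.Value
                .Replace("\\", "\\\\")
                .Replace("'", "\\'");

            // trackClick is a hypothetical JavaScript function name.
            anchor.SetAttributeValue("onclick", "trackClick('" + escaped + "');");
        }

        // Serialize only the child nodes, dropping the temporary wrapper.
        return string.Concat(
            root.Nodes().Select(n => n.ToString(SaveOptions.DisableFormatting)));
    }
}
```

Compared with a regular expression, this relies on the parser to handle attribute order and quoting, at the cost of requiring strictly well-formed input.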