String manipulation in .NET


Question

Hi all,

In a .cs file I have a string variable like this:

string body = " <p>
            Maximize the quality, efficiency and value of your applications with enterprise
            application modernization, testing and management software from Micro Focus.</p>
        <p>
            Products A - Z: <a title="Products A - Z (A)" href="/products/productsa-z/index.aspx">
                </a> J K L <a title="Products A - Z (M)" href="/products/productsa-z/productsa-zM.aspx">
                                        M</a> <a title="Products A - Z (N)" href="/products/productsa-z/productsa-zN.aspx">
                </a> U <a title="Products A - Z (V)" href="/products/productsa-z/productsa-zV.aspx">
                                                                V</a> <a title="Products A - Z (W)" href="/products/productsa-z/productsa-zW.aspx">                                                                    W</a> <a title="Products A - Z (X)" href="/products/productsa-z/productsa-zx.aspx">
                                                                        X</a>
            Y Z</p>
        <div style="border-removed #cecece 1px; border-removed #cecece 1px solid; border-removed #cecece 1px solid; padding-removed 10px">
            <p>
                <a href="/products/isight/index.aspx">Application Portfolio Management and Analysis</a><br />
                <img style="width: 11px; height: 11px; vertical-align: middle" title="arrow_badge_icon.gif"
                    alt="arrow_badge_icon.gif" src="/assets/arrow_badge_icon.gif" width="11" height="11" /><a
                        href="/products/isight/index.aspx">i.Sight</a></p>
            <p>
                Tools applications.</p>
        </div>        
        <div style="border-removed #cecece 1px solid;padding-removed 10px">
            <p>
                <a href="/products/micro-focus-developer/index.aspx">COBOL and Software Developer Tools</a><br />
                <img style="width: 11px; height: 11px; vertical-align: middle" title="arrow_badge_icon.gif"
                    alt="arrow_badge_icon.gif" src="/assets/arrow_badge_icon.gif" width="11" height="11" /><a href="/products/micro-focus-developer/index.aspx">Micro Focus Developer</a></p>
            <p>
                Industry applications.</p>
        </div>        <br />
        <br />
        <p>
        </p>";  




The above string is an extract; the real one contains many <a> tags. I want to manipulate this string by modifying the anchor tags in it: I have to read each anchor tag's title and add an onclick= event to the tag. Each anchor tag has a different title, and the title must be included in the onclick event as a dynamic variable. Can anyone help me with this? What approach should I opt for?

Recommended answer

Please check the given links:

Use regex to search the HTML tags.

HTML Meta Tag Parser

Convert HTML to Plain Text

http://haacked.com/archive/2004/10/25/usingregularexpressionstomatchhtml.aspx

http://stackoverflow.com/questions/787932/using-c-regular-expressions-to-remove-html-tags
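
As a rough illustration of the regex suggestion above, the sketch below uses Regex.Replace with a match evaluator to append an onclick attribute to every opening <a> tag that has a double-quoted title. The class name AnchorOnClickRegex and the JavaScript function trackClick are hypothetical names chosen for this example; they do not come from the original post. Regular expressions are fragile on real-world HTML (for instance, a ">" inside an attribute value would break this pattern), so treat it as a starting point only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchorOnClickRegex
{
    // Matches an opening <a ...> tag that carries a double-quoted title attribute.
    // Group "attrs" is everything between "<a" and ">"; group "title" is the title value.
    private static readonly Regex AnchorWithTitle = new Regex(
        @"<a\b(?<attrs>[^>]*?\btitle\s*=\s*""(?<title>[^""]*)""[^>]*)>",
        RegexOptions.IgnoreCase);

    public static string AddOnClick(string html)
    {
        return AnchorWithTitle.Replace(html, m =>
        {
            string attrs = m.Groups["attrs"].Value;

            // Leave anchors untouched if they already declare an onclick handler.
            if (Regex.IsMatch(attrs, @"\bonclick\s*=", RegexOptions.IgnoreCase))
                return m.Value;

            // Escape backslashes and apostrophes so the title fits in a JS string literal.
            string jsTitle = m.Groups["title"].Value
                .Replace("\\", "\\\\")
                .Replace("'", "\\'");

            // trackClick is a hypothetical JavaScript function (an assumption).
            return "<a" + attrs + " onclick=\"trackClick('" + jsTitle + "')\">";
        });
    }

    public static void Main()
    {
        string html = "<p><a title=\"Products A - Z (M)\" href=\"/m.aspx\">M</a> "
                    + "<a href=\"/x.aspx\">X</a></p>";
        Console.WriteLine(AddOnClick(html));
    }
}
```

Anchors without a title attribute, such as the second one in Main, are left unchanged because the pattern cannot match across the closing ">" of a tag.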


One possible way to address this problem:

Your string looks like well-formed XML and is not too big, so you can parse it as XML. Use the class System.Xml.XmlDocument to get a DOM structure, or the class System.Xml.Linq.XDocument to get a document tree structure. In both cases, you get a structured document with access to your HTML nodes. Modify the nodes as you need and serialize the document back to XML text.

See:
http://msdn.microsoft.com/en-us/library/system.xml.xmldocument.aspx,
http://msdn.microsoft.com/en-us/library/bb387063.aspx,
http://msdn.microsoft.com/en-us/library/system.xml.linq.xdocument.aspx.
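
The XML approach described above could look roughly like the following sketch, which uses the LINQ to XML types from System.Xml.Linq. It assumes the markup really is well-formed XML (no unclosed tags, no HTML-only entities such as &nbsp;); otherwise XElement.Parse throws an XmlException. Because the fragment has several top-level elements, it is wrapped in a temporary root element first. The class name AnchorOnClickXml and the JavaScript function trackClick are hypothetical names used only for illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class AnchorOnClickXml
{
    public static string AddOnClick(string htmlFragment)
    {
        // Several top-level elements do not form a valid XML document,
        // so wrap the fragment in a temporary root element before parsing.
        XElement root = XElement.Parse(
            "<root>" + htmlFragment + "</root>",
            LoadOptions.PreserveWhitespace);

        foreach (XElement anchor in root.Descendants("a"))
        {
            XAttribute title = anchor.Attribute("title");
            if (title == null)
                continue; // nothing to put into the handler

            // Escape backslashes and apostrophes for use inside a JS string literal.
            string jsTitle = title.Value
                .Replace("\\", "\\\\")
                .Replace("'", "\\'");

            // trackClick is a hypothetical JavaScript function (an assumption).
            anchor.SetAttributeValue("onclick", "trackClick('" + jsTitle + "')");
        }

        // Serialize only the child nodes so the temporary root is dropped again.
        return string.Concat(
            root.Nodes().Select(n => n.ToString(SaveOptions.DisableFormatting)));
    }

    public static void Main()
    {
        string body = "<p><a title=\"Products A - Z (M)\" href=\"/m.aspx\">M</a></p>"
                    + "<p>Tools applications.</p>";
        Console.WriteLine(AddOnClick(body));
    }
}
```

Compared with a regular expression, this lets the XML parser handle attribute quoting and escaping, at the cost of rejecting any input that is not well-formed.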



If the HTML might not be well-formed XML, you can use an appropriate HTML parser. Try this: http://www.majestic12.co.uk/projects/html_parser.php

—SA

