REGEX for only data and end tag
Problem description
I am looking for a regex that gives me the data along with the end tag.
For example:
Input:
-----------------
<p>ABC<p>
-----------------
The output would be:
-----------------
ABC<p>
-----------------
It should remove only the first <p> tag, not the second one, and all the text in between should stay the same.
Note that I am looking for a way to handle
<p>ABC<p>
not
<p>ABC</p>
This is for a specific text file that has irregular tags.
Example:
I have a big XHTML file like this:
<p>scet</p>
<p>sunny </p>
<p> <!--this tag is to be removed -->
<p> <!--this tag is to be removed -->
<p>mark</p>
<p>Thomas </p>
It is a complete XHTML file, with body, head, and other tags. The only problem is the extra <p> tags. I am expecting output like this:
<p>scet</p>
<p>sunny </p>
<p>mark</p>
<p>Thomas </p>
Solution
This will work. Pass the HTML document in as the string xhtml:
using System;
using System.Text;

public static class XHTMLCleanerUpperThingy
{
    private const string p = "<p>";
    private const string closingp = "</p>";

    public static string CleanUpXHTML(string xhtml)
    {
        StringBuilder builder = new StringBuilder(xhtml);
        int idx = 0;
        int current;
        // visit every <p> tag once, from left to right
        while ((current = xhtml.IndexOf(p, idx, StringComparison.Ordinal)) != -1)
        {
            int idxofnext = xhtml.IndexOf(p, current + p.Length, StringComparison.Ordinal);
            int idxofclose = xhtml.IndexOf(closingp, current, StringComparison.Ordinal);
            // the tag is unmatched if no </p> follows it at all,
            // or if another <p> opens before the next </p>
            bool noClosingTag = idxofclose < 0;
            bool nextOpensFirst = idxofnext >= 0 && idxofnext < idxofclose;
            if (noClosingTag || nextOpensFirst)
            {
                // overwrite the tag with spaces so the string length,
                // and therefore the indexes into xhtml, stay the same
                for (int j = 0; j < p.Length; j++)
                {
                    builder[current + j] = ' ';
                }
            }
            // continue searching after the tag just examined
            idx = current + p.Length;
        }
        return builder.ToString();
    }
}
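Since the question asked for a regex, here is a minimal sketch of one, under a few assumptions: the tags are written exactly as lowercase <p> and </p> with no attributes, and a <p> counts as extra when another <p> appears before any </p>. The class and method names (StrayParagraphTagRemover, RemoveStrayOpeningTags) are made up for this illustration. Unlike the loop above, it deletes the tag instead of writing spaces over it, and it leaves a last <p> with no closing tag alone, as in the <p>ABC<p> example from the question.

```csharp
using System.Text.RegularExpressions;

public static class StrayParagraphTagRemover
{
    // Matches a <p> only when another <p> follows it before any </p>.
    // (?:(?!</p>).)*? consumes characters lazily but refuses to step over a </p>,
    // so the lookahead succeeds only if the next tag of interest is another <p>.
    // Singleline lets '.' match newlines, because stray tags can sit on separate lines.
    private static readonly Regex StrayOpeningP = new Regex(
        @"<p>(?=(?:(?!</p>).)*?<p>)",
        RegexOptions.Singleline);

    public static string RemoveStrayOpeningTags(string xhtml)
    {
        // The lookahead consumes nothing, so only the stray <p> itself is removed.
        return StrayOpeningP.Replace(xhtml, string.Empty);
    }
}
```

Calling StrayParagraphTagRemover.RemoveStrayOpeningTags("<p>ABC<p>") returns "ABC<p>". For a whole XHTML document a real parser is usually the safer choice; a pattern like this only makes sense for the narrow, irregular case described in the question.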