Parsing HTML paragraphs between <a>, <p>, <pre>, <h1,...,6> tags using regex
Question
Hello everyone, I need some help. Here is my code:
private void button1_Click(object sender, EventArgs e)
{
    string s = KaynakKodunuCek("http://tr.wikipedia.org/wiki/Lale");
    // Matches everything from <a ...> to </a> (tags included).
    Regex regex = new Regex("(?i)<a([^>]+)>(.+?)</a>");
    string gelen = s;
    string inside = null;
    Match match = regex.Match(gelen);
    if (match.Success)
    {
        inside = match.Value;
        richTextBox2.Text = inside;
    }
    string outputStr = "";
    foreach (Match ItemMatch in regex.Matches(gelen))
    {
        Console.WriteLine(ItemMatch);
        inside = ItemMatch.Value;
        // Append a line break so each match is written on its own line.
        outputStr += inside + "\r\n";
    }
    richTextBox2.Text = outputStr;
}
When I click the button, it writes the parsed HTML to richTextBox2, but the result looks like this:
<a class="external text" href="//tr.wikipedia.org/w/index.php?title=%C3%96zel:G%C3%BCnl%C3%BCk&amp;page=
<a class="external text" href="//tr.wikipedia.org/w/index.php?title=Lale&oldid=13373007&diff=cur">1 değişiklik</a>
<a href="#mw-navigation">kullan</a>
But I want my output to contain only the text between the tags, for example >kontrol edilmiş<.
Recommended answer
HTML wasn't designed to be parsed with regex. You're better off using something like the HTML Agility Pack.
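As a rough illustration of that suggestion, here is a minimal sketch that pulls the text out of <a>, <p>, <pre> and <h1> to <h6> elements with the HTML Agility Pack. It assumes the HtmlAgilityPack NuGet package is installed; the class name TagTextExtractor and the method ExtractText are hypothetical names made up for this example and do not come from the question.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // assumption: the "HtmlAgilityPack" NuGet package is referenced

// Hypothetical helper, not part of the original question.
public static class TagTextExtractor
{
    // XPath union selecting every <a>, <p>, <pre> and <h1>..<h6> element.
    private const string Query =
        "//a | //p | //pre | //h1 | //h2 | //h3 | //h4 | //h5 | //h6";

    public static List<string> ExtractText(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var result = new List<string>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(Query);
        if (nodes == null)
        {
            return result;
        }

        foreach (HtmlNode node in nodes)
        {
            // InnerText leaves out the markup of nested tags;
            // DeEntitize converts entities such as &amp; back to plain characters.
            string text = HtmlEntity.DeEntitize(node.InnerText).Trim();
            if (text.Length > 0)
            {
                result.Add(text);
            }
        }
        return result;
    }
}
```

In the click handler from the question, this could be used as richTextBox2.Text = string.Join("\r\n", TagTextExtractor.ExtractText(s)); in place of the regex loop. Note that text inside nested matches (for example an <a> inside a <p>) will appear once for each matching element, so the query may need narrowing depending on what output is wanted.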