Regex for HTML <tr> tag

Views: 132 | Posted: 2018/6/22 21:32:03 | Tags: c# html regex parsing

Problem description

I have an HTML page with <tr> elements and I need to capture the text in between those tags.

I tried with Regex:
(?i)<tr[^>]*?>([^<]*)</tr>
But it doesn't work.

This is all my code in C#:
    string patternPost = @"(?i)<tr[^>]*?>([^<]*)</tr>";
    MatchCollection m1 = Regex.Matches(html, patternPost, RegexOptions.Multiline);
    foreach (Match m in m1)
    {
        MessageBox.Show(m.Groups[1].Value);
    }

Here is an example of the HTML page: http://pastebin.com/ewN5NZis

You can see 2 blocks. For each block I need to store three pieces of information in three different lists:

List 1: Title1, Title2
List 2: John, Antony
List 3: 29/04/14, 28/04/14

With my first regex I want to first catch all the blocks and skip useless information such as tags other than tr; next I want to catch the 3 pieces of information in each block with 3 different regexes. Is this right? I hope you understand me now.

Solution

EDIT: In your last comment, you said the rows look like <tr ....> <tag> ... </tag> <tag2>...</tag2> </tr>, which is quite an expansion of the original problem. At this stage, I concur with all the other advice: you are going to need a DOM parser.
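
As a rough illustration of the parser route, here is a minimal sketch that uses System.Xml.Linq from the .NET base library. It only works if the markup happens to be well-formed XML, which real-world HTML often is not; for arbitrary pages a dedicated HTML parser (HtmlAgilityPack is a common choice) is the safer option. The table layout, the class name RowParserSketch and the method ReadColumns are assumptions made for this example, since the actual page behind the pastebin link is not reproduced here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class RowParserSketch
{
    // Returns one list per column: titles, names, dates.
    // Assumes every <tr> holds exactly three <td> cells in that order (hypothetical layout).
    public static List<string>[] ReadColumns(string markup)
    {
        var columns = new[] { new List<string>(), new List<string>(), new List<string>() };

        // Throws XmlException if the markup is not well-formed XML.
        XElement root = XElement.Parse(markup);

        foreach (XElement row in root.Descendants("tr"))
        {
            List<string> cells = row.Elements("td").Select(td => td.Value.Trim()).ToList();
            if (cells.Count != 3) continue; // skip rows that do not fit the assumed layout
            for (int i = 0; i < 3; i++) columns[i].Add(cells[i]);
        }
        return columns;
    }

    public static void Main()
    {
        // Hypothetical sample built from the values listed in the question.
        string markup =
            "<table>" +
            "<tr class=\"post\"><td>Title1</td><td>John</td><td>29/04/14</td></tr>" +
            "<tr class=\"post\"><td>Title2</td><td>Antony</td><td>28/04/14</td></tr>" +
            "</table>";

        foreach (List<string> column in ReadColumns(markup))
        {
            Console.WriteLine(string.Join(", ", column));
        }
    }
}
```

With the sample above, each printed line corresponds to one of the three lists from the question. Walking the element tree means nested tags inside a row no longer matter, which is the case the regex approach struggles with.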

Older Edit: Originally you asked to match contents of <tr> tags. Specs have changed, so this answer contains the evolving versions.

For a plain <tr> tag, extract Group 1 from:

    (?i)<tr>([^<]*)</tr>

or for a <tr with stuff>:

    (?i)<tr[^>]*>([^<]*)</tr>

or for <tr stuff><td stuff>Grab Me</td>:

    (?i)<tr[^>]*?>\s*<td[^>]*?>(.*?)</td>

Here is a code sample:

    using System;
    using System.Text.RegularExpressions;

    class Program
    {
        static void Main()
        {
            string s1 = "<tr stuff><td stuff>Grab Me</td>";
            // Lazy (.*?) stops at the first </td> rather than the last one on the line.
            var r = new Regex(@"(?i)<tr[^>]*?>\s*<td[^>]*?>(.*?)</td>");
            string capture = r.Match(s1).Groups[1].Value;
            Console.WriteLine(capture);
            Console.WriteLine("\nPress Any Key to Exit.");
            Console.ReadKey();
        } // END Main
    } // END Program

Output: Grab Me
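
To make the difference between the patterns concrete, here is a small sketch that runs two of them over the same input with Regex.Matches, as the question does. The input string, the class name RowPatternDemo and its helper methods are assumptions for illustration, and the row-plus-cell pattern is written with a lazy group so each capture ends at its own </td>.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RowPatternDemo
{
    // Pattern from the question: [^<]* cannot cross a nested <td>,
    // so rows that contain cells never match.
    public static int CountPlainRows(string html) =>
        Regex.Matches(html, @"(?i)<tr[^>]*?>([^<]*)</tr>").Count;

    // Row-plus-cell pattern: captures the text of the first cell in each row.
    public static List<string> FirstCells(string html)
    {
        var cells = new List<string>();
        foreach (Match m in Regex.Matches(html, @"(?i)<tr[^>]*?>\s*<td[^>]*?>(.*?)</td>"))
        {
            cells.Add(m.Groups[1].Value);
        }
        return cells;
    }

    public static void Main()
    {
        // Hypothetical input: two rows, one cell each.
        string html = "<tr class=\"a\"><td>Title1</td></tr>\n<tr class=\"b\"><td>Title2</td></tr>";

        Console.WriteLine(CountPlainRows(html));                // prints 0
        Console.WriteLine(string.Join(", ", FirstCells(html))); // prints Title1, Title2
    }
}
```

The first pattern finds nothing because the character class [^<] stops at the opening <td>, which is likely why the code in the question shows no results on a page whose rows contain cells.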
