C# Regular Expressions - Get Second Number, not First
Question
I have the following HTML code:
<td class="actual">106.2% </td>
I extract the number in two steps:
Regex.Matches(html, "<td class=\"actual\">\\s*(.*?)\\s*</td>", RegexOptions.Singleline);
Regex.Match(m.Groups[1].Value, @"-?\d+\.\d+").Value
The code above gives me what I want: 106.2.
The problem is that the HTML is sometimes slightly different, like this:
<td class="actual"><span class="revised worse" title="Revised From 107.2%">106.4%</span></td>
In this case I only get 107.2, but I want 106.4. Is there a regular expression trick to say that I want the second number in the string rather than the first?
Recommended answer
I want to share the solution I found for my problem.
The HTML cells can look like the following:
<td class="previous"><span class="revised worse" title="Revised From 1.3">0.9</span></td>
<td class="previous"><span class="revised worse" title="Revised From 107.2%">106.4%</span></td>
Or simpler:
<td class="previous">51.4</td>
First, I capture the entire content of each cell with the following code:
MatchCollection mPrevious = Regex.Matches(html, "<td class=\"previous\">\\s*(.*?)\\s*</td>", RegexOptions.Singleline);
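To make this first step concrete, here is a minimal sketch of what the capture group holds for each cell. The class name CellCapture and the method name CaptureCells are hypothetical names introduced only for illustration; the pattern is the one from the line above, written as a verbatim string.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CellCapture
{
    // Same pattern as in the answer, written as a verbatim string.
    private const string CellPattern = @"<td class=""previous"">\s*(.*?)\s*</td>";

    // Hypothetical helper: returns the inner content of every matching cell.
    public static List<string> CaptureCells(string html)
    {
        var cells = new List<string>();
        // Singleline lets '.' match newlines, so cells spanning lines are captured too.
        foreach (Match m in Regex.Matches(html, CellPattern, RegexOptions.Singleline))
        {
            cells.Add(m.Groups[1].Value);
        }
        return cells;
    }
}
```

For a plain cell the captured group is just the number text; for a revised cell it is the whole span element, which is why a second step is needed.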
Second, I use the following code to extract only the numbers:
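// lPrevious is a string collection declared elsewhere (for example a List<string>).
foreach (Match m in mPrevious)
{
    string cell = m.Groups[1].Value;
    if (cell.Contains("span"))
    {
        // Match the old value at the end of the title attribute together with the displayed value,
        // e.g. 107.2%">106.4 (decimal or integer, with or without a trailing character such as %).
        string stringtemp = Regex.Match(cell, "-?\\d+\\.\\d+.\">-?\\d+\\.\\d+|-?\\d+\\.\\d+\">-?\\d+\\.\\d+|-?\\d+.\">-?\\d+|-?\\d+\">-?\\d+").Value;
        int indextemp = stringtemp.IndexOf(">");
        if (indextemp <= 0) continue; // no number pair in this cell: skip it rather than end the loop
        // Keep only what follows '>', i.e. the displayed value.
        lPrevious.Add(stringtemp.Remove(0, indextemp + 1));
    }
    else
    {
        // Plain cell: take the first decimal or integer number.
        lPrevious.Add(Regex.Match(cell, @"-?\d+\.\d+|-?\d+").Value);
    }
}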
First I check whether the cell contains a SPAN tag. If it does, I match the two numbers together (the old value in the title attribute and the displayed value), using alternatives in the regular expression to cover the different possibilities: decimal or integer, with or without a trailing character such as %. I then locate the > character, which marks where the unimportant information ends, and remove everything up to and including it.
It works perfectly.
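As an alternative sketch (not the approach described in this answer), the displayed value can also be isolated by removing the tags first, which discards the title attribute together with the old value, and then matching a single number in the remaining text. The names NumberExtractor and ExtractDisplayedNumber are hypothetical, and the sketch assumes that attribute values never contain a > character.

```csharp
using System.Text.RegularExpressions;

public static class NumberExtractor
{
    // Hypothetical helper: returns the number shown in the cell,
    // or an empty string when the cell contains no number.
    public static string ExtractDisplayedNumber(string cellContent)
    {
        // Drop everything between '<' and '>', including the title attribute
        // that holds the old "Revised From" value.
        // Assumption: no attribute value contains a '>' character.
        string visibleText = Regex.Replace(cellContent, "<[^>]*>", "");

        // Optional minus sign, digits, optional decimal part with an escaped dot.
        return Regex.Match(visibleText, @"-?\d+(?:\.\d+)?").Value;
    }
}
```

This handles plain cells and revised cells with the same code path, so the check for "span" is not needed. For anything beyond simple, predictable markup, an HTML parser is generally a safer choice than regular expressions.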
Thank you all for the support and quick answers.