C# Regular Expressions - Get Second Number, not First

Question

I have the following HTML code:

<td class="actual">106.2% </td> 

I extract the number in two steps:

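// Step 1: capture the content of each <td class="actual"> cell.
MatchCollection cells = Regex.Matches(html, "<td class=\"actual\">\\s*(.*?)\\s*</td>", RegexOptions.Singleline);
// Step 2: for each Match m in cells, take the first decimal number in the captured content.
string number = Regex.Match(m.Groups[1].Value, @"-?\d+\.\d+").Value;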

The code above gives me what I want: 106.2.

The problem is that sometimes the HTML can be a little different, like this:

<td class="actual"><span class="revised worse" title="Revised From 107.2%">106.4%</span></td>

In this last case I only get 107.2, but I want 106.4. Is there a regular expression trick to say that I want the second number in the string rather than the first?

Recommended answer

I want to share the solution I have found for my problem.

So, I can have HTML tags like the following:

<td class="previous"><span class="revised worse" title="Revised From 1.3">0.9</span></td>
<td class="previous"><span class="revised worse" title="Revised From 107.2%">106.4%</span></td>

Or simpler:

<td class="previous">51.4</td>

First, I capture the entire cell with the following code:

MatchCollection mPrevious = Regex.Matches(html, "<td class=\"previous\">\\s*(.*?)\\s*</td>", RegexOptions.Singleline);

Second, I use the following code to extract only the numbers:

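// lPrevious is declared elsewhere (assumed here to be a List<string>).
foreach (Match m in mPrevious)
{
    string cell = m.Groups[1].Value;

    if (cell.Contains("span"))
    {
        // Match the old value in the title attribute together with the new value,
        // e.g. 107.2%">106.4 or 1.3">0.9; the lone '.' stands for one trailing character such as '%'.
        string stringtemp = Regex.Match(cell, "-?\\d+\\.\\d+.\">-?\\d+\\.\\d+|-?\\d+\\.\\d+\">-?\\d+\\.\\d+|-?\\d+.\">-?\\d+|-?\\d+\">-?\\d+").Value;
        int indextemp = stringtemp.IndexOf(">");
        if (indextemp <= 0) continue; // no match in this cell: skip it instead of abandoning the remaining cells
        // Keep only what follows '>', i.e. the current value.
        lPrevious.Add(stringtemp.Remove(0, indextemp + 1));
    }
    else
    {
        // Plain cell: take the first decimal or integer.
        lPrevious.Add(Regex.Match(cell, @"-?\d+\.\d+|-?\d+").Value);
    }
}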

First I check whether the cell contains a SPAN tag. If it does, I match the two numbers together, using a regular expression that covers the different possible formats (decimal or integer, with or without a trailing character such as %). I then find the '>' character, which marks where the unwanted part ends, and remove everything up to and including it.

It works perfectly.
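For comparison, here is a minimal sketch of two shorter ways to reach the same result. It is not the code from the answer above. The class name CellNumbers and its methods are hypothetical, and it assumes that no attribute value contains a '>' character. The first method strips every tag, which also discards the title attribute, and then reads the number from the remaining visible text. The second simply returns the last number found in the input.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original answer.
public static class CellNumbers
{
    // An optionally signed integer or decimal, e.g. "-3" or "51.4".
    private const string NumberPattern = @"-?\d+(?:\.\d+)?";

    // Removes every tag (and with it attributes such as title="...")
    // and returns the first number left in the visible text, or "" if none.
    // Assumption: no attribute value contains '>'.
    public static string FromVisibleText(string cellHtml)
    {
        string text = Regex.Replace(cellHtml, "<[^>]*>", "");
        return Regex.Match(text, NumberPattern).Value;
    }

    // Returns the last number found in the input, or "" if there is none.
    public static string LastNumber(string input)
    {
        MatchCollection all = Regex.Matches(input, NumberPattern);
        return all.Count == 0 ? "" : all[all.Count - 1].Value;
    }
}
```

Either method could be called with m.Groups[1].Value inside the foreach loop. FromVisibleText treats the span and plain cells the same way, so the Contains("span") branch would not be needed. LastNumber is closer to the "second number" idea in the question, but it would also pick up digits inside a closing tag such as </h1>. For HTML that is less regular than these cells, an HTML parser is likely to be more robust than regular expressions.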

Thank you all for the support and quick answers.
