C# RegEx on a StreamReader will not return matches

Problem description

I'm writing myself a simple screen scraping application to play around with the HTMLAgilityPack library. After getting it to work on several different types of HtmlNodes, I figured I'd get fancy and throw in a Regex for email addresses as well. The only problem is that the application never finds any matches, or perhaps it finds them but does not return them properly. This happens even on sites known to contain email addresses. Can anyone spot what I'm doing wrong here?

    string url = String.Format("http://{0}", mainForm.Target);
    string reg = "\b[A-Z0-9._%+-]+@[A-Z0-9.-]+.[A-Z]{2,4}\b";
    try
    {
        WebClient wClient = new WebClient();
        Stream data = wClient.OpenRead(url);
        StreamReader read = new StreamReader(data);
        MatchCollection matches = Regex.Matches(read.ReadToEnd(), reg, RegexOptions.IgnoreCase | RegexOptions.Multiline);
        foreach (Match match in matches)
        {
            textBox1.AppendText(match.ToString() + Environment.NewLine);
        }

Recommended answer

Use a verbatim string (prefix the literal with @):

string reg = @"\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b";

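Without the @ prefix, the C# compiler treats \b in a regular string literal as the backspace character (U+0008), so the regex searches for literal backspaces instead of word boundaries. Also, your last period should be \., so that it matches only a literal period rather than any character.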
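To make the difference concrete, here is a minimal sketch that runs both patterns against the same text. It assumes C# 9 or later (top-level statements); the `html` sample string and the variable names are hypothetical stand-ins for the downloaded page, not part of the original code.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Regular literal: the compiler converts "\b" into the backspace character (U+0008).
string escaped = "\b[A-Z0-9._%+-]+@[A-Z0-9.-]+.[A-Z]{2,4}\b";
// Verbatim literal: backslashes reach the regex engine unchanged, so \b is a word boundary.
string verbatim = @"\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b";

// Hypothetical sample input standing in for the page read from the StreamReader.
string html = "<p>Contact: jane.doe@example.com or bob@example.org</p>";

// The broken pattern needs a literal backspace before and after the address.
int brokenCount = Regex.Matches(html, escaped, RegexOptions.IgnoreCase).Count;

// The corrected pattern finds both addresses.
string[] found = Regex.Matches(html, verbatim, RegexOptions.IgnoreCase)
    .Cast<Match>()
    .Select(m => m.Value)
    .ToArray();

Console.WriteLine((int)escaped[0]);          // 8, the code point of backspace
Console.WriteLine(brokenCount);              // 0
Console.WriteLine(string.Join(", ", found)); // jane.doe@example.com, bob@example.org
```

An equivalent fix without the @ prefix is to double every backslash in a regular literal ("\\b...\\.[A-Z]{2,4}\\b"); the verbatim form is usually easier to read for regex patterns.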
