Unable to apply regex within VBA IE


Question

I've written a script using VBA in combination with IE to parse the contact information from a webpage by applying a regex to it. I searched a lot but could not find any example that meets my requirement. The pattern may not be ideal for finding the phone number, but the main concern here is how I can use the pattern within VBA IE.

Once again: my intention here is to parse the phone number 661-421-5861 from that webpage by applying a regex within VBA IE.

This is what I've tried so far:

Sub FetchItems()
    Const URL$ = "https://www.nafe.com/bakersfield-nafe-network"
    Dim IE As New InternetExplorer, HTML As HTMLDocument
    Dim rxp As New RegExp, email As Object, Row&

    With IE
        .Visible = True
        .navigate URL
        While .Busy = True Or .readyState < 4: DoEvents: Wend
        Set HTML = .document
    End With

    With rxp
        .Pattern = "(?<=Phone:)\s*?.*?([^\s]+)"
        Set email = .Execute(HTML.body.innerText) 'I'm getting here an error
        If email.Count > 0 Then
            Row = Row + 1: Cells(Row, 1) = email.Item(0)
        End If
    End With
    IE.Quit
End Sub

When I execute the above script, I get the error Method 'Execute' of object 'IRegExp2' failed when it hits the line containing Set email = .Execute(HTML.body.innerText). How can I make it run successfully?

Recommended answer

Note that lookbehinds are not supported by VBA regex, which is why .Execute fails with the (?<=Phone:) pattern. Here, you probably want to capture a digit followed by any number of digits and hyphens after Phone:.
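One way to confirm that a pattern is rejected by the engine, rather than by the page content, is to run it against an empty string and trap the error. The following is only a sketch under assumptions: the helper name IsPatternSupported is hypothetical, and it uses late binding through CreateObject("VBScript.RegExp") so that no library reference is needed.

```vba
Option Explicit

' Hypothetical helper (not part of the original answer): returns True when the
' VBScript regex engine accepts the pattern, False when it raises an error.
Function IsPatternSupported(ByVal patternText As String) As Boolean
    Dim rx As Object
    Set rx = CreateObject("VBScript.RegExp")   ' late binding, no reference required
    rx.Pattern = patternText                   ' assignment alone does not validate the pattern

    On Error Resume Next
    rx.Test ""                                 ' an unsupported pattern fails here
    IsPatternSupported = (Err.Number = 0)
    On Error GoTo 0
End Function

Sub DemoPatternCheck()
    ' Lookbehind pattern from the question
    Debug.Print IsPatternSupported("(?<=Phone:)\s*?.*?([^\s]+)")
    ' Capturing-group pattern without lookbehind
    Debug.Print IsPatternSupported("Phone:\s*(\d[-\d]+)")
End Sub
```

Run in the Immediate window, the first call is expected to print False and the second True, which matches the behavior described in the question: the error surfaces when the pattern is used, not when it is assigned.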

You need to redefine the pattern as

rxp.Pattern = "Phone:\s*(\d[-\d]+)"

Then, you need to grab the first match and access its .SubMatches(0):

Set email = .Execute(HTML.body.innerText)
If email.Count > 0 Then
    Cells(Row+1, 1) = email.Item(0).SubMatches(0)
End If
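Put together, the two changes might look like the routine below. This is a sketch, not a tested replacement: it assumes the same early-bound references the question's code relies on (Microsoft Internet Controls, Microsoft HTML Object Library, Microsoft VBScript Regular Expressions 5.5) and that the page text still contains a "Phone:" label. The names FetchPhone and matches are illustrative.

```vba
Option Explicit

' Sketch of the question's routine with the lookbehind removed.
' FetchPhone and matches are illustrative names, not from the original post.
Sub FetchPhone()
    Const URL$ = "https://www.nafe.com/bakersfield-nafe-network"
    Dim IE As New InternetExplorer, HTML As HTMLDocument
    Dim rxp As New RegExp, matches As Object, Row&

    With IE
        .Visible = True
        .navigate URL
        ' Wait until the page has finished loading
        While .Busy = True Or .readyState < 4: DoEvents: Wend
        Set HTML = .document
    End With

    With rxp
        .Pattern = "Phone:\s*(\d[-\d]+)"   ' group 1 holds the number
        Set matches = .Execute(HTML.body.innerText)
        If matches.Count > 0 Then
            Row = Row + 1
            ' SubMatches(0) is the captured group, without the "Phone:" label
            Cells(Row, 1) = matches.Item(0).SubMatches(0)
        End If
    End With
    IE.Quit
End Sub
```

Writing email.Item(0) to the cell, as in the question, would store the whole match including "Phone:"; reading SubMatches(0) keeps only the captured digits and hyphens.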

See the regex in action. The green-highlighted part of the string is what .SubMatches(0) holds.

Pattern details

  • Phone: - a literal substring
  • \s* - 0+ whitespace characters
  • (\d[-\d]+) - Capturing group 1: a digit, followed by 1+ digits and/or hyphens (the + requires at least one; replace it with * to match zero or more).
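The pattern can also be tried on plain strings without launching IE. The sketch below assumes late binding through CreateObject("VBScript.RegExp"); the function name ExtractPhone and the sample strings are made up for illustration and are not taken from the actual page.

```vba
Option Explicit

' Hypothetical helper: returns the first number-like token after "Phone:",
' or an empty string when nothing matches.
Function ExtractPhone(ByVal inputText As String) As String
    Dim rx As Object, found As Object
    Set rx = CreateObject("VBScript.RegExp")
    rx.Pattern = "Phone:\s*(\d[-\d]+)"
    Set found = rx.Execute(inputText)
    If found.Count > 0 Then ExtractPhone = found.Item(0).SubMatches(0)
End Function

Sub DemoExtractPhone()
    ' Sample inputs are illustrative only
    Debug.Print ExtractPhone("Phone: 661-421-5861")    ' 661-421-5861
    Debug.Print ExtractPhone("Phone:661-421-5861 x")   ' 661-421-5861 (\s* allows no space)
    Debug.Print ExtractPhone("Phone: 5")               ' empty string: + needs a second character
End Sub
```

The last call shows the effect of the + quantifier: a single digit after Phone: does not match, whereas with * in its place it would.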
