Return entire page text from IE object

Question

I'm using regex with VBA to pick up e-mail addresses from webpages, all of which are formatted very differently. Because of these differences in format, I'm struggling to access the entire page text.

Currently my approach is just to use

Dim retStr as String
retStr = ie.document.body.innerText

where ie comes from Set ie = CreateObject("InternetExplorer.Application")

Seems simple enough, but on some pages, such as this one, not all of the page text is returned. By "all of the page text", I mean anything that Ctrl+F would find. In the linked page, the text of each 'step' doesn't seem to be returned. I imagine this will vary between webpages, especially if they aren't formatted in HTML.

Pressing Ctrl+A on the webpage selects the text I'd like. Is there some way of accessing this text without using SendKeys?

Recommended answer

It works just fine for me. I have a feeling that you are writing the result to an Excel cell and the text is getting truncated there.
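
One way to test this suspicion is to compare the length of the retrieved string with Excel's limit of 32,767 characters per cell. The following is only a minimal sketch; the procedure name CheckForTruncation is hypothetical and does not come from the original answer.

```vba
' Hypothetical helper: reports whether a string would fit in one Excel cell.
Sub CheckForTruncation(ByVal retStr As String)
    ' A single Excel cell can hold at most 32,767 characters
    Const MAX_CELL_CHARS As Long = 32767

    Debug.Print "Characters retrieved: " & Len(retStr)

    If Len(retStr) > MAX_CELL_CHARS Then
        Debug.Print "Too long for one cell; writing it to a cell would cut it short."
    Else
        Debug.Print "Fits in one cell; truncation in Excel is unlikely to be the cause."
    End If
End Sub
```

Calling CheckForTruncation retStr right after reading innerText prints the result to the Immediate window.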

I wrote it to a text file and I got the complete text.

Sub Sample()
    Dim ie As Object
    Dim retStr As String

    Set ie = CreateObject("internetexplorer.application")

    With ie
        .Navigate "http://www.wikihow.com/Choose-an-Email-Address"
        .Visible = True
    End With

    Do While ie.readystate <> 4: Wait 5: Loop

    DoEvents

    retStr = ie.document.body.innerText

    '~~> Write the retrieved text to a text file
    Dim fileNum As Integer
    Dim FlName As String

    '~~> Change this to the relevant path
    FlName = "C:\Users\Siddharth\Desktop\Sample.Txt"

    '~~> FreeFile returns the next unused file number
    fileNum = FreeFile()

    Open FlName For Output As #fileNum

    Print #fileNum, retStr
    Close #fileNum
End Sub

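'~~> Pause for nSec seconds while keeping the application responsive.
'~~> Timer resets to 0 at midnight, so avoid running this across midnight.
Private Sub Wait(ByVal nSec As Long)
    Dim endTime As Single

    endTime = Timer + nSec
    While endTime > Timer
        DoEvents
    Wend
End Sub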
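
With the full text in retStr, the regex step mentioned in the question could be sketched roughly as follows. The question does not show its pattern, so the function name ExtractEmails and the deliberately simple pattern below are assumptions; it will miss unusual address formats and may need adjusting for the pages involved.

```vba
' Hypothetical helper: collects e-mail-like strings from page text.
Function ExtractEmails(ByVal pageText As String) As Collection
    Dim re As Object
    Dim matches As Object
    Dim m As Object
    Dim result As New Collection

    ' Late-bound VBScript regex object, so no library reference is needed
    Set re = CreateObject("VBScript.RegExp")
    re.Global = True        ' find every match, not just the first
    re.IgnoreCase = True    ' lets the A-Z classes match lowercase too

    ' Assumed pattern: local part, "@", domain, dot, TLD of 2+ letters
    re.Pattern = "[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}"

    Set matches = re.Execute(pageText)
    For Each m In matches
        result.Add m.Value
    Next m

    Set ExtractEmails = result
End Function
```

The returned Collection can then be looped over with For Each to print or store each address.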
