Return entire page text from IE object
Problem description
I'm using regex with VBA to pick up e-mail addresses on webpages, all of which are formatted very differently. Because of these differences in format, I'm struggling to access the entire page text.
Currently my approach is just to use
Dim retStr As String
retStr = ie.document.body.innerText
where ie comes from
Set ie = CreateObject("InternetExplorer.Application")
Seems simple enough, but on some pages, such as this one, not all of the page text is returned. By "all of the page text", I mean anything that Ctrl+F would act on. In the linked page, for example, the text of each 'step' doesn't seem to be returned. I imagine there will be variation between webpages, especially if they aren't formatted in HTML.
Pressing Ctrl+A on the webpage selects the text I'd like. Is there some way of accessing this text without using SendKeys?
Recommended answer
It is working just fine for me. I have a feeling that you are writing the result to an Excel cell, and hence the text is getting truncated.
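One quick way to test the truncation theory is to compare the length of the returned string with Excel's per-cell limit of 32,767 characters. The following is only a minimal sketch: the names `MAX_CELL_CHARS`, `FitsInCell` and `CheckLength` are hypothetical, and the dummy string stands in for the real `ie.document.body.innerText` value.

```vba
' Excel's documented limit on the total number of characters in one cell.
Private Const MAX_CELL_CHARS As Long = 32767

' Hypothetical helper: True if pageText would fit in a single cell untruncated.
Function FitsInCell(ByVal pageText As String) As Boolean
    FitsInCell = (Len(pageText) <= MAX_CELL_CHARS)
End Function

Sub CheckLength()
    Dim retStr As String
    ' Assumption: a 40,000-character dummy string in place of the page text.
    retStr = String(40000, "x")
    Debug.Print Len(retStr), FitsInCell(retStr)
End Sub
```

If `FitsInCell` returns False for the real page text, the cell is the likely culprit rather than `innerText`.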
I wrote it to a text file and I got the complete text.
Sub Sample()
    Dim ie As Object
    Dim retStr As String

    Set ie = CreateObject("internetexplorer.application")

    With ie
        .Navigate "http://www.wikihow.com/Choose-an-Email-Address"
        .Visible = True
    End With

    Do While ie.readystate <> 4: Wait 5: Loop
    DoEvents

    retStr = ie.document.body.innerText

    '~~> Write the above to a text file
    Dim filesize As Integer
    Dim FlName As String

    '~~> Change this to the relevant path
    FlName = "C:\Users\Siddharth\Desktop\Sample.Txt"

    filesize = FreeFile()

    Open FlName For Output As #filesize
    Print #filesize, retStr
    Close #filesize
End Sub

Private Sub Wait(ByVal nSec As Long)
    nSec = nSec + Timer
    While nSec > Timer
        DoEvents
    Wend
End Sub
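Once the full text is in `retStr`, the e-mail extraction mentioned in the question could be run directly on the string instead of on a worksheet cell. The sketch below is an assumption about how that step might look, not the asker's actual code: `ExtractEmails` is a hypothetical helper, and the pattern is a deliberately simple one that will miss some valid addresses and accept some invalid ones.

```vba
' Hypothetical helper: returns every e-mail-like substring found in pageText.
Function ExtractEmails(ByVal pageText As String) As Collection
    Dim re As Object
    Dim m As Object
    Dim found As New Collection

    ' Late-bound VBScript regex engine, so no project reference is required.
    Set re = CreateObject("VBScript.RegExp")
    re.Global = True        ' return all matches, not just the first
    re.IgnoreCase = True    ' lets the A-Z classes match lower case too
    re.Pattern = "[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}"

    For Each m In re.Execute(pageText)
        found.Add m.Value
    Next m

    Set ExtractEmails = found
End Function
```

In `Sample`, this could be called as `Set addrs = ExtractEmails(retStr)` after `retStr` is assigned, and the resulting collection looped over with `For Each`.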