Extract data out of HTML table cell


Question

I want to extract a piece of information from a website (non-public intranet) with VBScript. It is a time that always appears in the same field on the website but differs from day to day. Afterwards I want to use it to create a meeting in Outlook on the current day, starting at exactly that time (the duration is always exactly 1 hour). I am fine with opening the website with VBScript and creating the meeting, but I have problems reading the time out of the website field.

Unfortunately I cannot paste a picture of the full structure here, so I am pasting the DOM element that holds the information I am looking for:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML Strict//EN"><META http-equiv="Content-Type" content="text/html; charset=iso-8859-1">

<HTML><BODY id="info_body" aLink="#ff0000" link="#0000ff" bgColor="#ebf7ff" text="#000000" vLink="#800080"><TABLE class="topMargin" border="0" cellSpacing="2" cellPadding="2" width="600" align="center"><TBODY>    <TR class="data">

<TD>
11:42
</TD>

</TR></TBODY></TABLE></BODY></HTML>

Does this help? I need to store the information (in this case "11:42") in a variable and/or show it in a MsgBox. The structure is:

html --> body --> iframe (3rd) --> html --> body --> table (1st) --> tbody
--> tr class = "data" --> td (4th)

To open the website I use:

SET ie1 = WScript.CreateObject("InternetExplorer.Application", "IE_")
myUrl1="http://xxxx" 'website
hwnd = ie1.hwnd
ie1.Navigate myUrl1

Set oShell = CreateObject("Shell.Application")
For Each Wnd In oShell.Windows
  If hwnd = Wnd.hwnd Then Set ie1 = Wnd
Next 

To get the required information I tried:

  1. MsgBox ie.document.body.innerhtml

which works and gives me the full HTML code.

getElementsByTagName

I found these code snippets on the internet and apparently they work for others, but I have not been able to adapt them to my specific requirements and case.

Update 27.05.2015:

With this new version it works most of the time, but unfortunately not always:

Option Explicit

Dim ie, b, url
Dim hwnd, oshell, wnd
Dim tbl, iframe, td
Dim c, Zeit1, start_punkt
Dim outl, a, myNameSpace, myFolder, myitem, alle_items, myitem_new, olmeeting
Dim ie_exist

URL = "http://xxxx" 'website

SET ie = WScript.CreateObject("InternetExplorer.Application", "IE_")
hwnd = ie.hwnd
ie.Navigate url

Set oShell = CreateObject("Shell.Application")
For Each Wnd In oShell.Windows
If hwnd = Wnd.hwnd Then Set ie = Wnd
Next

DO WHILE ie.ReadyState <> 4
LOOP

'get 3rd iframe in page
Set iframe = ie.document.getElementsByTagName("iframe").Item(2).contentWindow

'get 1st table in iframe
Set tbl = iframe.document.getElementsByTagName("table").Item(0)

'get 4th cell in table
Set td = tbl.getElementsByTagName("td").Item(8)

Msgbox td.innerText

When it does not work, I get an unknown error message in this line:

If hwnd = wnd.hwnd then set ie = Wnd

@Ansgar: If I remove the Set oShell = ... part, I get the error message:

The object invoked has disconnected from its clients.

And if I remove the Set oShell = ... part as well as the DO WHILE ie.ReadyState loop, I get an unknown error in this line:

Set iframe = ie.document.getElementsByTagName("iframe").Item(2).contentWindow

Any ideas? The version above at least works most of the time (each alternative I tried results in an error every time), but is there a way to fix the problem so that it works every time?

Recommended answer

Something like this should work:

url = "http://www.example.com/"

Set ie = CreateObject("InternetExplorer.Application")
ie.Navigate url

While ie.ReadyState <> 4
  WScript.Sleep 100
Wend

'get 3rd iframe in page
Set iframe = ie.document.getElementsByTagName("iframe").Item(2).contentWindow
'get 1st table in iframe
Set tbl = iframe.document.getElementsByTagName("table").Item(0)
'get 4th cell in table
Set td  = tbl.getElementsByTagName("td").Item(3)

MsgBox td.innerText
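Since the question reports that the script only works most of the time, a slightly more defensive variant may be worth trying. The following is a minimal sketch, not a confirmed fix: it assumes the intermittent failures come from the page or the iframe not having finished loading, and it assumes the target row can be recognised by its class "data" as described in the question. The helper names WaitForBrowser and FrameDocument and the 30-second limit are made up for illustration.

```vbscript
Option Explicit

Const READYSTATE_COMPLETE = 4
Const TIMEOUT_MS = 30000          ' assumed limit, adjust as needed

Dim ie, url, frameDoc, tbl, rows, row, td, i

url = "http://www.example.com/"   ' placeholder URL, as in the answer

Set ie = CreateObject("InternetExplorer.Application")
ie.Navigate url
WaitForBrowser ie

' Document of the 3rd iframe (zero-based index 2), once it has loaded
Set frameDoc = FrameDocument(ie, 2)
If frameDoc Is Nothing Then
  WScript.Echo "The iframe did not finish loading in time"
  WScript.Quit 1
End If

' 1st table in the iframe
Set tbl = frameDoc.getElementsByTagName("table").Item(0)

' Take the 4th cell of the first row whose class is "data"
Set td = Nothing
Set rows = tbl.getElementsByTagName("tr")
For i = 0 To rows.length - 1
  Set row = rows.item(i)
  If LCase(row.className) = "data" Then
    If row.cells.length > 3 Then Set td = row.cells.item(3)
    Exit For
  End If
Next

If td Is Nothing Then
  WScript.Echo "No matching cell found"
Else
  MsgBox td.innerText
End If

' Waits until the browser reports that navigation has finished, or the timeout expires
Sub WaitForBrowser(browser)
  Dim waited
  waited = 0
  Do While (browser.Busy Or browser.ReadyState <> READYSTATE_COMPLETE) And waited < TIMEOUT_MS
    WScript.Sleep 100
    waited = waited + 100
  Loop
End Sub

' Returns the loaded document of the iframe at the given zero-based index,
' or Nothing if it is not available before the timeout expires
Function FrameDocument(browser, index)
  Dim waited, frames, doc
  Set FrameDocument = Nothing
  waited = 0
  Do While waited < TIMEOUT_MS
    Set frames = browser.document.getElementsByTagName("iframe")
    If frames.length > index Then
      Set doc = frames.item(index).contentWindow.document
      If doc.readyState = "complete" Then
        Set FrameDocument = doc
        Exit Function
      End If
    End If
    WScript.Sleep 100
    waited = waited + 100
  Loop
End Function
```

Looking the row up by class instead of by a fixed cell index across the whole table avoids having to count cells in any header rows, which may be why the question uses Item(8) where the answer uses Item(3).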

BTW, this code is pointless, so drop it:

Set oShell = CreateObject("Shell.Application")
For Each Wnd In oShell.Windows
  If hwnd = Wnd.hwnd Then Set ie1 = Wnd
Next

You don't need to re-attach to an Internet Explorer instance when you already have a reference to that instance.
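For the remaining step described in the question (turning the cell text into a one-hour Outlook entry today), a rough sketch could look like the following. It is only an illustration under assumptions: the cell text is simulated with a literal string instead of td.innerText, the subject is a placeholder, the helper CleanText is hypothetical, and a plain appointment item is created rather than a meeting with attendees.

```vbscript
Option Explicit

Const olAppointmentItem = 1       ' Outlook constant, not predefined in VBScript

Dim cellText, startTime, outl, appt

' Stand-in for td.innerText; the cell in the question contains line breaks around the time
cellText = vbCrLf & "11:42" & vbCrLf

' Today's date plus the time read from the cell
startTime = Date + TimeValue(CleanText(cellText))

Set outl = CreateObject("Outlook.Application")
Set appt = outl.CreateItem(olAppointmentItem)
appt.Subject = "Meeting"          ' placeholder subject
appt.Start = startTime
appt.Duration = 60                ' duration in minutes
appt.Save

' Removes line breaks and tabs, then trims surrounding spaces
Function CleanText(ByVal s)
  s = Replace(s, vbCr, "")
  s = Replace(s, vbLf, "")
  s = Replace(s, vbTab, "")
  CleanText = Trim(s)
End Function
```

Cleaning the text first matters because Trim only removes spaces, and TimeValue may reject a string that still contains line breaks.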
