How can I extract a value from an HTML page in VBScript - I tried MSXML2.DOMDocument


Question

Below is some code I tried in order to get the value from a node in a web page. It fails when trying to set objNode. Any help would be gratefully appreciated.

Dim objHttp, sWebPage, objNode, objDoc

Set objDoc = CreateObject("MSXML2.DOMDocument")
objDoc.Load "http://www.hl.co.uk/shares/shares-search-results/a/aveva-group-plc-ordinary-3.555p"

' objDoc.setProperty "SelectionLanguage", "XPath"

' Find a particular element using XPath:
Set objNode = objDoc.selectSingleNode("span[@id='ls-bid-AVV-L']")
MsgBox objNode.getAttribute("value")


Recommended answer


  1. It's very optimistic to expect an XML parser to handle clean HTML; for flawed HTML, you can forget it (ref).
  2. You should never .load without checking for errors (see also). In your case, the .reason thrown is "The attribute 'property' on this element is not defined in the DTD/Schema."
  3. You can switch off the validation with objDoc.validateOnParse = False and avoid problems with monster pages with objDoc.async = False (at least no "msxml3.dll: The data necessary to complete this operation is not yet available." error).
  4. To search for a span anywhere (without knowing its place in the hierarchy) you need "//span[@id='ls-bid-AVV-L']" instead of "span[@id='ls-bid-AVV-L']".
  5. The span to find has no attribute named value; to get the "1,334.00p" you'd need to ask for objNode.text.
  6. But all this is to no avail: The page is not even well-formed. The .parseError.reason is "End tag 'div' does not match the start tag 'input'.".
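
Points 2 to 5 can be combined into a minimal sketch like the one below. It assumes that the version-independent ProgID MSXML2.DOMDocument resolves to MSXML 3.0 on the machine, which is why the selection language is set to XPath explicitly. For the page in the question, the load should still fail and the Else branch should report the parse error from point 6, so this only shows the error handling and the corrected query, not a working scraper.

```vbscript
Option Explicit

Dim objDoc, objNode

Set objDoc = CreateObject("MSXML2.DOMDocument")

' Load synchronously and skip DTD/schema validation (point 3).
objDoc.async = False
objDoc.validateOnParse = False

' Assumption: MSXML 3.0, whose default selection language is XSLPattern.
objDoc.setProperty "SelectionLanguage", "XPath"

' .load returns False when the document cannot be parsed (point 2).
If objDoc.load("http://www.hl.co.uk/shares/shares-search-results/a/aveva-group-plc-ordinary-3.555p") Then
    ' "//" searches the whole tree, not just the children of the context node (point 4).
    Set objNode = objDoc.selectSingleNode("//span[@id='ls-bid-AVV-L']")
    If objNode Is Nothing Then
        MsgBox "No span with id 'ls-bid-AVV-L' was found."
    Else
        ' The price is the element's text, not a 'value' attribute (point 5).
        MsgBox objNode.text
    End If
Else
    ' Report why parsing failed instead of continuing with an empty document.
    MsgBox "Parse error at line " & objDoc.parseError.line & ": " & objDoc.parseError.reason
End If
```

Checking objNode against Nothing before using it avoids the "Object required" run-time error that the original script runs into when selectSingleNode finds no match.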
