How can I extract a value from an HTML page in VBScript - I tried MSXML2.DOMDocument
Question
Below is some code I tried in order to get the value from a node in a web page. But it fails when trying to set objNode... any help gratefully appreciated.
Dim objHttp, sWebPage, objNode, objDoc
Set objDoc = CreateObject("MSXML2.DOMDocument")
objDoc.Load "http://www.hl.co.uk/shares/shares-search-results/a/aveva-group-plc-ordinary-3.555p"
' objDoc.setProperty "SelectionLanguage", "XPath"
' Find a particular element using XPath:
Set objNode = objDoc.selectSingleNode("span[@id='ls-bid-AVV-L']")
MsgBox objNode.getAttribute("value")
Recommended answer
- It's very optimistic to expect an XML parser to handle clean HTML; for flawed HTML, you can forget it (ref).
- You should never .load without checking for errors (see also). In your case, the .reason thrown is "The attribute 'property' on this element is not defined in the DTD/Schema."
- You can switch off the validation with
objDoc.validateOnParse = False
and avoid problems with monster pages with
objDoc.async = False
(at least no "msxml3.dll: The data necessary to complete this operation is not yet available." error).
- To search for a span anywhere (without knowing its place in the hierarchy) you need "//span[@id='ls-bid-AVV-L']" instead of "span[@id='ls-bid-AVV-L']".
- The span to find has no attribute named value; to get the "1,334.00p" you'd need to ask for
objNode.text
- But all this is to no avail: The page is not even well-formed. The .parseError.reason is "End tag 'div' does not match the start tag 'input'.".
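The points above can be put together in a minimal sketch. It only illustrates the load-and-check pattern, it is not a working scraper: according to the answer, this particular page is not well-formed, so the script should end up in the failure branch and report the parser's reason. The variable name sUrl and the message texts are made up for this example.

```vbscript
Option Explicit

Dim objDoc, objNode, sUrl

' URL taken from the question; sUrl is just an illustrative variable name.
sUrl = "http://www.hl.co.uk/shares/shares-search-results/a/aveva-group-plc-ordinary-3.555p"

Set objDoc = CreateObject("MSXML2.DOMDocument")
objDoc.async = False            ' block until the whole document has arrived
objDoc.validateOnParse = False  ' skip DTD/schema validation

' load returns False when the document could not be fetched or parsed,
' so check it before touching any nodes.
If Not objDoc.load(sUrl) Then
    MsgBox "Load failed: " & objDoc.parseError.reason
Else
    ' The leading // searches the whole tree, not just the children
    ' of the current context node.
    Set objNode = objDoc.selectSingleNode("//span[@id='ls-bid-AVV-L']")
    If objNode Is Nothing Then
        MsgBox "No span with id 'ls-bid-AVV-L' found"
    Else
        ' The span has no "value" attribute; its content is in .text.
        MsgBox objNode.text
    End If
End If
```

Checking objNode against Nothing before using it avoids the "Object required" runtime error that the original code hits when selectSingleNode finds no match.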