How to extract something between <!-- --> using VBA?
Problem description
I'm trying to scrape a page using VBA. I know how to get elements by id, class, and tag name, but now I have come across this tag:
<!-- <b>IE CODE : 3407004044</b> -->
After searching the internet, I know that this is an HTML comment, but I can't find out what the tag name of this element is, if it qualifies as a tag at all. Should I use
document.getElementsByTagName("!") ?
If not, how else can I extract these comments?
EDIT: I have a bunch of these td elements within tr elements, and I want to extract
IE Code : 3407004044
Below is a larger set of HTML code:
<tr align="left">
    <td width="50%" class="subhead1">
        ' this is the part that I want to extract
        <!-- <b>IE CODE : 3108011111</b> -->
    </td>
    <td rowspan="9" valign="top">
        <span id="datalist1_ctl00_lbl_p"></span>
    </td>
</tr>
Thanks!
Solution
You can use XPath:
substring-before(substring-after(//tr//comment(), "<b>"), "</b>")
to get the required data.
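As a rough illustration of how this could look in VBA, here is a minimal sketch under a few assumptions. It uses late-bound MSXML2.DOMDocument.6.0, whose selectSingleNode accepts only expressions that return a node-set, so the sketch selects the comment with //tr//comment() and performs the substring-after / substring-before step with InStr and Mid$ instead. It also assumes the markup is well-formed XML, which the fragment in the question is but a full real-world page often is not; for that case a second, regex-based helper over the raw HTML is included. The names ExtractCommentViaXPath, ExtractCommentViaRegex, and DemoExtract are hypothetical and not part of the original answer.

```vba
Option Explicit

' Hypothetical helper: XPath route. Assumes markup is well-formed XML.
Function ExtractCommentViaXPath(ByVal markup As String) As String
    Dim doc As Object, node As Object
    Dim s As String
    Dim startPos As Long, endPos As Long

    Set doc = CreateObject("MSXML2.DOMDocument.6.0")
    doc.async = False
    If Not doc.LoadXML(markup) Then Exit Function   ' not well-formed: return ""

    ' First comment node anywhere below a <tr>
    Set node = doc.SelectSingleNode("//tr//comment()")
    If node Is Nothing Then Exit Function

    s = node.Text                                    ' text inside <!-- ... -->
    startPos = InStr(1, s, "<b>", vbTextCompare)
    If startPos = 0 Then Exit Function
    startPos = startPos + Len("<b>")                 ' like substring-after
    endPos = InStr(startPos, s, "</b>", vbTextCompare)
    If endPos = 0 Then Exit Function                 ' like substring-before

    ExtractCommentViaXPath = Trim$(Mid$(s, startPos, endPos - startPos))
End Function

' Hypothetical helper: regex route for HTML that is not well-formed XML.
Function ExtractCommentViaRegex(ByVal html As String) As String
    Dim re As Object, matches As Object

    Set re = CreateObject("VBScript.RegExp")
    re.Global = False
    re.IgnoreCase = True
    ' [\s\S] is used because "." does not match line breaks in VBScript regex
    re.Pattern = "<!--\s*<b>([\s\S]*?)</b>\s*-->"

    Set matches = re.Execute(html)
    If matches.Count > 0 Then
        ExtractCommentViaRegex = Trim$(matches(0).SubMatches(0))
    End If
End Function

Sub DemoExtract()
    Dim markup As String
    markup = "<tr align=""left""><td width=""50%"" class=""subhead1"">" & _
             "<!-- <b>IE CODE : 3108011111</b> --></td></tr>"

    Debug.Print ExtractCommentViaXPath(markup)   ' IE CODE : 3108011111
    Debug.Print ExtractCommentViaRegex(markup)   ' IE CODE : 3108011111
End Sub
```

Both helpers return an empty string when nothing matches, so the caller can test the result with Len before using it. To handle every row rather than the first, the same idea should carry over by looping over doc.SelectNodes("//tr//comment()") or by setting re.Global = True and iterating the matches.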