Get attribute string value from HTML

Question

I am building a macro to extract data from a website using VBA. Currently I can easily get values from table content using element syntax like obj.getElementsByTagName("td").innerText. However, I run into trouble when a cell contains data that is not inner text, such as this:

<img src="/images/amber_pending.gif" border="0" alt="Pending" title="Pending">

I tried to extract the value of the "title" attribute using syntax I found elsewhere:

For Each tbObj In doc.getElementsByClassName("report removeTdBorder")
    i = 1
    For Each trObj In tbObj.getElementsByTagName("tr")
        If i >= 3 Then
            j = 1
            For Each tdObj In trObj.getElementsByTagName("td")
                If j = 1 Then
                    Set imgObj = tdObj.getElementsByTagName("img")
                    dataArray(i, j) = imgObj.getAttribute("title")
                    Debug.Print imgObj.getAttribute("title")
                    ActiveCell.Offset(0, j) = dataArray(i, j)
                    ActiveCell.Offset(0, j).WrapText = False
                Else
                    dataArray(i, j) = tdObj.innerText
                    Debug.Print i & ", " & j & ": " & dataArray(i, j)
                    ActiveCell.Offset(0, j) = dataArray(i, j)
                    ActiveCell.Offset(0, j).WrapText = False
                End If
                j = j + 1
            Next tdObj
            ActiveCell.Offset(1, 0).Activate
        End If
        i = i + 1
    Next trObj
Next tbObj

But this code fails every time with "Run-time error '438': Object doesn't support this property or method" at the line dataArray(i, j) = imgObj.getAttribute("title"). Could someone help me?

Solution

Set imgObj = tdObj.getElementsByTagName("img")

returns a collection of images (even if only one is found), and a collection has no getAttribute method, which is why error 438 is raised. Address a specific image by index instead, e.g.:

dataArray(i, j) = imgObj(0).getAttribute("title")
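
If some cells might contain no image at all, indexing with (0) would fail in a different way, so it may be worth checking the collection's length first. The following is only a minimal sketch under stated assumptions: FirstImgTitle and DemoFirstImgTitle are hypothetical names that are not part of the original code, and a late-bound "htmlfile" document is used as a stand-in for the real page so the example can be run on its own.

```vba
Option Explicit

' Hypothetical helper (not from the original post): returns the "title"
' attribute of the first <img> inside a cell, or "" when the cell has no
' image or the attribute is missing.
Function FirstImgTitle(ByVal tdObj As Object) As String
    Dim imgs As Object
    Dim v As Variant

    ' getElementsByTagName always returns a collection, never a single element
    Set imgs = tdObj.getElementsByTagName("img")

    If imgs.Length > 0 Then
        v = imgs(0).getAttribute("title")
        ' Assumption: a missing attribute may come back as Null
        If Not IsNull(v) Then FirstImgTitle = CStr(v)
    End If
End Function

' Hypothetical demo: builds a tiny table in memory instead of loading a site.
Sub DemoFirstImgTitle()
    Dim doc As Object
    Dim tdObj As Object

    Set doc = CreateObject("htmlfile")
    doc.body.innerHTML = "<table><tr>" & _
        "<td><img src=""/images/amber_pending.gif"" title=""Pending""></td>" & _
        "<td>plain text</td>" & _
        "</tr></table>"

    For Each tdObj In doc.getElementsByTagName("td")
        Debug.Print "[" & FirstImgTitle(tdObj) & "]"
    Next tdObj
End Sub
```

In the loop from the question, the first-column branch could then assign dataArray(i, j) = FirstImgTitle(tdObj) instead of calling getAttribute on the collection.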
