Extract text using a CSS selector
Question
I am trying to extract specific text using a CSS selector. Here's a screenshot of the part that I would like to extract:
I tried
div[id="Section3"]:first-child
but this doesn't return anything. I can't rely on locating the element by its text, because that text is exactly what I need to extract.
Here's the relevant HTML:
<div class="ad24123fa4-c17c-4dc5-9aa5-ea007a8db30e-5" style="top:8px;left:218px;width:124px;height:31px;text-align:center;">
<table width="113px" border="0" cellpadding="0" cellspacing="0">
<tbody>
<tr>
<td>
<table width="100%" border="0" cellpadding="0" cellspacing="0">
<tbody>
<tr>
<td align="center">
<span class="fcb900b29f-64d7-453d-babf-192e86f17d6f-7">نظامي</span>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</div>
The full HTML is here.
This is my attempt:
On Error Resume Next
Set ele = .FindElementByXPath("//span[text()='منازل']")
If ele Is Nothing Then sStatus = "نظامي" Else sStatus = "منازل"
On Error GoTo 0
While inspecting the element, I noticed a hint in the console about using $0. Can this be useful?
The two possible texts are "نظامي" and "منازل".
Recommended answer
To use XPath with multiple possible search values, use the following syntax:
//*[text()='نظامي' or text()='منازل']
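A minimal sketch of building that XPath from VBA, in case the Arabic literals get garbled by the VBA editor's ANSI encoding (as appears to have happened in the question's attempt). The helper names FromCodePoints and BuildTextXPath are hypothetical and not part of Selenium; the code points are those of the two status texts, and the driver call at the end assumes a SeleniumBasic driver that is already on the page.

```vba
Option Explicit

' Hypothetical helper: builds a string from Unicode code points so that
' no non-ANSI literal has to be typed into the VBA editor.
Public Function FromCodePoints(ParamArray codes() As Variant) As String
    Dim i As Long, result As String
    For i = LBound(codes) To UBound(codes)
        result = result & ChrW$(codes(i))
    Next i
    FromCodePoints = result
End Function

' Hypothetical helper: returns //*[text()='a' or text()='b' ...].
' Values containing an apostrophe are not handled in this sketch.
Public Function BuildTextXPath(ParamArray values() As Variant) As String
    Dim i As Long, parts As String
    For i = LBound(values) To UBound(values)
        If i > LBound(values) Then parts = parts & " or "
        parts = parts & "text()='" & values(i) & "'"
    Next i
    BuildTextXPath = "//*[" & parts & "]"
End Function

Public Sub ShowStatusXPath()
    Dim firstText As String, secondText As String
    firstText = FromCodePoints(&H646, &H638, &H627, &H645, &H64A)   ' first status text
    secondText = FromCodePoints(&H645, &H646, &H627, &H632, &H644)  ' second status text
    Debug.Print BuildTextXPath(firstText, secondText)
    ' Assumed usage with an existing SeleniumBasic driver:
    ' sStatus = driver.FindElementByXPath(BuildTextXPath(firstText, secondText)).Text
End Sub
```

Because the XPath matches either text, the element's .Text can be read directly instead of testing for one value and inferring the other.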
CSS selectors (that work for me):
driver.findElementByCss("#ctl00_ContentPlaceHolder1_CrystalReportViewer1 div.ad071889d2-8e6f-4755-ad7d-c44ae0ea9fca-5 table span").text
which is an abbreviation of the full selector:
#ctl00_ContentPlaceHolder1_CrystalReportViewer1 > tbody > tr > td > div > div.crystalstyle > div.ad071889d2-8e6f-4755-ad7d-c44ae0ea9fca-5 > table > tbody > tr > td > table > tbody > tr > td > span
You can also index into the nodeList of tables:
Set matches = html.querySelectorAll("#ctl00_ContentPlaceHolder1_CrystalReportViewer1 div.crystalstyle table")
ActiveSheet.Cells(1, 1) = matches.item(80).innerText
Otherwise:
Reading in from the HTML file, I can take the last index of the matches based on a class selector. For Selenium you would switch to:
driver.FindElementsByCss(".fc180999a8-04b5-46bc-bf86-f601317d19c8-7").count
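On the Selenium side, taking the last match could be sketched as below. This is an illustration under assumptions, not a tested routine: LastMatchText is a hypothetical helper, the driver is passed late-bound as an Object and is assumed to be a SeleniumBasic driver already on the report page, and the collection returned by FindElementsByCss is assumed to be 1-based (so the last item sits at index Count). If your version turns out to be 0-based, use Count - 1 instead.

```vba
Option Explicit

' Hypothetical helper: returns the text of the last element matching cssSelector,
' or an empty string when nothing matches.
Public Function LastMatchText(ByVal driver As Object, ByVal cssSelector As String) As String
    Dim matches As Object
    Set matches = driver.FindElementsByCss(cssSelector)
    If matches.Count = 0 Then Exit Function
    ' Assumption: 1-based collection, so the last element is at index Count
    LastMatchText = matches.Item(matches.Count).Text
End Function
```

It would be called as LastMatchText(driver, ".fc180999a8-04b5-46bc-bf86-f601317d19c8-7"), mirroring the class selector used for the HTML-file version.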
VBA:
Option Explicit
' Requires references: Microsoft HTML Object Library and Microsoft ActiveX Data Objects x.x Library
Public Sub test()
    Dim html As HTMLDocument, matches As Object
    Dim fStream As ADODB.Stream
    Set html = New HTMLDocument
    Set fStream = New ADODB.Stream
    ' Read the saved page as UTF-8 so the Arabic text is preserved
    With fStream
        .Charset = "UTF-8"
        .Open
        .LoadFromFile "C:\Users\User\Desktop\Output6.html"
        html.body.innerHTML = .ReadText
        .Close
    End With
    ' Take the last element that matches the class selector
    Set matches = html.querySelectorAll(".fc180999a8-04b5-46bc-bf86-f601317d19c8-7")
    ActiveSheet.Cells(1, 1) = matches.item(matches.Length - 1).innerText
End Sub