在 Selenium (Python) 中遍历表行 [英] Iterating Through Table Rows in Selenium (Python)

查看:31
本文介绍了在 Selenium (Python) 中遍历表行的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个包含表格的网页,该网页仅在我单击检查元素"时才会出现,并且在查看源代码"页面中不可见.表格只包含两行,每行有几个单元格,看起来类似于:

I have a webpage with a table that only appears when I click 'Inspect Element' and is not visible through the View Source page. The table contains only two rows with several cells each and looks similar to this:

<table class="datadisplaytable">
<tbody>
<tr>
<td class="dddefault">16759</td>
<td class="dddefault">MATH</td>
<td class="dddefault">123</td>
<td class="dddefault">001</td>
<td class="dddefault">Calculus</td>
<td class="dddefault"></td>
<td class="dddead"></td>
<td class="dddead"></td>
</tr>
<tr>
<td class="dddefault">16449</td>
<td class="dddefault">PHY</td>
<td class="dddefault">456</td>
<td class="dddefault">002</td>
<td class="dddefault">Physics</td>
<td class="dddefault"></td>
<td class="dddead"></td>
<td class="dddead"></td>
</tr>
</tbody>
</table>

我想要做的是遍历行并返回每个单元格中包含的文本.我似乎无法用 Selenium 做到这一点.这些元素不包含 ID,我不确定如何获取它们.我对使用 xpaths 等不是很熟悉.

What I'm trying to do is to iterate through the rows and return the text contained in each cell. I can't really seem to do it with Selenium. The elements contain no IDs and I'm not sure how else to get them. I'm not very familiar with using xpaths and such.

这是一个返回 TypeError 的调试尝试:

Here is a debugging attempt that returns a TypeError:

def check_grades(self):
    table = []
    for i in self.driver.find_element_by_class_name("dddefault"):
        table.append(i)
    print(table)

从行中获取文本的简单方法是什么?

What is an easy way to get the text from the rows?

推荐答案

如果您想使用 xpath 逐行浏览,您可以使用以下内容:

If you want to go row by row using an xpath, you can use the following:

h  = """<table class="datadisplaytable">
<tr>
<td class="dddefault">16759</td>
<td class="dddefault">MATH</td>
<td class="dddefault">123</td>
<td class="dddefault">001</td>
<td class="dddefault">Calculus</td>
<td class="dddefault"></td>
<td class="dddead"></td>
<td class="dddead"></td>
</tr>
<tr>
<td class="dddefault">16449</td>
<td class="dddefault">PHY</td>
<td class="dddefault">456</td>
<td class="dddefault">002</td>
<td class="dddefault">Physics</td>
<td class="dddefault"></td>
<td class="dddead"></td>
<td class="dddead"></td>
</tr>
</table>"""

from lxml import html
xml = html.fromstring(h)
# gets the table
table =  xml.xpath("//table[@class='datadisplaytable']")[0]


# iterate over all the rows   
for row in table.xpath(".//tr"):
     # get the text from all the td's from each row
    print([td.text for td in row.xpath(".//td[@class='dddefault'][text()])

输出:

['16759', 'MATH', '123', '001', 'Calculus']
['16449', 'PHY', '456', '002', 'Physics']

使用 td[text()] 将避免为没有文本的 td 返回任何 None.

Using td[text()] will avoid getting any Nones returned for the td's that hold no text.

所以要使用 selenium 做同样的事情,你会:

So to do the same using selenium you would:

table =  driver.find_element_by_xpath("//table[@class='datadisplaytable']")

for row in table.find_elements_by_xpath(".//tr"):
    print([td.text for td in row.find_elements_by_xpath(".//td[@class='dddefault'][1]"])

对于多个表:

def get_row_data(table):
   for row in table.find_elements_by_xpath(".//tr"):
        yield [td.text for td in row.find_elements_by_xpath(".//td[@class='dddefault'][text()]"])


for table in driver.find_elements_by_xpath("//table[@class='datadisplaytable']"):
    for data in get_row_data(table):
        # use the data

这篇关于在 Selenium (Python) 中遍历表行的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆