Unable to get table element in website using Selenium

Views: 30  Posted: 2021/9/24 19:05:52  Tags: python python-3.x selenium web-scraping


Problem description


The website below has several tables, but my code cannot get a specific one (or any other table).

The code aims to get data from the table "Ações em Circulação no Mercado", one of the last tables on the webpage.

I have tried the code below and some alternatives, but none worked for me:

import pandas as pd
from selenium import webdriver
from time import sleep

url = "http://bvmf.bmfbovespa.com.br/cias-Listadas/Empresas-Listadas/BuscaEmpresaListada.aspx?idioma=pt-br"
Ticker = 'ITUB4'
browser = webdriver.Chrome()
browser.get(url)
sleep(2)  # Wait for the webpage to load
browser.find_element_by_xpath('//*[@id="ctl00_contentPlaceHolderConteudo_BuscaNomeEmpresa1_txtNomeEmpresa_txtNomeEmpresa_text"]').send_keys(Ticker)
browser.find_element_by_xpath('//*[@id="ctl00_contentPlaceHolderConteudo_BuscaNomeEmpresa1_btnBuscar"]').click()
sleep(2)  # Wait for the webpage to load
browser.find_element_by_xpath('//*[@id="ctl00_contentPlaceHolderConteudo_BuscaNomeEmpresa1_grdEmpresa_ctl01"]/tbody/tr/td[1]/a').click()
sleep(5)  # Wait for the webpage to load

# This is not working
content = browser.find_element_by_css_selector('//div[@id="div1"]')

# This is not working either
# browser.find_element_by_xpath('//*[@id="div1"]/div/div/div[1]/table/tbody/tr[1]/td[1]').text
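
One detail in the code above is worth isolating: the first failing call passes an XPath expression, //div[@id="div1"], to find_element_by_css_selector. CSS selector syntax does not accept that string, so the call would be expected to fail with an invalid-selector error whatever the page contains. The sketch below shows the distinction without needing a browser. The helper guess_locator_strategy is hypothetical (it is not part of Selenium) and uses a rough heuristic; the two string constants are the values Selenium uses for By.XPATH and By.CSS_SELECTOR.

```python
# Values of Selenium's By.XPATH and By.CSS_SELECTOR constants.
XPATH = "xpath"
CSS_SELECTOR = "css selector"


def guess_locator_strategy(selector: str) -> str:
    """Guess whether a selector is XPath or CSS (hypothetical helper, rough heuristic)."""
    stripped = selector.strip()
    # XPath expressions typically start with '/', './', '../' or '('.
    if stripped.startswith(("/", "./", "../", "(")):
        return XPATH
    return CSS_SELECTOR


selector = '//div[@id="div1"]'
strategy = guess_locator_strategy(selector)
print(strategy)  # xpath

# With a live driver, the pair could be passed to the generic lookup:
#     content = browser.find_element(strategy, selector)
# The CSS form of the same lookup would be "div#div1".
```

Recent Selenium releases favor the generic find_element(by, value) form shown in the comment over the find_element_by_* helpers used in the question.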

The table and full HTML can be found here:

HTML is:

<div id="div1">
                <div>
                    <h3>Ações em Circulação no Mercado</h3>
                    <div class="table-wrapper"><div class="scrollable"><table class="responsive">

                        <thead>
                            <tr>
                                <th colspan="3" class="text-center">19/04/2017</th>
                            </tr>
                            <tr>
                                <td>Tipos de Investidores / Ações</td>
                                <td class="text-center">Quantidade</td>
                                <td class="text-center">Percentual</td>
                            </tr>
                        </thead>

                            <tbody><tr>
                                <td>Pessoas Físicas</td>
                                <td class="text-right">108.853</td>
                                <td class="text-right"> - </td>
                            </tr>

                            <tr>
                                <td>Pessoas Jurídicas</td>
                                <td class="text-right">11.591</td>
                                <td class="text-right"> - </td>
                            </tr>

                            <tr>
                                <td>Investidores Institucionais</td>
                                <td class="text-right">1.039</td>
                                <td class="text-right"> - </td>
                            </tr>

                            <tr>
                                <td>Quantidade de Ações Ordinárias</td>
                                <td class="text-right">272.710.309</td>
                                <td class="text-right">8,21</td>
                            </tr>

                            <tr>
                                <td>Quantidade de Ações Preferenciais</td>
                                <td class="text-right">3.141.058.175</td>
                                <td class="text-right">97,23</td>
                            </tr>

                            <tr>
                                <td>Total de Ações</td>
                                <td class="text-right">3.413.768.484</td>
                                <td class="text-right">52,11</td>
                            </tr>

                            </tbody></table></div><div class="pinned"></div></div>
                </div>
                </div>

Solution

To locate the WebElement and extract the text Pessoas Físicas, you can use the following line of code:

content = browser.find_element_by_xpath("//h3[contains(., 'Ações em Circulação no Mercado')]/following::div[1]//table[@class='responsive']/tbody/tr[1]/td[1]").get_attribute("innerHTML")

Update (no code change)

The XPath expression:

//h3[contains(., 'Ações em Circulação no Mercado')]/following::div[1]//table[@class='responsive']/tbody/tr[1]/td[1]

contains single quotes itself, so the Python string that holds it should not be wrapped in single quotes, e.g. 'xpath_here'. Put the expression within double quotes, e.g. "xpath_here".
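
The quoting advice above can be sketched in code. XPath 1.0 string literals have no escape character, so the text to match has to be wrapped in whichever quote type it does not contain. The helper xpath_literal below is hypothetical (it is not part of Selenium), and the contains() form of the heading test is an assumption made for this illustration.

```python
def xpath_literal(text: str) -> str:
    """Return text as an XPath 1.0 string literal (hypothetical helper)."""
    if "'" not in text:
        return f"'{text}'"
    if '"' not in text:
        return f'"{text}"'
    # Both quote types present: XPath 1.0 has no escape character,
    # so join the single-quote-free pieces with concat().
    parts = text.split("'")
    return "concat(" + ", \"'\", ".join(f"'{p}'" for p in parts) + ")"


heading = "Ações em Circulação no Mercado"
# The outer Python string uses double quotes because the XPath uses single quotes.
xpath = f"//h3[contains(., {xpath_literal(heading)})]/following::div[1]//table"
print(xpath)
```

Keeping the XPath literal in single quotes and the Python string in double quotes avoids backslash escapes in the common case.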

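
Since the question is after the table's data rather than a single cell, it may help to see how the markup could be turned into rows once the table has been located. The following is a minimal sketch using only the standard library. TableRows and parse_br_number are hypothetical names, the HTML is a shortened excerpt of the table from the question, and reading the markup from the element's "outerHTML" attribute is an assumption about how it would be obtained from a live driver.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside <tr>
        self._cell = None  # text chunks of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_br_number(text):
    """Convert '3.141.058.175' or '97,23' to a number; return None for '-'."""
    cleaned = text.strip().replace(".", "").replace(",", ".")
    if cleaned in ("", "-"):
        return None
    return float(cleaned) if "." in cleaned else int(cleaned)


# Shortened excerpt of the table from the question. With a live driver the
# string could come from table_element.get_attribute("outerHTML") instead.
html = """<table class="responsive">
<thead><tr><th colspan="3" class="text-center">19/04/2017</th></tr></thead>
<tbody>
<tr><td>Pessoas Físicas</td><td class="text-right">108.853</td><td class="text-right"> - </td></tr>
<tr><td>Total de Ações</td><td class="text-right">3.413.768.484</td><td class="text-right">52,11</td></tr>
</tbody></table>"""

parser = TableRows()
parser.feed(html)
for label, *values in parser.rows[1:]:  # skip the date header row
    print(label, [parse_br_number(v) for v in values])
```

The number conversion assumes the Brazilian format seen in the table, with "." as the thousands separator and "," as the decimal separator.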
