Python BeautifulSoup, iterating through tags and attributes

Views: 35 Published: 2021/9/4 19:19:51 python html selenium beautifulsoup tags

Problem description

I would like to iterate through all the tags in a certain section of an HTML page. I am using BeautifulSoup, but I could live without it and use only the Selenium library. Let's say I have the following HTML code:

<table id="myBSTable">   
    <tr>
        <th>Column A1</th>
        <th>Column B1</th>
        <th>Column C1</th>
        <th>Column D1</th>
        <th>Column E1</th>
    </tr>
    <tr>
        <td data="First Column Data"></td>
        <td data="Second Column Data"></td>
        <td title="Title of the First Row">Value of Row 1</td>
        <td>Beautiful 1</td>
        <td>Soup 1</td>
    </tr>
    <tr>
        <td></td>
        <td data-g="Second Column Data"></td>
        <td title="Title of the Second Row">Value of Row 2</td>
        <td>Selenium 1</td>
        <td>Rocks 1</td>
    </tr>
    <tr>
        <td></td>
        <td></td>
        <td title="Title of the Third Row">Value of Row 3</td>
        <td>Pyhon 1</td>
        <td>Boulder 1</td>
    </tr>
    <tr>
        <th>Column A2</th>
        <th>Column B2</th>
        <th>Column C2</th>
        <th>Column D2</th>
        <th>Column E2</th>
    </tr>
    <tr>
        <td data="First Column Data"></td>
        <td data="Second Column Data"></td>
        <td title="Title of the First Row">Value of Row 1</td>
        <td>Beautiful 2</td>
        <td>Soup 2</td>
    </tr>
    <tr>
        <td></td>
        <td data-g="Second Column Data"></td>
        <td title="Title of the Second Row">Value of Row 2</td>
        <td>Selenium 2</td>
        <td>Rocks 2</td>
    </tr>
    <tr>
        <td></td>
        <td></td>
        <td title="Title of the Third Row">Value of Row 3 2</td>
        <td>Pyhon 2</td>
        <td>Boulder 2</td>
    </tr>
</table>

I have this part working perfectly:

#Selenium libraries
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.common.exceptions import NoSuchElementException

#BeautifulSoup
from bs4 import BeautifulSoup

browser = webdriver.Firefox()
browser.get('http://urltoget.com')

table = browser.find_element(By.ID, 'myBSTable')
bs_table = BeautifulSoup(table.get_attribute('innerHTML'), 'lxml')
#So far so good
rows = bs_table.find_all('tr')
for tr in rows:
    #Here is where I need help
    #I want to iterate through all tags
    #but I don't know if it is going to be a th or a td
    #At the same time I need to do something
    #depending on whether it is a td or a th
    pass

This is what I want to accomplish:

    #The following is a pseudo code
    for col in tr.tags:
        print col.name, col.value
        for attribute in col.attrs:
            print "    ", attribute.name, attribute.value
    #End pseudo code

Thanks.
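
One possible way to turn the pseudo code above into working BeautifulSoup code is sketched below. It is not an answer taken from the original page. It assumes a shortened copy of the table held in a string (`HTML`, a stand-in for what `get_attribute('innerHTML')` would return), the standard-library `html.parser` backend instead of `lxml`, and a hypothetical helper name `describe_cells`. It relies on `find_all` accepting a list of tag names, on `Tag.name` for the tag type, and on `Tag.attrs`, which is a dictionary of attribute names to values.

```python
from bs4 import BeautifulSoup

# Shortened stand-in for the table in the question (assumption: in the real
# script this markup would come from Selenium rather than a literal string).
HTML = """
<table id="myBSTable">
    <tr>
        <th>Column A1</th>
        <th>Column B1</th>
    </tr>
    <tr>
        <td data="First Column Data"></td>
        <td title="Title of the First Row">Value of Row 1</td>
    </tr>
</table>
"""


def describe_cells(html):
    """Return one (tag name, text, attributes) tuple per th/td cell, in document order."""
    soup = BeautifulSoup(html, "html.parser")
    cells = []
    for tr in soup.find_all("tr"):
        # A list of names matches either tag; recursive=False keeps only
        # the direct children of the current row.
        for col in tr.find_all(["th", "td"], recursive=False):
            cells.append((col.name, col.get_text(strip=True), dict(col.attrs)))
    return cells


for name, text, attrs in describe_cells(HTML):
    print(name, repr(text))
    for attr_name, attr_value in attrs.items():
        print("    ", attr_name, attr_value)
```

Because `col.name` is the plain string `'th'` or `'td'`, an ordinary `if col.name == 'th':` is enough to treat header and data cells differently. Passing `True` instead of the list of names would match every direct child tag regardless of its name. Note that BeautifulSoup returns multi-valued attributes such as `class` as lists rather than strings.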
