BeautifulSoup: Get the contents of a specific table


Question

My local airport (http://www.iaa.gov.il/Rashat/he-IL/Airports/BenGurion/informationForTravelers/OnlineFlights.aspx?flightsType=arr) disgracefully blocks users without IE, and looks awful. I want to write a Python script that gets the contents of the Arrivals and Departures pages every few minutes and shows them in a more readable manner.

My tools of choice are mechanize, to trick the site into believing I use IE, and BeautifulSoup, to parse the page and get the flight data table.

Quite honestly, I got lost in the BeautifulSoup documentation. I can't work out how to get the table (whose title I know) out of the entire document, or how to get a list of rows from that table.

Any ideas?

Adam

Recommended answer

This is not the specific code you need, just a demo of how to work with BeautifulSoup. It finds the table whose id is "Table1" and gets all of its tr elements.

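import urllib2
from BeautifulSoup import BeautifulSoup  # BeautifulSoup 3, Python 2
html = urllib2.urlopen(url).read()  # url is the address of the page to fetch
bs = BeautifulSoup(html)
# Find the table whose id is "Table1", then collect all of its tr elements.
table = bs.find(lambda tag: tag.name=='table' and tag.has_key('id') and tag['id']=="Table1")
rows = table.findAll(lambda tag: tag.name=='tr')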
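For readers on Python 3, the same idea can be sketched with the bs4 package (BeautifulSoup 4), where find accepts the id as a keyword argument and find_all replaces findAll. This is only a rough sketch under some assumptions: the helper names build_request and table_rows, the User-Agent string, and the sample HTML are all made up for illustration, and the real airport page may be structured differently.

```python
from urllib.request import Request
from bs4 import BeautifulSoup

# Hypothetical IE-style User-Agent string, used only for illustration.
IE_USER_AGENT = "Mozilla/5.0 (Windows NT 10.0; Trident/7.0; rv:11.0) like Gecko"


def build_request(url):
    """Return a Request that identifies itself with an IE-style User-Agent."""
    return Request(url, headers={"User-Agent": IE_USER_AGENT})


def table_rows(html, table_id):
    """Return the cell texts of every row in the table with the given id."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id=table_id)
    if table is None:
        return []  # no table with that id on the page
    return [
        [cell.get_text(strip=True) for cell in row.find_all(["th", "td"])]
        for row in table.find_all("tr")
    ]


# Made-up sample page standing in for the airport's HTML.
SAMPLE_HTML = """
<table id="Other"><tr><td>ignore me</td></tr></table>
<table id="Table1">
  <tr><th>Flight</th><th>From</th><th>Status</th></tr>
  <tr><td>LY 316</td><td>London</td><td>Landed</td></tr>
  <tr><td>BA 165</td><td>London</td><td>Delayed</td></tr>
</table>
"""

rows = table_rows(SAMPLE_HTML, "Table1")
for row in rows:
    print(" | ".join(row))
```

To use it against a live page, the object returned by build_request could be passed to urllib.request.urlopen and the response body handed to table_rows. Whether a changed User-Agent is enough to get past the site's browser check is not something this sketch can confirm.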
