Using Python and BeautifulSoup to Parse a Table
Problem description
I am trying to access the content of certain td tags with Python and BeautifulSoup. I can get either the first td tag meeting the criteria (with find) or all of them (with findAll).
Now, I could just use findAll, get them all, and pull the content I want out of them, but that seems inefficient (even if I put limits on the search). Is there any way to go to a certain td tag meeting the criteria I want? Say the third, or the 10th?
Here's my code so far:
from __future__ import division
from __future__ import unicode_literals
from __future__ import print_function
from mechanize import Browser
from BeautifulSoup import BeautifulSoup

br = Browser()
url = "http://finance.yahoo.com/q/ks?s=goog+Key+Statistics"
page = br.open(url)
html = page.read()
soup = BeautifulSoup(html)
td = soup.findAll("td", {'class': 'yfnc_tablehead1'})

for x in range(len(td)):
    var1 = td[x]
    var2 = var1.contents[0]
    print(var2)
Solution
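find and findAll are very flexible. The BeautifulSoup.findAll docs (http://www.crummy.com/software/BeautifulSoup/documentation.html#The%20basic%20find%20method%3a%20findAll%28name,%20attrs,%20recursive,%20text,%20limit,%20%2a%2akwargs%29) say:

5. You can pass in a callable object which takes a Tag object as its only argument, and returns a boolean. Every Tag object that findAll encounters will be passed into this object, and if the call returns True then the tag is considered to match.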
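As a rough illustration of the callable approach, the sketch below passes a predicate to the search and takes the n-th match. It is not the answerer's own code and rests on several assumptions: it uses the newer bs4 package (where findAll is spelled find_all) rather than the BeautifulSoup 3 import from the question, it parses a small made-up HTML snippet instead of the live Yahoo page, and the helper names is_header_cell and nth_matching are hypothetical.

```python
from bs4 import BeautifulSoup  # assumption: BeautifulSoup 4, not the BS3 import used in the question

# Made-up stand-in for the downloaded page, so the sketch runs offline.
HTML = """
<table>
  <tr><td class="yfnc_tablehead1">Market Cap</td><td class="yfnc_tabledata1">1</td></tr>
  <tr><td class="yfnc_tablehead1">Enterprise Value</td><td class="yfnc_tabledata1">2</td></tr>
  <tr><td class="yfnc_tablehead1">Trailing P/E</td><td class="yfnc_tabledata1">3</td></tr>
  <tr><td class="yfnc_tablehead1">Forward P/E</td><td class="yfnc_tabledata1">4</td></tr>
</table>
"""


def is_header_cell(tag):
    """Callable matcher: takes a Tag and returns True for the td cells we want."""
    return tag.name == "td" and "yfnc_tablehead1" in tag.get("class", [])


def nth_matching(soup, predicate, n):
    """Return the n-th (1-based) tag accepted by predicate, or None if there are fewer."""
    # limit=n stops the search once n matches have been collected.
    matches = soup.find_all(predicate, limit=n)
    return matches[n - 1] if len(matches) >= n else None


soup = BeautifulSoup(HTML, "html.parser")
third = nth_matching(soup, is_header_cell, 3)
print(third.get_text(strip=True))  # prints "Trailing P/E"
```

Because the search is capped with limit, nothing past the n-th match is collected, and asking for a position that does not exist (say the 10th cell in this snippet) returns None instead of raising an IndexError.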