Extracting unordered list for a particular <div>: BeautifulSoup
Problem description
I'm scraping this webpage needed for my Android app. What I would like to do is extract the countries from the href attribute. This is the same as this question: http://stackoverflow.com/questions/16267768/beautiful-soup-extracting-href-from-html-ordered-list
Here is my code:
from bs4 import BeautifulSoup
import urllib2

html_page = urllib2.urlopen("http://www.howtocallabroad.com/a.html")
soup = BeautifulSoup(html_page)
li = soup.select("ul > li > a")
for link in li:
    print link.get('href')
The problem I'm getting is that the result returns all a tags, including those from other divs:
afghanistan/
albania/
algeria/
american-samoa/
andorra/
angola/
anguilla/
antigua/
argentina/
armenia/
aruba/
ascension/
australia/
austria/
azerbaijan/
codes.html # not needed
nanp.html # not needed
qa/ # not needed
forums/ # not needed
I'd like to know what function(s) are needed to accomplish this. I want to filter the hrefs in <div id="content"> only. The docs don't have much info.
Sorry, this is the first time I've written Python.
Recommended answer
Use findAll():
>>> for i in soup.find('div', {'id': 'content'}).findAll('a'):
...     print i['href']
...
afghanistan/
albania/
algeria/
american-samoa/
andorra/
angola/
anguilla/
antigua/
argentina/
armenia/
aruba/
ascension/
australia/
austria/
azerbaijan/
soup.find('div', {'id': 'content'}) does what it says: it finds the div tag which has an id of content (<div id="content"> would be matched).
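As an aside (this alternative is not part of the original answer), the same scoping can be done with the select() call the question already uses, by anchoring the CSS selector to the container's id. A minimal sketch:

# restrict the CSS selector to links inside <div id="content">
for link in soup.select("div#content ul > li > a"):
    print link.get('href')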
.findAll() ... finds all! 'a' is used as a parameter to find all the a tags. It returns a list of each a tag.
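For illustration (this usage is my addition, not part of the original answer), the returned list can be counted and indexed like any other list:

links = soup.find('div', {'id': 'content'}).findAll('a')
print len(links)        # number of links found inside the content div
print links[0]['href']  # the first href, e.g. afghanistan/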
Then I simply print each a tag's href.