Extracting unordered list for a particular <div>: BeautifulSoup

Views: 202 Posted: 2016/8/5 19:22:27 Tags: python html beautifulsoup

Problem description

I'm scraping this webpage, which I need for my Android app. What I would like to do is extract the countries from the href attributes. This is the same as this one: http://stackoverflow.com/questions/16267768/beautiful-soup-extracting-href-from-html-ordered-list

Here is my code:

from bs4 import BeautifulSoup
import urllib2
import re

html_page = urllib2.urlopen("http://www.howtocallabroad.com/a.html")
soup = BeautifulSoup(html_page)
li = soup.select("ul > li > a")
for link in li:
    print link.get('href')

The problem I'm getting is that the result returns all a tags, including those from other divs:
afghanistan/
albania/
algeria/
american-samoa/
andorra/
angola/
anguilla/
antigua/
argentina/
armenia/
aruba/
ascension/
australia/
austria/
azerbaijan/
codes.html  # not needed
nanp.html   # not needed
qa/         # not needed
forums/     # not needed

I'd like to know which function(s) I need to accomplish this. I want to filter hrefs in <div id="content"> only. The docs don't have much info.

Sorry, this is the first time I've written Python.

Recommended answer

Use findAll():
>>> for i in soup.find('div',{'id':'content'}).findAll('a'):
...     print i['href']
... 
afghanistan/
albania/
algeria/
american-samoa/
andorra/
angola/
anguilla/
antigua/
argentina/
armenia/
aruba/
ascension/
australia/
austria/
azerbaijan/

soup.find('div',{'id':'content'}) does what it says. It finds the div tag which has an id of content (<div id="content"> would be matched).
.findAll() ... finds all! 'a' is passed as the argument to find all the a tags. It returns a list of those a tags.
Then I simply print each a tag's href.
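
For readers on Python 3, the same idea can be sketched against a small inline document instead of the live page. This is only an illustration under stated assumptions: SAMPLE_HTML is a made-up stand-in for the page's markup (the "footer" div is hypothetical), and the helper names content_hrefs and content_hrefs_css are not from the original answer. find_all is the current spelling of findAll in bs4, and select accepts a CSS selector that scopes the match to the content div.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page's markup; the real page is not reproduced here.
SAMPLE_HTML = """
<div id="content">
  <ul>
    <li><a href="afghanistan/">Afghanistan</a></li>
    <li><a href="albania/">Albania</a></li>
  </ul>
</div>
<div id="footer">
  <ul>
    <li><a href="codes.html">Codes</a></li>
    <li><a href="forums/">Forums</a></li>
  </ul>
</div>
"""


def content_hrefs(html):
    """Return hrefs of <a> tags inside <div id="content"> using find + find_all."""
    soup = BeautifulSoup(html, "html.parser")
    content = soup.find("div", {"id": "content"})
    if content is None:  # find() returns None when no tag matches
        return []
    # href=True skips any <a> tag that has no href attribute
    return [a["href"] for a in content.find_all("a", href=True)]


def content_hrefs_css(html):
    """Same result with a CSS selector scoped to the content div."""
    soup = BeautifulSoup(html, "html.parser")
    return [a.get("href") for a in soup.select("div#content ul > li > a")]


print(content_hrefs(SAMPLE_HTML))
print(content_hrefs_css(SAMPLE_HTML))
```

The selector variant stays closest to the code in the question: prefixing "ul > li > a" with "div#content" is what keeps links from other divs out of the result.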
