Extracting data in table using BeautifulSoup

Views: 261 Published: 2016/8/5 19:07:41 Tags: python table beautifulsoup


Question

I'm scraping this page for my Android app. I'd like to extract the data from the table of cities and area codes.

Here is my code:

from bs4 import BeautifulSoup
import urllib2
import re

base_url = "http://www.howtocallabroad.com/taiwan/"
html_page = urllib2.urlopen(base_url)
soup = BeautifulSoup(html_page)
codes = soup.select("#codes tbody > tr > td")
for area_code in codes:
    # print td city and area code
    pass  # placeholder so the loop body is valid Python

I'd like to know which function in Python or BeautifulSoup gets the value from <td>value</td>.

Sorry, I'm just an Android dev learning to write Python.

Recommended answer

You can use findAll(), along with a function that breaks a list up into chunks:

>>> areatable = soup.find('table',{'id':'codes'})
>>> def chunks(l, n):
...     # split l into consecutive slices of length n
...     return [l[i:i+n] for i in range(0, len(l), n)]
...
>>> dict(chunks([i.text for i in areatable.findAll('td')], 2))
{u'Chunan': u'36', u'Penghu': u'69', u'Wufeng': u'4', u'Fengyuan': u'4', u'Kaohsiung': u'7', u'Changhua': u'47', u'Pingtung': u'8', u'Keelung': u'2', u'Hsinying': u'66', u'Chungli': u'34', u'Suao': u'39', u'Yuanlin': u'48', u'Yungching': u'48', u'Panchiao': u'2', u'Taipei': u'2', u'Tainan': u'62', u'Peikang': u'5', u'Taichung': u'4', u'Yungho': u'2', u'Hsinchu': u'35', u'Tsoying': u'7', u'Hualien': u'38', u'Lukang': u'47', u'Talin': u'5', u'Chiaochi': u'39', u'Fengshan': u'7', u'Sanchung': u'2', u'Tungkang': u'88', u'Taoyuan': u'33', u'Hukou': u'36'}

Explanation:

.find() finds the table whose id is codes. The chunks function splits a list into evenly sized chunks (see http://stackoverflow.com/questions/312443/how-do-you-split-a-list-into-evenly-sized-chunks-in-python).
Because findAll returns a list of td tags, we call chunks on the list of their text values to create something like:
[[u'Changhua', u'47'], [u'Keelung', u'2'], etc]
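
To see the chunking step on its own, here is a small sketch on a hand-written flat list. The sample values are copied from the output above; the names flat and pairs are only for illustration, and the code is written for Python 3, so the strings print without the u prefix.

```python
def chunks(l, n):
    # Slice the list into consecutive pieces of length n.
    return [l[i:i + n] for i in range(0, len(l), n)]

# Flat list in the order the td cells appear: city, code, city, code, ...
flat = ["Changhua", "47", "Keelung", "2", "Taipei", "2"]
pairs = chunks(flat, 2)
print(pairs)  # [['Changhua', '47'], ['Keelung', '2'], ['Taipei', '2']]
```

If the list length is not a multiple of n, the last chunk is simply shorter, which would make dict() fail on that final element.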

i.text for i in ... gets the text of each td tag; otherwise the <td> and </td> would remain.
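
The difference between a tag and its text can be checked on a single cell. This is only a sketch: the one-cell table below is a made-up stand-in, not the real page, and it assumes Python 3 with the built-in html.parser backend.

```python
from bs4 import BeautifulSoup

# Hypothetical one-cell table, used only to illustrate .text
markup = "<table><tr><td>Taipei</td></tr></table>"
cell = BeautifulSoup(markup, "html.parser").find("td")

print(str(cell))  # <td>Taipei</td>
print(cell.text)  # Taipei
```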
Finally, dict() is called to convert the list of lists into a dictionary, which you can use to look up a city's area code.
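
Putting the steps together, the following is a rough Python 3 sketch rather than the exact session above. The HTML string is a hypothetical stand-in for the real page (only the id codes comes from the question), and find_all is the current bs4 spelling of findAll.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the real page; only the id "codes" is from the question.
html = """
<table id="codes">
  <tr><td>Taipei</td><td>2</td></tr>
  <tr><td>Kaohsiung</td><td>7</td></tr>
  <tr><td>Tainan</td><td>62</td></tr>
</table>
"""

def chunks(l, n):
    # Slice the list into consecutive pieces of length n.
    return [l[i:i + n] for i in range(0, len(l), n)]

soup = BeautifulSoup(html, "html.parser")
areatable = soup.find("table", {"id": "codes"})

# Text of every cell, in document order: city, code, city, code, ...
cells = [td.text for td in areatable.find_all("td")]

# Each [city, code] pair becomes one key/value entry.
area_codes = dict(chunks(cells, 2))

print(area_codes["Tainan"])  # 62
```

The values stay strings, so convert with int() if you need numbers.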
