Python: downloading data in table form
Problem description
This code downloads data from a table and prints it to the command prompt. Can the same data be downloaded in the same table structure, that is, in rows and columns? This is what I have tried.
Code:
import urllib.request
from bs4 import BeautifulSoup as bs

urls = [
    "http://physics.iitd.ac.in/content/list-faculty-members",
    "http://www.iitkgp.ac.in/commdir3/list.php?division=3&deptcode=ME",
    "http://www.iitkgp.ac.in/commdir3/list.php?division=3&deptcode=CE",
]

for url in urls:
    htmltext = urllib.request.urlopen(url).read()
    soup = bs(htmltext, "html.parser")
    # find() returns a single tag (or None); find_all() returns a list of tags,
    # and that list has no find_all() method of its own.
    table = soup.find('table', attrs={'border': '0', 'width': '100%', 'cellpadding': '10'})
    head = soup.find('h2', attrs={'class': 'title style3'})
    q = []
    if head is not None:
        # Page heading, if the page has one in this style
        q.append(head.get_text(strip=True))
    rows = table.find_all('tr') if table is not None else []
    for row in rows:
        # One list of cell texts per table row
        q.append([td.get_text(strip=True) for td in row.find_all('td')])
    print(q)
Recommended answer
I think many people would recommend the pandas library when working with tabular data. For well-structured HTML, you can simply call pandas read_html on the page URL.
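import pandas as pd

# read_html returns a list of DataFrames, one per table found on the page
tables = pd.read_html("http://physics.iitd.ac.in/content/list-faculty-members")
dataframe = tables[0]  # the first table on the page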
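Where pandas or its HTML parser backends are not installed, roughly the same row-and-column result can be sketched with the standard library alone. The example below is only an illustration and not part of the original answer: it swaps read_html for html.parser.HTMLParser, parses a made-up inline table (SAMPLE_HTML) instead of a live page, and returns the rows as CSV text. The names TableParser and table_to_csv are hypothetical, and nested tables are not handled.

```python
import csv
import io
from html.parser import HTMLParser

# Made-up sample markup standing in for a downloaded page (assumption).
SAMPLE_HTML = """
<table>
  <tr><th>Name</th><th>Department</th></tr>
  <tr><td>A. Example</td><td>Physics</td></tr>
  <tr><td>B. Sample</td><td>Mechanical Engineering</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # row currently being filled, or None
        self._cell = None  # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (for example whitespace between tags) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def table_to_csv(html_text):
    """Return the table rows found in html_text as CSV text."""
    parser = TableParser()
    parser.feed(html_text)
    parser.close()
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(parser.rows)
    return buffer.getvalue()


csv_text = table_to_csv(SAMPLE_HTML)
print(csv_text)
```

For a real page, the decoded response body would be passed to table_to_csv in place of SAMPLE_HTML. With pandas available, calling dataframe.to_csv("faculty.csv", index=False) on the DataFrame from the answer writes the same kind of file in one step; the file name here is only an example.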