Python BeautifulSoup Print Info in CSV


Question

I can print the information I am pulling from a site with no problem. But when I try to write it to a CSV file, with the street names in one column and the zip codes in another, I run into problems. All I get in the CSV is the two column names, followed by everything in its own column across the page. Here is my code. I am using Python 2.7.5 and Beautiful Soup 4.

from bs4 import BeautifulSoup
import csv
import urllib2

url="http://www.conakat.com/states/ohio/cities/defiance/road_maps/"

page=urllib2.urlopen(url)

soup = BeautifulSoup(page.read())

f = csv.writer(open("Defiance Steets1.csv", "w"))
f.writerow(["Name", "ZipCodes"]) # Write column headers as the first line

links = soup.find_all(['i','a'])

for link in links:
    names = link.contents[0]
    print unicode(names)

f.writerow(names)   

Recommended answer

The data you retrieve from the URL contains more a elements than i elements. You must filter the a elements and then build pairs using Python's built-in zip function.

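# Keep only the anchors that point at a street map page
links = [link for link in soup.find_all('a')
         if link.get("href", "").startswith("http://www.conakat.com/map/?p=")]
zips = soup.find_all('i')

# Pair each street name with its zip code and write one row per pair
for l, z in zip(links, zips):
    f.writerow((l.contents[0], z.contents[0]))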

Output:

Name,ZipCodes
1ST ST,(43512)
E 1ST ST,(43512)
W 1ST ST,(43512)
2ND ST,(43512)
E 2ND ST,(43512)
W 2ND ST,(43512)
3 RIVERS CT,(43512)
3RD ST,(43512)
E 3RD ST,(43512)
...
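
The filter-then-zip idea can be tried without touching the network. The following is only a minimal sketch, assuming Python 3 (the question uses Python 2.7.5): the (href, text) tuples and the `?p=1` / `?p=2` URLs are hypothetical stand-ins for the parsed a and i tags, and the CSV is written to an in-memory buffer rather than a file. The last row also shows what happens when writerow() receives a bare string, which is likely what produced the "everything in its own column" symptom in the question.

```python
import csv
import io

# Hypothetical stand-ins for the parsed tags: (href, text) pairs for the
# <a> elements and plain strings for the <i> elements.
anchors = [
    ("http://www.conakat.com/", "Home"),  # navigation link, not a street
    ("http://www.conakat.com/map/?p=1", "1ST ST"),
    ("http://www.conakat.com/map/?p=2", "E 1ST ST"),
]
zip_codes = ["(43512)", "(43512)"]

MAP_PREFIX = "http://www.conakat.com/map/?p="

# Keep only the anchors whose href points at a street map page.
streets = [text for href, text in anchors if href.startswith(MAP_PREFIX)]

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["Name", "ZipCodes"])

# writerow() expects one sequence per row; each item becomes one column.
for name, code in zip(streets, zip_codes):
    writer.writerow((name, code))

# Passing a bare string instead makes every character its own column.
writer.writerow("1ST")

csv_text = buffer.getvalue()
print(csv_text)
```

Note that zip stops at the shorter of its two inputs, so any unfiltered extra a elements would silently shift the pairing; that is why the filter comes first.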
