BeautifulSoup, save scrape results in text file


Problem description


I'm trying to scrape data from a table with BeautifulSoup and save this to a file. I wrote this:

import urllib2
from bs4 import BeautifulSoup

url = "http://dofollow.netsons.org/table1.htm"

page = urllib2.urlopen(url).read()
soup = BeautifulSoup(page)

for tr in soup.find_all('tr')[2:]:
    tds = tr.find_all('td')
    print "%s, %s, %s" % (tds[0].text, tds[1].text, tds[2].text)

This works.

I then tried to write the results to a file but it is not working. :(

logfile = open("log.txt", 'a')             
logfile.write("%s,%s,%s\n" % (tds[0].text, tds[1].text, tds[2].text))   
logfile.close()

How can I save my results in a text file?

Answer


BeautifulSoup gives you Unicode data, which you need to encode before writing it to a file.
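For example, you could also encode each Unicode string to bytes yourself and write in binary mode; a minimal sketch, where `row` is a hypothetical stand-in for one scraped table row:

```python
# Alternative: encode the Unicode string manually, then write the
# resulting bytes to a file opened in binary append mode.
row = u"caf\xe9, 10, 20"  # sample Unicode data standing in for tds[n].text
with open("log.txt", "ab") as logfile:
    logfile.write((u"%s\n" % row).encode("utf8"))
```

This is more error-prone than `io.open` (every `write` call must remember to encode), which is why the transparent-encoding approach below is preferable.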


It'll be easier if you use the io library, which lets you open a file object with transparent encoding:

import io

with io.open('log.txt', 'a', encoding='utf8') as logfile:
    for tr in soup.find_all('tr')[2:]:
        tds = tr.find_all('td')
        logfile.write(u"%s, %s, %s\n" % (tds[0].text, tds[1].text, tds[2].text))


The with statement takes care of closing the file object for you.


I used UTF8 as the codec, but you can pick any that can handle all codepoints used in the pages you are scraping.
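As an aside, if you are on Python 3 the built-in `open` already accepts an `encoding` argument (`io.open` is the same function there), so no extra import is needed. A sketch with hypothetical sample rows standing in for the scraped cells:

```python
# Python 3: built-in open() handles encoding transparently.
# "rows" stands in for the (tds[0].text, tds[1].text, tds[2].text) tuples.
rows = [("caf\xe9", "10", "20"), ("data", "30", "40")]
with open("log3.txt", "w", encoding="utf8") as logfile:
    for a, b, c in rows:
        logfile.write("%s, %s, %s\n" % (a, b, c))
```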
