Beautiful Soup, prettified html to txt, get encoding error

Views: 164 | Published: 2016/8/5 19:22:30 | Tags: python-2.7 encoding utf-8 beautifulsoup

Problem description

I'm trying to save a prettified print of a html file, to a txt file, but get this error message:

Traceback (most recent call last):
  File "prettyhtmlfiles.py", line 16, in <module>
    file.write(soup.prettify())
UnicodeEncodeError: 'ascii' codec can't encode character u'\xbb' in position 8532: ordinal not in range(128)

How can I get around this problem?

The code I have:

import urllib2
import os
from bs4 import BeautifulSoup
import csv

url = "/home/sveisa/S141test/ayuki.html"
with open(url, 'r') as f:
    data = f.read()
    soup = BeautifulSoup(open('/home/sveisa/S141test/ayuki.html').read())

print(soup.prettify())


file = open("newfile.txt", "w")

file.write(soup.prettify())

Solution

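Try this; it should work. In Python 2, soup.prettify() returns a unicode string, and file.write() implicitly encodes it with the ASCII codec, which fails on u'\xbb'. Encoding to UTF-8 explicitly before writing avoids that: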

print >> file, (soup.prettify().encode('utf-8'))
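
An alternative that is not part of the original answer is to open the output file with an explicit encoding, so that the unicode text is encoded on write. The following is a minimal sketch, assuming io.open (available in Python 2.6+ and Python 3). The helper name save_text, the variable names, and the sample string standing in for soup.prettify() are all hypothetical and used for illustration only.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile


def save_text(path, text):
    """Write a unicode string to path as UTF-8 (hypothetical helper)."""
    # io.open accepts an encoding on both Python 2 and Python 3, so the
    # text is encoded explicitly rather than through the ASCII default.
    with io.open(path, "w", encoding="utf-8") as out:
        out.write(text)


# Stand-in for soup.prettify(): a unicode string that contains u'\xbb',
# the character named in the traceback.
pretty = u"<p>\n caf\xe9 \xbb\n</p>\n"

# This is the encode step that Python 2 performs implicitly in
# file.write(); done explicitly here, it fails in the same way.
try:
    pretty.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Write to a temporary directory (assumption: any writable path works).
out_path = os.path.join(tempfile.mkdtemp(), "newfile.txt")
save_text(out_path, pretty)

# Read the file back as UTF-8 to confirm the text survived the round trip.
with io.open(out_path, "r", encoding="utf-8") as check:
    round_trip = check.read()
```

In the question's script, the equivalent call would be save_text("newfile.txt", soup.prettify()), under the assumption that prettify() is called without an encoding argument and therefore returns unicode text.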
