UnicodeEncodeError when writing to a file

Question

I am trying to write some strings to a file (the strings have been given to me by the HTML parser BeautifulSoup).

I can use "print" to display them, but when I use file.write() I get the following error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xa3' in position 6: ordinal not in range(128)

Has anyone seen this before?

Recommended answer

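This error occurs when you pass a Unicode string containing non-ASCII characters (code points above 127) to something that expects an ASCII bytestring. In Python 2, the default encoding used for implicit conversion between Unicode strings and bytestrings is ASCII, which covers exactly 128 characters (code points 0 to 127). That is why trying to convert a character such as u'\xa3' produces the error. print works because it encodes the string with the terminal's encoding, whereas file.write() on a plain file falls back to ASCII.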
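
The failure can be reproduced without a file. As a minimal sketch, encoding a Unicode string with the ASCII codec raises the same exception as soon as it meets a code point above 127. The sample string and variable names below are illustrative and do not come from the original question.

```python
# Illustrative sample string; u'\xa3' is the pound sign named in the error message.
text = u'Price \xa3100'

try:
    text.encode('ascii')      # ASCII covers only code points 0..127
    failed = False
except UnicodeEncodeError as exc:
    failed = True
    # Slice out the character that the codec could not encode.
    bad_char = exc.object[exc.start:exc.end]

# An explicit encoding that covers the character succeeds.
encoded = text.encode('utf-8')
```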

The unicode() constructor has the signature unicode(string[, encoding, errors]). All of its arguments should be 8-bit strings.

The first argument is converted to Unicode using the specified encoding. If you leave off the encoding argument, the ASCII encoding is used for the conversion, so byte values greater than 127 are treated as errors.
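
As a rough sketch of that conversion, the snippet below decodes a Latin-1 bytestring with and without a suitable encoding. It uses bytes.decode(), which performs the same conversion as unicode(raw, encoding) in Python 2 and also runs on Python 3. The sample bytes are an assumption chosen for illustration.

```python
raw = b'La Pe\xf1a'          # Latin-1 bytes; 0xf1 is the n with tilde

# Decoding with the encoding the bytes were written in succeeds.
decoded = raw.decode('latin-1')

# Decoding as ASCII fails because 0xf1 is greater than 127.
try:
    raw.decode('ascii')
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
```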

For example:

s = u'La Pe\xf1a'
print s.encode('latin-1')

# f is assumed to be a file object already opened for writing
f.write(s.encode('latin-1'))

This encodes the string using Latin-1 before it is printed or written.
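
For the file-writing case in the question, one possible approach (a sketch, not the original answerer's code) is to open the file with an explicit encoding so that write() accepts Unicode strings directly. io.open is available in Python 2.6+ and Python 3. The temporary path, the sample string, and the choice of UTF-8 are assumptions for illustration.

```python
import io
import os
import shutil
import tempfile

text = u'Price \xa3100'                      # illustrative sample string
tmp_dir = tempfile.mkdtemp()                 # hypothetical scratch location
path = os.path.join(tmp_dir, 'out.txt')

try:
    # The file object encodes each Unicode string as UTF-8 on write.
    with io.open(path, 'w', encoding='utf-8') as f:
        f.write(text)

    # Read the raw bytes back to see what was actually stored.
    with io.open(path, 'rb') as f:
        raw = f.read()
finally:
    shutil.rmtree(tmp_dir)                   # clean up the scratch directory
```

With this approach there is no need to call encode() on each string before writing it.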
