UnicodeEncodeError when writing to a file
Problem description
I am trying to write some strings to a file (the strings were given to me by the HTML parser BeautifulSoup).
I can use "print" to display them, but when I use file.write() I get the following error:
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa3' in position 6: ordinal not in range(128)
How can I resolve this?
Recommended answer
This error occurs when you pass a Unicode string containing non-ASCII characters (code points 128 and above) to something that expects an ASCII bytestring. In Python 2, the default encoding for a bytestring is ASCII, "which handles exactly 128 (English) characters". This is why trying to convert Unicode characters beyond 127 produces the error.
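The failure described above can be reproduced in isolation. This is a minimal sketch, assuming a made-up sample string (`text` is not from the original question) with the pound sign U+00A3 at index 6; the explicit `.encode('ascii')` call stands in for the implicit conversion that file.write() performs in Python 2.

```python
# Hypothetical sample string: U+00A3 (pound sign) sits at index 6.
text = u'Price \xa3100'

try:
    # The ASCII codec only covers code points 0..127, so this raises.
    text.encode('ascii')
    failed = False
    bad_position = None
except UnicodeEncodeError as exc:
    failed = True
    # exc.start is the index of the first character that could not be encoded.
    bad_position = exc.start

print(failed, bad_position)
```

The position reported by the exception is the index of the offending character in the string, which is usually the quickest way to find out which character is causing the problem.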
The unicode() constructor has the signature unicode(string[, encoding, errors]). All of its arguments should be 8-bit strings.
The first argument is converted to Unicode using the specified encoding; if you leave off the encoding argument, the ASCII encoding is used for the conversion, so characters greater than 127 will be treated as errors.
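The decoding behaviour described above can be sketched as follows. Since unicode() exists only in Python 2, this sketch uses `bytes.decode()` as a stand-in so that it runs on either version; in Python 2, `unicode(raw, 'latin-1')` should give the same result as `raw.decode('latin-1')`. The byte string `raw` is an assumed example.

```python
# Assumed example input: 'La Peña' encoded as Latin-1 (0xF1 is n with tilde).
raw = b'La Pe\xf1a'

# Decoding with the correct codec succeeds and yields a Unicode string.
decoded = raw.decode('latin-1')

try:
    # Decoding as ASCII fails because byte 0xF1 is greater than 127.
    raw.decode('ascii')
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

print(repr(decoded), ascii_failed)
```

Note that decoding (bytes to Unicode) raises UnicodeDecodeError, while the error in the question is a UnicodeEncodeError, which comes from going in the other direction.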
For example:
s = u'La Pe\xf1a'
print s.encode('latin-1')
or
write(s.encode('latin-1'))
will encode using latin-1.
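An alternative to calling .encode() before every write is to open the file with an explicit encoding and let the file object do the conversion. The sketch below assumes `io.open`, which is available in Python 2.6+ and Python 3; the helper name `write_text` and the temporary output path are hypothetical and only serve the illustration.

```python
import io
import os
import tempfile


def write_text(path, text, encoding='utf-8'):
    """Hypothetical helper: write a Unicode string to path in the given encoding."""
    with io.open(path, 'w', encoding=encoding) as handle:
        handle.write(text)


s = u'La Pe\xf1a'
path = os.path.join(tempfile.mkdtemp(), 'out.txt')
write_text(path, s, encoding='latin-1')

# Read the file back as raw bytes to see what was actually stored.
with io.open(path, 'rb') as handle:
    raw = handle.read()

print(repr(raw))
```

Which encoding to pass depends on what will read the file afterwards; latin-1 matches the example above, while utf-8 can represent any Unicode character.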