'ascii' codec can't decode byte 0xe9

Question

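I have done some research and seen several solutions, but none of them have worked for me.

Python - 'ascii' codec can't decode byte

That didn't work for me. I know that 0xe9 is the é character, but I still can't figure out how to get this working. Here is my code: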

output_lines = ['<menu>', '<day name="monday">', '<meal name="BREAKFAST">', '<counter name="Entreé">', '<dish>', '<name icon1="Vegan" icon2="Mindful Item">', 'Cream of Wheat (Farina)','</name>', '</dish>', '</counter >', '</meal >', '</day >', '</menu >']
output_string = '\n'.join([line.encode("utf-8") for line in output_lines])

This gives me the error: 'ascii' codec can't decode byte 0xe9

I have tried decoding, and I have tried replacing the "é", but I can't get either approach to work.

Solution

You are trying to encode bytestrings:

>>> '<counter name="Entreé">'.encode('utf8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 20: ordinal not in range(128)

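Python 2 is trying to be helpful here. You can only encode a Unicode string to bytes, so when you call encode on a bytestring, Python first implicitly decodes it using the default encoding (ASCII). That hidden decode is what fails, which is why an encode call raises a UnicodeDecodeError.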
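To make the hidden step visible, the sketch below performs the ASCII decode by hand. It assumes the literal was stored as UTF-8, as the 0xc3 byte in the traceback above suggests; the names `raw`, `error`, and `failed_at` are illustrative only. It is written to run on either Python 2 or Python 3.

```python
# The bytestring from the question, assuming the source was saved as UTF-8,
# so "é" is stored as the two bytes 0xc3 0xa9.
raw = b'<counter name="Entre\xc3\xa9">'

# On Python 2, calling .encode() on a bytestring first decodes it with the
# default encoding (normally ASCII). Doing that decode explicitly reproduces
# the same failure.
try:
    raw.decode('ascii')
    error = None
except UnicodeDecodeError as exc:
    error = exc

# error.start is the offset of the first byte that is not valid ASCII.
failed_at = error.start
```

The offset matches the "position 20" reported in the traceback, which points at the first byte of the encoded é.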

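The solution is to not encode data that is already encoded. If the data was encoded with a different codec than the one you need, first decode it using the codec it was actually written in, and only then encode it again.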
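As a rough illustration of the second case, suppose the data arrived as Latin-1 bytes, where é is the single byte 0xe9 (this is only an assumption about where the 0xe9 in the question's error might come from). The variable names are hypothetical.

```python
# Hypothetical input: "Entreé" stored as Latin-1, where "é" is the byte 0xe9.
latin1_bytes = b'Entre\xe9'

# Decode with the codec the data was actually written in...
text = latin1_bytes.decode('latin-1')

# ...then encode once, to the codec you need.
utf8_bytes = text.encode('utf-8')
```

After this, `utf8_bytes` holds the two-byte UTF-8 form of é, and no implicit ASCII decode is involved at any point.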

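If you have a mix of unicode and bytestring values, decode only the bytestrings or encode only the unicode values; try to avoid mixing the types. The following (Python 2) code decodes bytestrings to unicode first: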

def ensure_unicode(v):
    if isinstance(v, str):
        v = v.decode('utf8')
    return unicode(v)  # convert anything not a string to unicode too

output_string = u'\n'.join([ensure_unicode(line) for line in output_lines])
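The helper above relies on the Python 2 names `str` and `unicode`. A possible variant that should behave the same way on both Python 2 and Python 3 is sketched below; `ensure_text`, `text_type`, and `mixed_lines` are names made up for this example, and UTF-8 input is assumed.

```python
# -*- coding: utf-8 -*-
# unicode on Python 2, str on Python 3.
text_type = type(u'')


def ensure_text(value, encoding='utf-8'):
    """Return value as a text string, decoding bytestrings first."""
    if isinstance(value, bytes):  # bytes is an alias of str on Python 2
        return value.decode(encoding)
    # Convert anything that is not a bytestring to text too.
    return text_type(value)


# Illustrative mix of a UTF-8 bytestring, a text string, and a non-string.
mixed_lines = [b'<counter name="Entre\xc3\xa9">', u'<dish>', 42]

# Join as text, then encode exactly once when bytes are needed for output.
output_string = u'\n'.join(ensure_text(line) for line in mixed_lines)
output_bytes = output_string.encode('utf-8')
```

Keeping everything as text until the final encode is what avoids the implicit decode described in the answer.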
