Python ASCII and Unicode decode error


Problem description

I got a very frustrating error when inserting a certain string into my database. It said something like:

Python cannot decode byte characters, expecting unicode

After a lot of searching, I read that I could get past this error by encoding my string into Unicode. I tried to do this by decoding the string first and then encoding it as UTF-8, like this:

string = string.encode("utf8")

And I get the following error:

'ascii' codec can't decode byte 0xe3 in position 6: ordinal not in range(128)

This error has been driving me crazy. How do I fix it?

Solution

EDIT: As you can see from the downvotes, this is NOT THE BEST WAY TO DO IT. An excellent, highly recommended answer comes immediately after this one, so if you are looking for a good solution, please use that. This is a hackish solution that will come back to bite you later.

I feel your pain; I've had a lot of problems with the same error. The simplest way I solved it (which might not be the best way, and depends on your application) was to convert things to unicode and ignore errors. Here's an example from the Unicode HOWTO in the Python v2.7.3 documentation:

>>> unicode('\x80abc', errors='strict')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 0:
                    ordinal not in range(128)
>>> unicode('\x80abc', errors='replace')
u'\ufffdabc'
>>> unicode('\x80abc', errors='ignore')
u'abc'

While this might not be the most expedient method, this is a method that has worked for me.
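
For readers on Python 3, where the `unicode` built-in no longer exists, the same three error handlers are available on `bytes.decode`. The following is a minimal sketch rather than part of the HOWTO; it reuses the HOWTO's sample bytes, and the variable names are only illustrative:

```python
# Python 3 sketch of the three error handlers, using bytes.decode.
raw = b'\x80abc'  # same sample bytes as the HOWTO example

try:
    raw.decode('ascii', errors='strict')
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True  # 0x80 is outside the ASCII range (0-127)

replaced = raw.decode('ascii', errors='replace')  # bad byte becomes U+FFFD
ignored = raw.decode('ascii', errors='ignore')    # bad byte is silently dropped

print(strict_failed, repr(replaced), repr(ignored))
```

Note that both `replace` and `ignore` lose information: the original byte cannot be recovered from the resulting text.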

EDIT:

A couple of people in the comments have mentioned that this is a bad idea, even though the asker accepted the answer. It is NOT a great idea: it will screw things up if you are dealing with European and accented characters. However, it is something you can use if it is NOT production-level code, if it is a personal project you are working on, and you need a quick fix to get things rolling. You will eventually need to fix it with the right methods, which are mentioned in the answers below.
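
A commonly recommended approach is to decode incoming bytes with the encoding they were actually written in, work with text inside the program, and encode again only on output. The sketch below is in Python 3 syntax and assumes the data is UTF-8; the sample string and variable names are made up for illustration and do not come from the question:

```python
# Assumption: the incoming bytes are UTF-8. The sample data is made up.
raw = 'caf\u00e9'.encode('utf-8')   # stand-in for bytes read from a file or socket

text = raw.decode('utf-8')          # bytes -> text, using the real encoding
round_trip = text.encode('utf-8')   # text -> bytes, only at the output boundary

# Decoding the same bytes as ASCII fails, which is the kind of error in the question.
try:
    raw.decode('ascii')
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

print(repr(text), round_trip == raw, ascii_ok)
```

In Python 2, calling `.encode()` on a byte string first decodes it implicitly with the ASCII codec, which is why an encode call can raise a decode error. Decoding explicitly with the correct codec avoids that implicit step.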
