Python: UnicodeEncodeError: 'latin-1' codec can't encode character

Views: 602 | Published: 2020/10/29 5:58:19 | Tags: python unicode encode


Problem description

I call an API and, for each record it returns, I query a database. The API returns strings, and when I run the database query for the returned items, some of them raise the following error.

Traceback (most recent call last):
  File "TopLevelCategories.py", line 267, in <module>
    cursor.execute(categoryQuery, {'title': startCategory});
  File "/opt/ts/python/2.7/lib/python2.7/site-packages/MySQLdb/cursors.py", line 158, in execute
    query = query % db.literal(args)
  File "/opt/ts/python/2.7/lib/python2.7/site-packages/MySQLdb/connections.py", line 265, in literal
    return self.escape(o, self.encoders)
  File "/opt/ts/python/2.7/lib/python2.7/site-packages/MySQLdb/connections.py", line 203, in unicode_literal
    return db.literal(u.encode(unicode_literal.charset))
UnicodeEncodeError: 'latin-1' codec can't encode character u'\u2013' in position 3: ordinal not in range(256)

The segment of my code that the above error refers to is:

         ...    
         for startCategory in value[0]:
            categoryResults = []
            try:
                categoryRow = ""
                baseCategoryTree[startCategory] = []
                #print categoryQuery % {'title': startCategory}; 
                cursor.execute(categoryQuery, {'title': startCategory}) #unicode issue
                done = False
                cont...

After some Google searching, I tried the following on the command line to understand what is going on:

>>> import sys
>>> u'\u2013'.encode('iso-8859-1')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'latin-1' codec can't encode character u'\u2013' in position 0: ordinal not in range(256)
>>> u'\u2013'.encode('cp1252')
'\x96'
>>> '\u2013'.encode('cp1252')
'\\u2013'
>>> u'\u2013'.encode('cp1252')
'\x96'

But I am not sure what the solution to this issue is. I also don't know the theory behind encode('cp1252'); it would be great to get some explanation of what I tried above.

Recommended answer

If you need Latin-1 encoding, you have several options to get rid of the en dash or other code points above 255 (characters not included in Latin-1):

>>> u = u'hello\u2013world'
>>> u.encode('latin-1', 'replace')    # replace it with a question mark
'hello?world'
>>> u.encode('latin-1', 'ignore')     # ignore it
'helloworld'
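
As a rough sketch of how the error-handler argument behaves, the snippet below runs the same string through several of the standard handlers. It is written for Python 3 (an assumption; the session above is Python 2), where every `str` is already Unicode and `encode` returns `bytes`. The variable names are illustrative only.

```python
# Python 3 sketch: compare the standard error handlers when encoding to Latin-1.
text = 'hello\u2013world'  # contains an en dash, U+2013, which Latin-1 lacks

handlers = ['replace', 'ignore', 'xmlcharrefreplace', 'backslashreplace']

# Encode the same text once per handler and collect the resulting bytes.
encoded = {name: text.encode('latin-1', errors=name) for name in handlers}

for name, data in encoded.items():
    print(name, data)
```

The `xmlcharrefreplace` and `backslashreplace` handlers keep enough information to recover the original character later, which `replace` and `ignore` do not.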

Or do your own custom replacements:

>>> u.replace(u'\u2013', '-').encode('latin-1')
'hello-world'
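
When more than one character needs a custom replacement, a translation table may scale better than chained `replace` calls. The following is a minimal Python 3 sketch; the table contents and the helper name `to_latin1` are hypothetical choices made for illustration, not part of the original answer.

```python
# Python 3 sketch: map common typographic punctuation to ASCII fallbacks,
# then encode to Latin-1. Table and function name are illustrative.
PUNCTUATION_FALLBACKS = {
    0x2013: '-',    # en dash
    0x2014: '--',   # em dash
    0x2018: "'",    # left single quotation mark
    0x2019: "'",    # right single quotation mark
    0x201C: '"',    # left double quotation mark
    0x201D: '"',    # right double quotation mark
    0x2026: '...',  # horizontal ellipsis
}


def to_latin1(text):
    """Apply the fallback table, then encode; anything still unmappable becomes '?'."""
    return text.translate(PUNCTUATION_FALLBACKS).encode('latin-1', 'replace')


print(to_latin1('hello\u2013world'))
```

Characters that Latin-1 already supports, such as `é`, pass through unchanged; only code points outside the table and above 255 fall back to a question mark.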

If you aren't required to output Latin-1, then UTF-8 is a common and preferred choice. It is recommended by the W3C and encodes all Unicode code points:

>>> u.encode('utf-8')
'hello\xe2\x80\x93world'
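
To connect this with the cp1252 experiment in the question, here is a small Python 3 sketch (again an assumption, since the original session is Python 2) that encodes the same string with three codecs. Latin-1 maps code points 0 to 255 directly onto single bytes, so U+2013 has no representation. cp1252 is a Windows superset that reuses the byte range 0x80 to 0x9F for printable characters, including the en dash at 0x96. UTF-8 uses a multi-byte sequence and can represent any code point. The helper name `try_encode` is hypothetical.

```python
# Python 3 sketch: the same text under three codecs.
text = 'hello\u2013world'


def try_encode(s, codec):
    """Return the encoded bytes, or None if the codec cannot represent s."""
    try:
        return s.encode(codec)
    except UnicodeEncodeError:
        return None


results = {codec: try_encode(text, codec)
           for codec in ('latin-1', 'cp1252', 'utf-8')}

for codec, data in results.items():
    print(codec, data)

# Decoding has to use the codec that produced the bytes: byte 0x96 is an
# en dash in cp1252 but the control character U+0096 in Latin-1.
dash_as_cp1252 = b'\x96'.decode('cp1252')
dash_as_latin1 = b'\x96'.decode('latin-1')
```

Whichever encoding is chosen, the database connection and the table need to agree on it; otherwise the bytes are written successfully but read back as the wrong characters.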
