python unicode: How can I judge if a string needs to be decoded into utf-8?


Question

I have a function that accepts requests from the network. Most of the time the string passed in is not unicode, but sometimes it is.

I have code that converts everything to unicode, but it reports this error:

message.create(username, unicode(body, "utf-8"), self.get_room_name(),\
TypeError: decoding Unicode is not supported

I think the reason is that the 'body' parameter is already unicode, so unicode() raises an exception.

Is there any way to avoid this exception, e.g. by checking the type before the conversion?

Recommended answer

  1. You do not decode to UTF-8; you encode to UTF-8 or decode from it.
  2. You can safely decode from UTF-8 even if the data is plain ASCII, because ASCII is a subset of UTF-8.
  3. The easiest way to check whether the data needs decoding is an isinstance test:

if not isinstance(data, unicode):
    # It's not Unicode!
    data = data.decode('UTF8')
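
The snippet above relies on the Python 2 built-in `unicode`, which no longer exists in Python 3 (there, text is `str` and raw data is `bytes`). If the same check has to work on both versions, it could be wrapped in a small helper along these lines. This is only a sketch: the names `text_type` and `to_text` are made up for illustration and are not part of the original answer, and it assumes the incoming bytes really are UTF-8.

```python
import sys

# On Python 2 the text type is `unicode`; on Python 3 it is `str`.
if sys.version_info[0] >= 3:
    text_type = str
else:
    text_type = unicode  # this branch only runs on Python 2


def to_text(data, encoding="utf-8"):
    """Return `data` as text, decoding only when it is not text already.

    Hypothetical helper: assumes any non-text input is a byte string
    in the given encoding (UTF-8 by default).
    """
    if isinstance(data, text_type):
        # Already text, so calling decode/unicode() again would be wrong.
        return data
    return data.decode(encoding)


# Pure ASCII bytes decode cleanly, since ASCII is a subset of UTF-8.
print(repr(to_text(b"hello")))
# Text passes through unchanged instead of raising an exception.
print(repr(to_text(u"hello")))
# Multi-byte UTF-8 sequences are decoded into a single character.
print(repr(to_text(b"caf\xc3\xa9")))
```

In the question's code, the call would then become something like `to_text(body)` in place of `unicode(body, "utf-8")`.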
