'utf-8' codec can't decode byte 0x80

Question

I'm trying to download a BVLC-trained model, but I keep hitting this error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 110: invalid start byte

I think the cause is the following function (complete code):

  # Closure-d function for checking SHA1.
  def model_checks_out(filename=model_filename, sha1=frontmatter['sha1']):
      with open(filename, 'r') as f:
          return hashlib.sha1(f.read()).hexdigest() == sha1

Any idea how to fix this?

Recommended answer

You are opening a file that is not UTF-8 encoded, while the default encoding for your system is UTF-8. Because the file is opened in text mode ('r'), Python tries to decode its contents as text and fails on the byte 0x80.

Since you are calculating a SHA1 hash, you should read the data as binary instead. The hashlib functions require that you pass in bytes:

with open(filename, 'rb') as f:
    return hashlib.sha1(f.read()).hexdigest() == sha1

Note the addition of b in the file mode.
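
To make the difference concrete, here is a minimal sketch that reproduces the error on a throwaway file and then hashes the same file in binary mode. The payload and the temporary file are made up for illustration and are not part of the original download script; the encoding is passed explicitly so the failure does not depend on the system default.

```python
import hashlib
import os
import tempfile

# Hypothetical stand-in for a model file: 0x80 can never start a UTF-8 sequence.
payload = b"model weights \x80\x81\x82"

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    path = tmp.name

try:
    # Text mode decodes the bytes and fails on 0x80.
    try:
        with open(path, "r", encoding="utf-8") as f:
            f.read()
        text_mode_failed = False
    except UnicodeDecodeError:
        text_mode_failed = True

    # Binary mode returns the raw bytes, which is what hashlib expects.
    with open(path, "rb") as f:
        digest = hashlib.sha1(f.read()).hexdigest()
finally:
    os.remove(path)

# The digest of the file matches the digest of the bytes written to it.
matches = digest == hashlib.sha1(payload).hexdigest()
```

The text-mode read fails before any hashing happens, which matches the traceback in the question: the error is raised by f.read(), not by hashlib.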

See the open() documentation:

mode is an optional string that specifies the mode in which the file is opened. It defaults to 'r' which means open for reading in text mode. [...] In text mode, if encoding is not specified the encoding used is platform dependent: locale.getpreferredencoding(False) is called to get the current locale encoding. (For reading and writing raw bytes use binary mode and leave encoding unspecified.)
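
If you want to see which encoding text mode would pick on your machine, the call named in the quoted documentation can be run directly. This is only an illustrative sketch: the temporary file and its contents are hypothetical, and the reported encoding will differ between systems and Python versions.

```python
import locale
import os
import tempfile

# The locale encoding the quoted documentation refers to; platform dependent.
default_encoding = locale.getpreferredencoding(False)
print("locale encoding:", default_encoding)

# Hypothetical file that starts with a byte that is not valid UTF-8.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x80abc")
    path = tmp.name

try:
    # Opening in text mode does not decode anything until data is read,
    # so inspecting the chosen encoding is safe here.
    with open(path) as f_text:
        text_encoding = f_text.encoding
    print("text-mode encoding:", text_encoding)

    # Binary mode involves no encoding at all and returns bytes unchanged.
    with open(path, "rb") as f_bin:
        raw = f_bin.read()
finally:
    os.remove(path)
```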

And from the hashlib module documentation:

You can now feed this object with bytes-like objects (normally bytes) using the update() method.
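
Because update() accepts bytes in pieces, a large model file does not have to be read into memory all at once. The sketch below is one possible way to do that, not the approach used in the original script; sha1_of_stream and the chunk size are hypothetical names and values, and an in-memory stream stands in for a file opened with 'rb'.

```python
import hashlib
import io


def sha1_of_stream(stream, chunk_size=65536):
    """Hash a binary stream chunk by chunk (hypothetical helper)."""
    h = hashlib.sha1()
    # read() returns b"" at end of stream, which stops the iteration.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        h.update(chunk)
    return h.hexdigest()


# Made-up data standing in for a large binary file.
data = b"\x80" * 200_000

chunked = sha1_of_stream(io.BytesIO(data))
one_shot = hashlib.sha1(data).hexdigest()
```

Feeding the data in chunks produces the same digest as hashing it in a single call, so the comparison against the expected SHA1 works the same way.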
