'utf-8' codec can't decode byte 0x80
Question
I'm trying to download a BVLC-trained model, but I keep getting this error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 110: invalid start byte
I think it's caused by the following function (complete code):
# Closure-d function for checking SHA1.
def model_checks_out(filename=model_filename, sha1=frontmatter['sha1']):
    with open(filename, 'r') as f:
        return hashlib.sha1(f.read()).hexdigest() == sha1
Any idea how to fix this?
Recommended answer
You are opening a file that is not UTF-8 encoded, while the default encoding for your system is set to UTF-8.
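The failure can be reproduced in isolation. The following is a minimal sketch, not the original download script: the temporary file and the helper name try_text_read are made up for illustration, and the encoding is passed explicitly so the result does not depend on the local default.

```python
import os
import tempfile


def try_text_read(path):
    """Return the UnicodeDecodeError raised by a UTF-8 text read, or None."""
    try:
        with open(path, 'r', encoding='utf-8') as f:
            f.read()
    except UnicodeDecodeError as exc:
        return exc
    return None


# Hypothetical stand-in for a binary model file.
# 0x80 can never be the first byte of a UTF-8 sequence.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'abc\x80def')
    sample_path = tmp.name

error = try_text_read(sample_path)

# Binary mode performs no decoding, so the same file reads without error.
with open(sample_path, 'rb') as f:
    raw = f.read()

os.remove(sample_path)
print(error)
```

Reading the same file in binary mode succeeds because no decoding step is involved.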
Since you are calculating a SHA1 hash, you should read the data as binary instead. The hashlib functions require that you pass in bytes:
with open(filename, 'rb') as f:
    return hashlib.sha1(f.read()).hexdigest() == sha1
Note the addition of b in the file mode.
See the open() documentation:
mode is an optional string that specifies the mode in which the file is opened. It defaults to 'r' which means open for reading in text mode. [...] In text mode, if encoding is not specified the encoding used is platform dependent: locale.getpreferredencoding(False) is called to get the current locale encoding. (For reading and writing raw bytes use binary mode and leave encoding unspecified.)
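To see which encoding applies on a given machine, the call named in the quoted passage can be run directly. This is a small sketch for illustration; the sample file and its contents are hypothetical.

```python
import locale
import os
import tempfile

# Encoding named in the quoted passage as the text-mode fallback
# when `encoding` is omitted.
default_encoding = locale.getpreferredencoding(False)
print("text-mode default encoding:", default_encoding)

# Hypothetical sample file, used only for illustration.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'\x80\x02binary payload')
    sample_path = tmp.name

# Binary mode skips decoding entirely and returns a bytes object.
with open(sample_path, 'rb') as f:
    data = f.read()

os.remove(sample_path)
print(type(data).__name__, len(data))
```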
and from the hashlib module documentation:
You can now feed this object with bytes-like objects (normally bytes) using the update() method.
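Because update() accepts bytes incrementally, a large model file does not have to be read into memory in one call. The sketch below is one possible variant, not the code from the question: the helper name sha1_of_file and the 64 KiB chunk size are assumptions chosen for illustration.

```python
import hashlib
import os
import tempfile


def sha1_of_file(path, chunk_size=65536):
    """Return the SHA1 hex digest of a file, reading it in binary chunks."""
    digest = hashlib.sha1()
    with open(path, 'rb') as f:
        # f.read() returns b'' at end of file, which stops the iteration.
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


# Hypothetical sample data standing in for a downloaded model file.
payload = b'\x80' * 200000
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    sample_path = tmp.name

chunked_digest = sha1_of_file(sample_path)
os.remove(sample_path)
print(chunked_digest)
```

The chunked digest is identical to hashing the whole payload at once, so it can be compared against a published SHA1 value in the same way.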