Python: UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte
Question
I am fetching data from a catalog, and it returns the data as bytes.
Bytes data:
b'\x80\x00\x00\x00\n\x00\x00%\x83\xa0\x08\x01\x00\xbb@\x00\x00\x05p
\x02\x00>\xf3\x00\x00\x00}\x02\x00`\x03\xef0\x00\x00\r\xc0
\x06\xf0>\xf3\x00\x00\x02\x88\x02\x03\xec\x03\xef0\x00\x00/.....'
When I try to convert this data to a string or any other readable format, I get this error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte
Code I used (Python 3.7.3):
blobs = blob.decode('utf-8')
and
import json
json.dumps(blob.decode())
I've also tried pickle, ast and pprint, but they did not help here.
Things I have tried:
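- UnicodeDecodeError: 'utf8' codec can't decode byte 0xa5 in position 0: invalid start byte
- error UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
- Python 3 CSV file giving UnicodeDecodeError: 'utf-8' codec can't decode byte error when I print; 'utf-8' codec can't decode byte 0x80
- UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 3131: invalid start byte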
- https://www.edureka.co/community/52722/unicodedecodeerror-codec-decode-position-invalid-start-byte
Recommended answer
The UTF-8 encoding has some built-in redundancy that serves at least two purposes:
1) locating the start of a character
Start bytes match one of these 4 patterns (shown in binary; dots mark the bits that carry actual data):
0.......
110.....
1110....
11110...
whereas continuation bytes (0 to 3 per character) always have this form:
10......
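As a rough illustration of these bit patterns, the following sketch classifies a single byte by its leading bits. The function name classify_utf8_byte is made up for this example, and it checks only the patterns listed above; a real decoder rejects a few more values (for instance overlong encodings), so treat it as a teaching aid rather than a validator.

```python
def classify_utf8_byte(byte: int) -> str:
    """Classify one byte value (0-255) by its leading bits (hypothetical helper)."""
    if byte & 0b1000_0000 == 0b0000_0000:
        return "start (1-byte character)"   # 0.......
    if byte & 0b1100_0000 == 0b1000_0000:
        return "continuation"               # 10......
    if byte & 0b1110_0000 == 0b1100_0000:
        return "start (2-byte character)"   # 110.....
    if byte & 0b1111_0000 == 0b1110_0000:
        return "start (3-byte character)"   # 1110....
    if byte & 0b1111_1000 == 0b1111_0000:
        return "start (4-byte character)"   # 11110...
    return "invalid"                        # 11111... matches no pattern


# A few bytes taken from the data in the question
for b in b"\x80\x00%\x83":
    print(f"0x{b:02x} {b:08b} {classify_utf8_byte(b)}")
```

Both 0x80 and 0x83 come out as continuation bytes, which is why neither can begin a character.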
2) checking for validity
If this encoding is not respected, it is safe to say that the data is not UTF-8, e.g. because it was corrupted during a transfer.
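In Python, one way to run this validity check is simply to attempt the decode and inspect the exception. The helper below is a minimal sketch; the name first_utf8_error is an assumption for illustration, while start and reason are standard attributes of UnicodeDecodeError.

```python
def first_utf8_error(data: bytes):
    """Return None if data is valid UTF-8, else (position, byte value, reason)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start is the index of the first byte the codec could not decode
        return exc.start, data[exc.start], exc.reason
    return None


print(first_utf8_error("café".encode("utf-8")))   # valid UTF-8
print(first_utf8_error(b"\x80\x00\x00\x00\n"))    # first bytes from the question
```

The first call returns None; the second reports position 0 with the reason "invalid start byte", matching the error in the question.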
Why is it possible to say that data starting with b'\x80\x00' cannot be UTF-8?
The encoding is violated at the very first byte: 0x80 is 10000000 in binary, which matches the continuation-byte pattern, so it cannot start a character. This is exactly what your error message says:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte
Even if you skip this byte, you hit another problem a few bytes later at b'%\x83': % is a complete single-byte character, so the continuation byte 0x83 that follows has no start byte to belong to. Most likely you are either trying to decode the wrong data or assuming the wrong encoding.
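To see that skipping the first byte does not help, the sketch below retries the decode from a later offset and reports where it fails next. The sample holds only the first twelve bytes shown in the question, and error_after_skip is a hypothetical helper name. The last line is a suggestion, not part of the original answer: if the blob really is binary rather than text, a hex dump gives a readable (and JSON-safe) representation without assuming any text encoding.

```python
sample = b"\x80\x00\x00\x00\n\x00\x00%\x83\xa0\x08\x01"  # first bytes from the question


def error_after_skip(data: bytes, skip: int):
    """Decode data[skip:] as UTF-8; return (offset in data, byte, reason) or None."""
    try:
        data[skip:].decode("utf-8")
    except UnicodeDecodeError as exc:
        offset = exc.start + skip  # exc.start is relative to the slice
        return offset, data[offset], exc.reason
    return None


print(error_after_skip(sample, 1))  # skip the leading 0x80 and try again

# Readable view of the raw bytes without decoding them as text
print(sample.hex())
```

With the leading 0x80 skipped, the decode still fails at offset 8 on byte 0x83, again with "invalid start byte".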