urllib2 opener providing wrong charset

Views: 242 Posted: 2016/11/19 13:12:25 Tags: python utf-8 character-encoding urllib2

Question

When I open the URL and read the response, the content is unreadable. The content header says it is encoded as utf-8, so I tried converting it with unicode(), which fails with UnicodeDecodeError: 'ascii' codec can't decode byte 0x8b in position 1: ordinal not in range(128).

.encode("utf-8") produces UnicodeDecodeError: 'ascii' codec can't decode byte 0x8b in position 1: ordinal not in range(128).

.decode("utf-8") produces UnicodeDecodeError: 'utf8' codec can't decode byte 0x8b in position 1: invalid start byte.

I have tried everything I can come up with (I'm not that good at encodings).

I would be happy if I could get this to work. Thanks.

Recommended answer

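This is a common mistake. The server is sending a gzip-compressed stream: the byte 0x8b at position 1 in your errors is the second byte of the gzip magic number (0x1f 0x8b), so the data has to be decompressed before any charset decoding can succeed.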
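To see why the decode calls fail, here is a minimal sketch that reproduces the symptom without a network request. It is written for Python 3 (an assumption; the question uses Python 2's urllib2), and the sample text is a made-up stand-in for the page body.

```python
import gzip

# Hypothetical payload standing in for the page body; not from the original post.
text = "h\u00e9llo"
compressed = gzip.compress(text.encode("utf-8"))

# Every gzip stream begins with the two magic bytes 0x1f 0x8b.
magic = compressed[:2]

# Decoding the still-compressed bytes fails on the second byte (0x8b),
# which is not a valid UTF-8 start byte.
try:
    compressed.decode("utf-8")
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)

# Decompressing first, then decoding, recovers the original text.
restored = gzip.decompress(compressed).decode("utf-8")

print(magic)
print(error_message)
print(restored)
```

The error raised here names the same byte and position as the ones in the question, which is a quick way to recognize a gzip body that was not decompressed.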

You should decompress it first:

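import gzip
import StringIO

response = opener.open(self.__url, data)
# Decompress only when the server says the body is gzip-compressed.
if response.info().get('Content-Encoding') == 'gzip':
    buf = StringIO.StringIO(response.read())
    gzip_f = gzip.GzipFile(fileobj=buf)
    content = gzip_f.read()
else:
    content = response.read()
# content now holds the raw bytes; decode them with content.decode('utf-8')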
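For readers on Python 3, where urllib2 became urllib.request and StringIO.StringIO is replaced by io.BytesIO for byte data, the same idea might look like the sketch below. The helper name read_body and its parameters are hypothetical, and the response is simulated with in-memory bytes so the example runs without a network connection.

```python
import gzip
import io


def read_body(raw, content_encoding=None, charset="utf-8"):
    """Return the body as text, gunzipping first when the header says gzip.

    read_body is a hypothetical helper for illustration, not part of urllib.
    """
    if content_encoding == "gzip":
        # Wrap the bytes in a file-like object, as in the answer's code.
        raw = gzip.GzipFile(fileobj=io.BytesIO(raw)).read()
    return raw.decode(charset)


# Simulated response bodies (made-up sample data).
payload = "caf\u00e9"
gzipped = gzip.compress(payload.encode("utf-8"))

plain_text = read_body(payload.encode("utf-8"))
gzip_text = read_body(gzipped, "gzip")

print(plain_text)
print(gzip_text)
```

With a real response object, one would presumably pass response.read() as raw and response.headers.get("Content-Encoding") as content_encoding.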
