How does requests determine the encoding of a response?

Views: 76 | Published: 2021/4/21 20:23:05 | Tags: python, character-encoding, python-requests


Question

How can a response's apparent_encoding attribute be incorrect?

The code snippet below demonstrates my question:

import requests

url = "https://item.jd.com/100000177760.html"
r = requests.get(url)
print(r.status_code, r.encoding)  # 200, gbk (correct)
print(r.apparent_encoding)  # GB2312 (wrong)

How does requests determine the response's character encoding?

Recommended answer

Requests extracts the encoding from the charset parameter of the response's Content-Type header. If the header has no charset and the content type is of type "text", ISO-8859-1 (latin-1) is assumed. Otherwise r.encoding is left as None, and the response's apparent_encoding property is evaluated and used in its place when r.text is decoded.
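The header rules above can be sketched in a few lines of standard-library Python. This is a simplified re-implementation for illustration only, not the code requests itself runs; the function name encoding_from_content_type is hypothetical, and treating "of type text" as "the media type starts with text/" is an assumption.

```python
def encoding_from_content_type(content_type):
    """Hypothetical helper mirroring the header rules described above."""
    if not content_type:
        return None
    # Split "text/html; charset=gbk" into the media type and its parameters.
    media_type, *params = [part.strip() for part in content_type.split(";")]
    for param in params:
        key, _, value = param.partition("=")
        if key.strip().lower() == "charset":
            # An explicit charset parameter wins.
            return value.strip().strip("'\"")
    # Assumption: "of type text" is read as a media type starting with "text/".
    if media_type.lower().startswith("text/"):
        return "ISO-8859-1"
    # No charset and not text: the caller falls back to guessing from the body.
    return None


print(encoding_from_content_type("text/html; charset=gbk"))
print(encoding_from_content_type("text/html"))
print(encoding_from_content_type("application/octet-stream"))
```

A None result corresponds to the case where the encoding has to be guessed from the body.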

apparent_encoding is determined by using the chardet library to guess the encoding of the response body.

In the case of the URL in the question, the encoding is declared in the Content-Type header:

>>> r.headers['Content-Type']
'text/html; charset=gbk'

So r.apparent_encoding is not evaluated until it is explicitly accessed, here by executing print(r.apparent_encoding).
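The on-demand evaluation can be illustrated with a small stand-in class. This is a sketch under stated assumptions: SketchResponse, its detector argument, and the detector_calls counter are all hypothetical names, and the detector here is a dummy function rather than chardet.

```python
class SketchResponse:
    """Hypothetical stand-in for a response object; not the requests class."""

    def __init__(self, content, encoding=None, detector=None):
        self.content = content          # raw body bytes
        self.encoding = encoding        # encoding declared by the headers, or None
        # Dummy detector standing in for a real guesser such as chardet.
        self._detector = detector or (lambda raw: "utf-8")
        self.detector_calls = 0         # counts how often guessing actually ran

    @property
    def apparent_encoding(self):
        # The guess is computed only when the attribute is accessed.
        self.detector_calls += 1
        return self._detector(self.content)

    @property
    def text(self):
        encoding = self.encoding
        if encoding is None:
            # Fall back to the guess only when no encoding was declared.
            encoding = self.apparent_encoding
        return self.content.decode(encoding, errors="replace")


declared = SketchResponse("中文".encode("gbk"), encoding="gbk")
print(declared.text, declared.detector_calls)       # detector not used yet
print(declared.apparent_encoding, declared.detector_calls)
```

With a declared encoding, reading text never touches the detector; the guess runs only once apparent_encoding is read directly.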

In this particular case, chardet seems to get it wrong: the response's text attribute can be encoded with the gbk codec, but not with the gb2312 codec. GBK is a superset of GB2312, so the page must contain at least one character that only GBK covers.
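The same check can be reproduced without a network request. The sketch below uses a hypothetical helper, can_encode, and a sample character chosen for illustration (镕, U+9555), which is assumed here to stand in for whatever GBK-only characters the page in the question contains.

```python
def can_encode(text, codec):
    """Return True if every character in text is representable in codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Sample strings are illustrative assumptions, not taken from the page.
common = "中文"        # characters covered by both GB2312 and GBK
gbk_only = "\u9555"    # 镕: covered by GBK but not by GB2312

for sample in (common, gbk_only):
    print(sample, can_encode(sample, "gbk"), can_encode(sample, "gb2312"))
```

Applying can_encode(r.text, "gbk") and can_encode(r.text, "gb2312") to a real response is one way to test which of the two encodings is plausible for it.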
