Uppercase URL in requests returns "Name does not resolve"
Problem description
I'd like to GET data from a URL with uppercase characters. The URL is based on a Docker hostname. requests always returns "Name does not resolve" because it lowercases the URL.
The URL is http://gateway.Niedersachsen/api/bundeslaender.
ping gateway.Niedersachsen works, but ping gateway.niedersachsen does not.
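The ping comparison above can also be reproduced from Python, which rules out requests as the source of the resolution failure. This is only a minimal sketch: the helper names resolves and compare_spellings are made up for illustration, and it assumes the standard library resolver behaves like ping in the container.

```python
import socket


def resolves(hostname):
    """Return True if the system resolver can resolve hostname as written."""
    try:
        socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return False
    return True


def compare_spellings(hostname):
    """Check the hostname as written and in lowercase (hypothetical helper)."""
    return {name: resolves(name) for name in (hostname, hostname.lower())}


if __name__ == "__main__":
    # In the setup described above, only the first spelling is expected to resolve.
    print(compare_spellings("gateway.Niedersachsen"))
```

If both spellings behave differently here as well, the case sensitivity sits in the name resolution and not in the HTTP client.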
My Python requests code:
import requests
url = 'http://gateway.Niedersachsen/api/wfs/insertGeometry'
r = requests.get(url)
This produces the following error:
requests.exceptions.ConnectionError: HTTPConnectionPool(host='gateway.niedersachsen', port=80): Max retries exceeded with url: /api/wfs/insertGeometry (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f5f5eb5a3c8>: Failed to establish a new connection: [Errno -2] Name does not resolve'))
My versions:
$ python --version
Python 3.7.3
>>> requests.__version__
'2.21.0'
Recommended answer
RFC 3986 Section 6.2.2.1 says about URIs:
[...] the scheme and host are case-insensitive and therefore should be normalized to lowercase [...].
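The same normalization is visible in Python's standard library, independent of requests. As a small illustration (the variable names are only for this sketch), urllib.parse keeps the authority as written in netloc but exposes a lowercased hostname:

```python
from urllib.parse import urlsplit

url = "http://gateway.Niedersachsen/api/wfs/insertGeometry"
parts = urlsplit(url)

# netloc keeps the authority exactly as it was written in the URL.
netloc_as_written = parts.netloc
# hostname is documented to be returned in lowercase.
normalized_host = parts.hostname

print(netloc_as_written)  # gateway.Niedersachsen
print(normalized_host)    # gateway.niedersachsen
```

This does not show what requests does internally; it only shows that lowercasing the host is the common reading of the RFC.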
In my opinion, your name resolution behaves incorrectly. There seems to be an open issue related to case sensitivity in Docker's networking, which I assume is in use here.
requests, or rather urllib3, honors the RFC recommendation, at least for HTTP scheme connections. As far as requests is concerned, there seem to be four relevant places where hostnames are converted to lowercase:
- urllib3's utility class Url, which comes into play when requests' PreparedRequest instance executes the prepare_url method.
- The _default_key_normalizer function, which is called by the PoolManager via the key_fn_by_scheme mapping.
- If your hostname contains non-ASCII characters, it is also passed through IDNA encoding, but this is not the case in your example.
- urllib3 version 1.22 also had a lower() call on the host name in the ConnectionPool base class initializer. As of version 1.23, this normalization has apparently been moved to the _ipv6_host function.
Using monkeypatching, I seem to have been able to coerce requests, or rather urllib3, into leaving the host name portion of the URL untouched:
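import functools
import urllib3

def _custom_key_normalizer(key_class, request_context):
    # Basically a 1:1 copy of urllib3.poolmanager._default_key_normalizer with
    # the host lowercasing commented out, see
    # https://github.com/urllib3/urllib3/blob/master/src/urllib3/poolmanager.py#L84
    # The body below follows urllib3 1.2x; compare it with your installed version.
    context = request_context.copy()
    context['scheme'] = context['scheme'].lower()
    #context['host'] = context['host'].lower()
    # Dictionaries are turned into hashable frozensets
    for key in ('headers', '_proxy_headers', '_socks_options'):
        if key in context and context[key] is not None:
            context[key] = frozenset(context[key].items())
    # socket_options may be a list and has to become a tuple
    socket_opts = context.get('socket_options')
    if socket_opts is not None:
        context['socket_options'] = tuple(socket_opts)
    # namedtuple fields cannot start with '_', so every key gets a prefix
    for key in list(context.keys()):
        context['key_' + key] = context.pop(key)
    # Fields missing from the context default to None
    for field in key_class._fields:
        if field not in context:
            context[field] = None
    return key_class(**context)

class ConnectionPool(object):
    def __init__(self, host, port=None):
        # Stands in for the urllib3.connectionpool.ConnectionPool base class
        # initializer. I needed this due to my urllib3 version 1.22.
        # If you have urllib3 >= 1.23 this is not necessary.
        # The .lower() is removed from
        # https://github.com/urllib3/urllib3/blob/1.22/urllib3/connectionpool.py#L71
        self.host = urllib3.connectionpool._ipv6_host(host)
        self.port = port

urllib3.util.url.NORMALIZABLE_SCHEMES = (None,)
# This is needed for urllib3 >= 1.23. The connectionpool module imports
# NORMALIZABLE_SCHEMES before we can patch it, so we have to explicitly patch it again
urllib3.connectionpool.NORMALIZABLE_SCHEMES = (None,)
urllib3.poolmanager.key_fn_by_scheme['http'] = functools.partial(_custom_key_normalizer,
                                                                 urllib3.poolmanager.PoolKey)
# just for urllib3 < 1.23; omit this line on newer versions
urllib3.connectionpool.ConnectionPool = ConnectionPool

# do not use anything that would import urllib3 before this point
import requests

url = 'http://gateway.Niedersachsen/api/wfs/insertGeometry'
r = requests.get(url)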
I assume success from the fact that my error message, which displays the host used in the connection pool, still uses the original capitalization:
requests.exceptions.ConnectionError: HTTPConnectionPool(host='gateway.Niedersachsen', port=80): [...]
Note:
There might be an even easier method using urllib3 directly; I haven't looked into this.
Also, if someone knows a more straightforward way of preserving host capitalization using requests, please let me know.
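One possible alternative that avoids patching urllib3, not taken from the answer above and untested against the Docker setup in question: resolve the hostname yourself with its original capitalization and send the request to the resulting IP address. The sketch below assumes plain HTTP, no userinfo and no IPv6 literal in the URL; pin_host_to_ip is a hypothetical helper name.

```python
import socket
from urllib.parse import urlsplit


def pin_host_to_ip(url, resolver=socket.gethostbyname):
    """Return (url_with_ip, original_host) for a plain HTTP URL.

    The host is taken from netloc so that its capitalization is preserved;
    resolver is injectable so the function can be tried without a network.
    """
    parts = urlsplit(url)
    hostport = parts.netloc.rpartition('@')[2]
    # Strip an explicit port, if any, to get the host exactly as written.
    host = hostport.rsplit(':', 1)[0] if parts.port is not None else hostport
    ip = resolver(host)
    new_netloc = ip if parts.port is None else f'{ip}:{parts.port}'
    return parts._replace(netloc=new_netloc).geturl(), host


if __name__ == "__main__":
    # A fake resolver stands in for Docker's DNS in this demonstration.
    new_url, host = pin_host_to_ip(
        'http://gateway.Niedersachsen/api/wfs/insertGeometry',
        resolver=lambda name: '10.0.0.5',
    )
    print(new_url, host)
```

With requests, the result could then be used as requests.get(new_url, headers={'Host': host}), so the server still sees the original host name. For HTTPS this would need extra care because of certificate and SNI checks.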