Uppercase URL in requests returns "Name does not resolve"

Problem description

I'd like to GET data from a URL with uppercase characters. The URL is based on a Docker hostname. requests always fails with Name does not resolve because it lowercases the URL.

The URL is http://gateway.Niedersachsen/api/bundeslaender.

ping gateway.Niedersachsen works, but ping gateway.niedersachsen does not.

My Python requests code:

import requests

url = 'http://gateway.Niedersachsen/api/wfs/insertGeometry'
r = requests.get(url)

This produces the following error:

requests.exceptions.ConnectionError: HTTPConnectionPool(host='gateway.niedersachsen', port=80): Max retries exceeded with url: /api/wfs/insertGeometry (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f5f5eb5a3c8>: Failed to establish a new connection: [Errno -2] Name does not resolve'))

My versions:

$ python --version
Python 3.7.3

>>> requests.__version__
'2.21.0'

Recommended answer

RFC 3986, Section 6.2.2.1, says about URIs:

[...] the scheme and host are case-insensitive and therefore should be normalized to lowercase [...].
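
The same normalization is visible in Python's standard library, which makes for a quick illustration of what the RFC recommends. This is only a small sketch using urllib.parse, not the code path that requests itself takes; the variable names are chosen here for illustration.

```python
from urllib.parse import urlsplit

url = "http://gateway.Niedersachsen/api/bundeslaender"
parts = urlsplit(url)

original_host = parts.netloc      # capitalization exactly as written in the URL
normalized_host = parts.hostname  # urllib.parse lowercases this attribute

print(original_host)    # gateway.Niedersachsen
print(normalized_host)  # gateway.niedersachsen
```

A resolver that treats these two spellings differently is the unusual part, since hostnames are meant to compare case-insensitively.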

In my opinion, your name resolution is behaving incorrectly. There seems to be an open issue about case sensitivity in Docker's networking, which I assume is in use here.

requests (or rather urllib3) honors the RFC recommendation, at least for HTTP-scheme connections. As far as requests is concerned, there seem to be four relevant places where hostnames are converted to lowercase:

  1. urllib3's utility class Url, which comes into play when requests' PreparedRequest instance executes the prepare_url method.
  2. The _default_key_normalizer function, which the PoolManager calls via the key_fn_by_scheme mapping.
  3. If your hostname contains non-ASCII characters, it is also passed through IDNA encoding, but that is not the case in your example.
  4. urllib3 version 1.22 also had a lower() call on the host name in the ConnectionPool base class initializer. This normalization has apparently been moved to the _ipv6_host function as of version 1.23.

Using monkeypatching, I seem to have been able to coerce requests (or rather urllib3) into leaving the host name portion of the URL untouched:

import functools
import urllib3

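def _custom_key_normalizer(key_class, request_context):
    # Basically a 1:1 copy of urllib3.poolmanager._default_key_normalizer
    # from the installed urllib3 version, with this one line commented out:
    # https://github.com/urllib3/urllib3/blob/master/src/urllib3/poolmanager.py#L84
    #context['host'] = context['host'].lower()
    # The remainder of the copied body is elided here; paste it in before use.
    raise NotImplementedError("paste the body of _default_key_normalizer here")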

class ConnectionPool(object):
    def __init__(self, host, port=None):
        # Complete copy of the urllib3.connectionpool.ConnectionPool base class
        # (the rest of the class is elided here). I needed this because of my
        # urllib3 version, 1.22; with urllib3 >= 1.23 it is not necessary.
        # The only change is removing the .lower() call from
        # https://github.com/urllib3/urllib3/blob/1.22/urllib3/connectionpool.py#L71
        self.host = urllib3.connectionpool._ipv6_host(host)

urllib3.util.url.NORMALIZABLE_SCHEMES = (None,)
# This is needed for urllib3 >= 1.23. The connectionpool module imports
# NORMALIZABLE_SCHEMES before we can patch it, so we have to explicitly patch it again
urllib3.connectionpool.NORMALIZABLE_SCHEMES = (None,)
urllib3.poolmanager.key_fn_by_scheme['http'] = functools.partial(_custom_key_normalizer, 
                                                                 urllib3.poolmanager.PoolKey)
# just for urllib3 < 1.23
urllib3.connectionpool.ConnectionPool = ConnectionPool

# Do not use anything that would import urllib3 before this point
import requests
url = 'http://gateway.Niedersachsen/api/wfs/insertGeometry'
r = requests.get(url)

I assume this worked because my error message, which displays the host used in the connection pool, still shows the original capitalization:

requests.exceptions.ConnectionError: HTTPConnectionPool(host='gateway.Niedersachsen', port=80): [...]

Note:
There might be an even easier method using urllib3 directly; I haven't looked into this.
Also, if someone knows a more straightforward way of preserving host capitalization with requests, please let me know.
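
One possible route that avoids requests altogether, sketched here under assumptions and not tested against the Docker setup from the question: Python's standard-library http.client does not appear to lowercase the host it is given, so the resolver should see the original capitalization. The helper names split_preserving_case and get are hypothetical, and the sketch handles plain HTTP with ordinary (non-IPv6-literal) hosts only.

```python
import http.client
from urllib.parse import urlsplit


def split_preserving_case(url):
    """Return (host, port, path) with the host's original capitalization."""
    parts = urlsplit(url)
    # parts.hostname would be lowercased, so work from netloc instead.
    host = parts.netloc.rpartition("@")[2]  # drop any user:password@ prefix
    if parts.port is not None:
        host = host.rsplit(":", 1)[0]       # drop the :port suffix
    port = parts.port or 80                 # assumption: plain HTTP
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return host, port, path


def get(url, timeout=10):
    """Plain-HTTP GET that hands the host to the resolver unchanged."""
    host, port, path = split_preserving_case(url)
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()
```

Calling get('http://gateway.Niedersachsen/api/bundeslaender') would then return the status code and the raw body. You lose the conveniences of requests (sessions, JSON decoding, redirects), so this is mainly a stopgap until the name resolution itself is fixed.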
