Changing hostname in a url
I am trying to use python to change the hostname in a url, and have been playing around with the urlparse module for a while now without finding a satisfactory solution. As an example, consider the url:
https://www.google.dk:80/barbaz
I would like to replace "www.google.dk" with e.g. "www.foo.dk", so I get the following url:
https://www.foo.dk:80/barbaz
So the part I want to replace is what urlparse.urlsplit refers to as hostname. I had hoped that the result of urlsplit would let me make changes, but the resulting type ParseResult doesn't allow me to. If nothing else, I can of course reconstruct the new url by joining all the parts with +, but that would leave me with some quite ugly code with a lot of conditionals to get "://" and ":" in the correct places.
You can use the urllib.parse.urlparse function and the ParseResult._replace method (Python 3):
>>> import urllib.parse
>>> parsed = urllib.parse.urlparse("https://www.google.dk:80/barbaz")
>>> replaced = parsed._replace(netloc="www.foo.dk:80")
>>> print(replaced)
ParseResult(scheme='https', netloc='www.foo.dk:80', path='/barbaz', params='', query='', fragment='')
If you're using Python 2, replace urllib.parse with urlparse.
ParseResult is a subclass of namedtuple, and _replace is a namedtuple method that:
"returns a new instance of the named tuple replacing specified fields with new values"
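Because _replace returns a new tuple rather than modifying the old one, the original parse result stays as it was, and the new one can be turned back into a string with geturl(). A minimal sketch, reusing the url from the question:

```python
from urllib.parse import urlparse

parsed = urlparse("https://www.google.dk:80/barbaz")
replaced = parsed._replace(netloc="www.foo.dk:80")

# _replace builds a new tuple; the original is left untouched.
print(parsed.netloc)      # www.google.dk:80
# geturl() reassembles the components into a url string.
print(replaced.geturl())  # https://www.foo.dk:80/barbaz
```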
UPDATE:
As @2rs2ts said in the comments, the netloc attribute includes the port number.
Good news: ParseResult has hostname and port attributes.
Bad news: hostname and port are not fields of the namedtuple. They're dynamic properties computed from netloc, so you can't do parsed._replace(hostname="www.foo.dk"). It'll raise an exception.
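The difference between real fields and computed properties can be checked directly. This is only an illustrative sketch; the exact exception type raised by _replace depends on the Python version, so both likely candidates are caught:

```python
from urllib.parse import urlparse

parsed = urlparse("https://www.google.dk:80/barbaz")

# Only these names are accepted by _replace.
print(parsed._fields)

# hostname and port are read-only properties derived from netloc.
print(parsed.hostname)  # www.google.dk
print(parsed.port)      # 80

failed = False
try:
    parsed._replace(hostname="www.foo.dk")
except (TypeError, ValueError) as exc:
    # hostname is not a tuple field, so _replace rejects it.
    failed = True
    print("cannot replace hostname:", exc)
```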
If you don't want to split on : and your url always has a port number and never has a username and password (as in "https://username:password@www.google.dk:80/barbaz"), you can do:
parsed._replace(netloc="{}:{}".format("www.foo.dk", parsed.port))
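If the url may carry a username and password, or may have no port at all, the netloc can be rebuilt piece by piece instead. The sketch below is one possible way to do it; replace_hostname is a hypothetical helper name, not part of urllib, and it assumes the new hostname is a plain name (an IPv6 literal would need its own brackets):

```python
from urllib.parse import urlsplit


def replace_hostname(url, new_hostname):
    """Return url with its hostname swapped, keeping userinfo and port."""
    parts = urlsplit(url)
    # Everything before the last "@" in netloc is userinfo (user[:password]).
    userinfo, at, _ = parts.netloc.rpartition("@")
    netloc = userinfo + at + new_hostname
    # Re-attach the port only if the original url had one.
    if parts.port is not None:
        netloc += ":{}".format(parts.port)
    return parts._replace(netloc=netloc).geturl()


print(replace_hostname("https://www.google.dk:80/barbaz", "www.foo.dk"))
print(replace_hostname("https://username:password@www.google.dk:80/barbaz", "www.foo.dk"))
```

This avoids the chain of conditionals for "://" and ":" mentioned in the question, since geturl() puts the scheme, path, query and fragment back together.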