Changing hostname in a url


Problem description

I am trying to use Python to change the hostname in a URL, and have been playing around with the urlparse module for a while now without finding a satisfactory solution. As an example, consider the URL:

https://www.google.dk:80/barbaz

I would like to replace "www.google.dk" with e.g. "www.foo.dk", so I get the following url:

https://www.foo.dk:80/barbaz.

So the part I want to replace is what urlparse.urlsplit refers to as hostname. I had hoped that the result of urlsplit would let me make changes, but the resulting type (SplitResult, or ParseResult when using urlparse) doesn't allow me to. If nothing else, I can of course reconstruct the new URL by appending all the parts together with +, but this would leave me with some quite ugly code with a lot of conditionals to get "://" and ":" in the correct places.

Solution

You can use the urllib.parse.urlparse function and the ParseResult._replace method (Python 3):

>>> import urllib.parse
>>> parsed = urllib.parse.urlparse("https://www.google.dk:80/barbaz")
>>> replaced = parsed._replace(netloc="www.foo.dk:80")
>>> print(replaced)
ParseResult(scheme='https', netloc='www.foo.dk:80', path='/barbaz', params='', query='', fragment='')

If you're using Python 2, then replace urllib.parse with urlparse.

ParseResult is a subclass of namedtuple and _replace is a namedtuple method that:

returns a new instance of the named tuple replacing specified fields with new values
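Since the question asks for a URL string rather than a ParseResult, here is a minimal sketch of the round trip. It assumes Python 3 and uses the geturl() method of the parse result to reassemble the pieces; the variable names are only illustrative.

```python
from urllib.parse import urlparse

parsed = urlparse("https://www.google.dk:80/barbaz")

# _replace returns a new tuple; `parsed` itself is left unchanged.
replaced = parsed._replace(netloc="www.foo.dk:80")

# geturl() joins the components back into a single string.
new_url = replaced.geturl()
print(new_url)
print(parsed.netloc)
```

Because named tuples are immutable, the original parsed value can still be reused afterwards.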

UPDATE:

As @2rs2ts said in the comments, the netloc attribute includes the port number.

Good news: ParseResult has hostname and port attributes. Bad news: hostname and port are not fields of the namedtuple. They are read-only properties computed from netloc, so you can't do parsed._replace(hostname="www.foo.dk"). It will raise an exception.
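The difference between tuple fields and computed properties can be checked directly. This is a small sketch assuming Python 3; the exception type is caught broadly because it may differ between Python versions.

```python
from urllib.parse import urlparse

parsed = urlparse("https://www.google.dk:80/barbaz")

# hostname and port are readable...
print(parsed.hostname)
print(parsed.port)

# ...but they are not among the tuple's fields.
print(parsed._fields)

error = None
try:
    parsed._replace(hostname="www.foo.dk")
except (ValueError, TypeError) as exc:
    # The exact exception class may depend on the Python version.
    error = exc
    print(type(exc).__name__, exc)
```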

If you don't want to split on ":" yourself, and your URL always has a port number and no username or password (that is, it is not a URL like "https://username:password@www.google.dk:80/barbaz"), you can rebuild netloc from the new hostname and the existing port:

parsed._replace(netloc="{}:{}".format("www.foo.dk", parsed.port))
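For URLs that may or may not carry a port or a username and password, one possible approach is to rebuild netloc piece by piece. The helper below is a hypothetical sketch, not part of the standard library: the name replace_hostname is made up for illustration, it assumes Python 3, and it assumes the new hostname is a plain name (an IPv6 literal would need its own square brackets).

```python
from urllib.parse import urlsplit


def replace_hostname(url, new_hostname):
    """Hypothetical helper: swap the host, keep userinfo and port."""
    parts = urlsplit(url)

    # Everything before the last "@" is the userinfo, if there is any.
    userinfo, at, _old_hostport = parts.netloc.rpartition("@")

    netloc = userinfo + at + new_hostname
    # Only append the port when the original URL had one.
    if parts.port is not None:
        netloc += ":{}".format(parts.port)

    return parts._replace(netloc=netloc).geturl()


print(replace_hostname("https://www.google.dk:80/barbaz", "www.foo.dk"))
print(replace_hostname("https://username:password@www.google.dk:80/barbaz", "www.foo.dk"))
print(replace_hostname("https://www.google.dk/barbaz", "www.foo.dk"))
```

This avoids hand-written conditionals for "://" and leaves the scheme, path, query and fragment untouched.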
