Python 3: Why would you use urlparse/urlsplit?


Question

I'm not exactly sure what these functions are used for. I get that they split a URL into its components, but why would that be useful, and what is an example of when to use urlparse?

Recommended answer

Use urlparse only if you need the params component; otherwise urlsplit is enough. Why you might need params is explained below.

Reference

urllib.parse.urlsplit(urlstring, scheme='', allow_fragments=True)

This is similar to urlparse(), but does not split the params from the URL. This should generally be used instead of urlparse() if the more recent URL syntax allowing parameters to be applied to each segment of the path portion of the URL (see RFC 2396) is wanted.
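As a minimal sketch of the difference the documentation describes, the snippet below parses the same URL with both functions. The URL and its ";type=a" suffix are made up for illustration.

```python
from urllib.parse import urlparse, urlsplit

# Hypothetical URL with a ";type=a" parameter on the last path segment.
url = "http://www.example.com/products/women;type=a?color=green#top"

parsed = urlparse(url)  # 6 fields: scheme, netloc, path, params, query, fragment
split = urlsplit(url)   # 5 fields: same as above, but no separate params

print(parsed.path)    # /products/women
print(parsed.params)  # type=a
print(split.path)     # /products/women;type=a
```

If the URLs you handle never carry ";" parameters, both calls give you the same path, and urlsplit is the simpler choice.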

The hostname is useful to store in a variable for later use, for example to add a path or query to it and build the URL of the web page you want while scraping.
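A minimal sketch of keeping the hostname around and building further URLs on the same host might look like this. The second path ("/products/men?color=blue") is a hypothetical example, not something from the original answer.

```python
from urllib.parse import urlsplit, urljoin

url = "http://www.example.com/products/women?color=green"
parts = urlsplit(url)

# Store the host (and the scheme + host base) for later use.
hostname = parts.hostname
base = f"{parts.scheme}://{parts.netloc}"

# Build another URL on the same host; the path here is hypothetical.
next_url = urljoin(base, "/products/men?color=blue")

print(hostname)  # www.example.com
print(next_url)  # http://www.example.com/products/men?color=blue
```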

About parameters:

FYI, this is what RFC 2396 says about parameters in URLs:

Extensive testing of current client applications demonstrated that the majority of deployed systems do not use the ";" character to indicate trailing parameter information, and that the presence of a semicolon in a path segment does not affect the relative parsing of that segment. Therefore, parameters have been removed as a separate component and may now appear in any path segment. Their influence has been removed from the algorithm for resolving a relative URI reference.
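To illustrate "may now appear in any path segment", here is a small sketch with a made-up URL. It shows that urlparse only splits off the parameters of the last segment, so per-segment parameters have to be separated by hand; the partition-based split below is one possible approach, not a standard library feature.

```python
from urllib.parse import urlparse, urlsplit

# Hypothetical URL with parameters on two different path segments.
url = "http://www.example.com/a;x=1/b;y=2"

print(urlparse(url).path)    # /a;x=1/b
print(urlparse(url).params)  # y=2
print(urlsplit(url).path)    # /a;x=1/b;y=2

# Separate each segment from its own parameters manually.
segments = [
    seg.partition(";")[::2]  # (name, params) for one segment
    for seg in urlsplit(url).path.split("/")[1:]
]
print(segments)  # [('a', 'x=1'), ('b', 'y=2')]
```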

Parameters are useful in scraping, e.g. if the URL is http://www.example.com/products/women?color=green

When you parse the URL, you get its components separately (here the path /products/women and the query color=green). To scrape the men's page you only change the path, so the URL becomes http://www.example.com/products/men?color=green, and likewise for kids, girl, boy and so on.
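Swapping women for men, kids, girl and boy can be sketched as follows. This assumes the category is always the last path segment, which is an assumption about the example site rather than something the URL guarantees.

```python
from urllib.parse import parse_qs, urlsplit, urlunsplit

url = "http://www.example.com/products/women?color=green"
parts = urlsplit(url)

# The query string is available as a dict of lists.
print(parse_qs(parts.query))  # {'color': ['green']}

# Assumption: the category is the last path segment.
prefix = parts.path.rsplit("/", 1)[0]  # "/products"

urls = []
for category in ["men", "kids", "girl", "boy"]:
    # _replace returns a copy of the parsed result with one field changed.
    new_parts = parts._replace(path=f"{prefix}/{category}")
    urls.append(urlunsplit(new_parts))

print(urls[0])  # http://www.example.com/products/men?color=green
```

Working on the parsed components avoids string surgery on the full URL, so the scheme, host and query stay untouched while only the path changes.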
