Parse custom URIs with urlparse (Python)


Question

My application creates custom URIs (or URLs?) to identify objects and resolve them. The problem is that Python's urlparse module refuses to parse unknown URL schemes the way it parses http.

If I do not adjust urlparse's uses_* lists, I get this:

>>> urlparse.urlparse("qqqq://base/id#hint")
('qqqq', '', '//base/id#hint', '', '', '')
>>> urlparse.urlparse("http://base/id#hint")
('http', 'base', '/id', '', '', 'hint')
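For comparison, here is a minimal sketch of the same parse under Python 3, where the module lives at urllib.parse. This assumes a Python 3 interpreter rather than the Python 2 session shown above; there, as far as I can tell, the netloc and fragment are split out for any scheme without touching the uses_* lists.

```python
from urllib.parse import urlparse

# Assumption: Python 3, where urlparse is provided by urllib.parse.
custom = urlparse("qqqq://base/id#hint")

# The result is a named tuple, so fields can be read by name.
print(custom.scheme, custom.netloc, custom.path, custom.fragment)
```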

Here is what I do, and I wonder whether there is a better way:

import urlparse

SCHEME = "qqqq"

# One would hope that there was a better way to do this
urlparse.uses_netloc.append(SCHEME)
urlparse.uses_fragment.append(SCHEME)
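One caveat with appending to module-level lists is that running the registration twice adds the scheme twice. The sketch below wraps the same idea in a helper that skips schemes already present. The name register_scheme is hypothetical, and the code assumes Python 3's urllib.parse, which still exposes these lists as far as I know; getattr guards against a list being absent.

```python
import urllib.parse


def register_scheme(scheme):
    """Add scheme to the uses_* lists, skipping lists that already hold it.

    register_scheme is a hypothetical helper, not part of urllib.parse.
    """
    for list_name in ("uses_netloc", "uses_fragment"):
        registry = getattr(urllib.parse, list_name, None)
        if registry is not None and scheme not in registry:
            registry.append(scheme)


register_scheme("qqqq")
register_scheme("qqqq")  # the second call changes nothing
```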

Why is there no better way to do this?

Recommended answer

I think the problem is that URIs don't all share a common format after the scheme. For example, mailto: URLs aren't structured the same way as http: URLs.
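The structural difference can be seen by parsing one URL of each kind. This is a small illustration under Python 3's urllib.parse; the addresses are made-up examples.

```python
from urllib.parse import urlparse

# Example addresses are placeholders, not taken from the question.
mail = urlparse("mailto:someone@example.com")
web = urlparse("http://example.com/inbox")

# mailto: has no "//authority" part, so everything after the scheme is the path.
print(repr(mail.netloc), repr(mail.path))

# http: has an authority (netloc) followed by a hierarchical path.
print(repr(web.netloc), repr(web.path))
```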

I would take the result of the first parse, synthesize an http URL from it, and parse that again:

import urlparse

parts = urlparse.urlparse("qqqq://base/id#hint")
# parts[2] holds the unparsed remainder, '//base/id#hint'
fake_url = "http:" + parts[2]
parts2 = urlparse.urlparse(fake_url)
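The same trick can be sketched as a reusable function that also restores the original scheme in the result. This version assumes Python 3's urllib.parse; the name parse_as_http is hypothetical, and it splits on the first colon directly instead of doing a first urlparse pass.

```python
from urllib.parse import urlparse


def parse_as_http(uri):
    """Parse uri as if its scheme were http, then put the real scheme back.

    parse_as_http is a hypothetical helper name used for illustration.
    """
    scheme, colon, rest = uri.partition(":")
    if not colon:
        raise ValueError("URI has no scheme: %r" % uri)
    # Let the parser apply http rules to everything after the scheme.
    parsed = urlparse("http:" + rest)
    # ParseResult is a named tuple, so _replace returns a modified copy.
    return parsed._replace(scheme=scheme)


result = parse_as_http("qqqq://base/id#hint")
print(result.netloc, result.path, result.fragment)
```

Restoring the scheme keeps the result usable with code that later inspects result.scheme; the raw two-step version above leaves 'http' in that field.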
