Parse custom URIs with urlparse (Python)
Question
My application creates custom URIs (or URLs?) to identify objects and resolve them. The problem is that Python's urlparse module refuses to parse URLs with unknown schemes the way it parses http URLs.
If I do not adjust urlparse's uses_* lists, I get this:
>>> urlparse.urlparse("qqqq://base/id#hint")
('qqqq', '', '//base/id#hint', '', '', '')
>>> urlparse.urlparse("http://base/id#hint")
('http', 'base', '/id', '', '', 'hint')
Here is what I do, and I wonder if there is a better way to do it:
import urlparse
SCHEME = "qqqq"
# One would hope that there was a better way to do this
urlparse.uses_netloc.append(SCHEME)
urlparse.uses_fragment.append(SCHEME)
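For reference, the same workaround can be wrapped in a small helper so that repeated calls do not append the scheme twice. This is only a sketch: `register_scheme` is a hypothetical name, the import fallback assumes the code may also run on Python 3 (where the module lives at `urllib.parse`), and the `getattr` guard is there because I am not assuming every `uses_*` list exists in every version.

```python
try:
    import urlparse  # Python 2 module name, as used in the question
except ImportError:
    from urllib import parse as urlparse  # assumption: Python 3 fallback

def register_scheme(scheme):
    """Append `scheme` to the uses_* lists that exist, at most once each."""
    for name in ("uses_netloc", "uses_fragment"):
        registry = getattr(urlparse, name, None)  # skip lists that are absent
        if registry is not None and scheme not in registry:
            registry.append(scheme)

register_scheme("qqqq")
register_scheme("qqqq")  # second call is a no-op
```

This still mutates module-level state shared by everything that imports the module, which is the part of the workaround the question is unhappy about.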
Why is there no better way to do this?
Recommended answer
I think the problem is that URIs don't all share a common format after the scheme. For example, mailto: URLs aren't structured the same way as http: URLs.
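To make the difference concrete, here is a small comparison. It assumes Python 3, where the module is `urllib.parse` rather than `urlparse`, and both addresses are made up for illustration.

```python
from urllib.parse import urlparse

# A mailto: URI has no "//authority" part, so everything lands in the path.
mail = urlparse("mailto:user@example.com")

# An http: URL has an authority (netloc) followed by a hierarchical path.
web = urlparse("http://example.com/inbox")

print(mail.scheme, repr(mail.netloc), mail.path)
print(web.scheme, repr(web.netloc), web.path)
```

The same six fields come back in both cases, but which of them are populated depends on the shape of the text after the scheme.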
I would take the result of the first parse, synthesize an http URL from it, and parse that again:
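# First parse: with the scheme unregistered, everything after "qqqq:" ends up in the path (index 2)
parts = urlparse.urlparse("qqqq://base/id#hint")
# Put a scheme urlparse knows in front of that remainder and parse again
fake_url = "http:" + parts[2]
parts2 = urlparse.urlparse(fake_url)
# parts2 now has netloc, path and fragment split out; its scheme reads 'http', not 'qqqq'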
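A variation on the same idea, sketched for Python 3 (`urllib.parse`): split the scheme off by hand, parse the rest as if it were http, then put the real scheme back into the result. `parse_custom_uri` is a hypothetical helper name, and the sketch assumes the custom URIs follow the http-like `scheme://netloc/path#fragment` layout.

```python
from urllib.parse import urlparse

def parse_custom_uri(uri):
    """Parse `uri` with http rules, then restore its original scheme."""
    scheme, sep, rest = uri.partition(":")
    if not sep:
        raise ValueError("no scheme found in %r" % (uri,))
    parts = urlparse("http:" + rest)      # the synthesized http URL
    return parts._replace(scheme=scheme)  # the result is a namedtuple

result = parse_custom_uri("qqqq://base/id#hint")
print(result.scheme, result.netloc, result.path, result.fragment)
```

Splitting on the first colon avoids relying on how the first parse treats an unknown scheme, which can differ between Python versions.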