Parsing URI parameters and keyword-value pairs

Problem description

I would like to parse the parameters and keyword values from the URIs/URLs in a text file. Parameters without values should also be included. Python is fine, but I am open to suggestions using other tools, such as Perl or a one-liner, that would also do the trick.

Sample source:

www.domain.com/folder/page.php?date=2012-11-20
www2.domain.edu/folder/folder/page.php?l=user&x=0&id=1&page=http%3A//domain.com/page.html&unique=123456&refer=http%3A//domain2.net/results.aspx%3Fq%3Dbob+test+1.21+some%26file%3Dname&text=
www.domain.edu/some/folder/image.php?l=adm&y=5&id=2&page=http%3A//support.domain.com/downloads/index.asp&unique=12345
blog.news.org/news/calendar.php?view=month&date=2011-12-10

Sample output:

date=2012-11-20
l=user
x=0
id=1
page=http%3A//domain.com/page.html
unique=123456
refer=http%3A//domain2.net/results.aspx%3Fq%3Dbob+test+1.21+some%26file%3Dname
text=
l=adm
y=5
id=2
page=http%3A//support.domain.com/downloads/index.asp
unique=12345
view=month
date=2011-12-10

Recommended answer

You don't need to dive into the fragile world of regular expressions.

urlparse.parse_qsl() is the tool for the job (urllib.quote() helps to escape special characters). These are the Python 2 names; in Python 3 both functions live in urllib.parse:

from urllib import quote
from urlparse import parse_qsl, urlparse


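with open('links.txt') as f:
    for url in f:
        # keep_blank_values=True keeps parameters with no value, such as "text="
        params = parse_qsl(urlparse(url.strip()).query, keep_blank_values=True)
        for key, value in params:
            # parse_qsl() decodes the values, so re-escape them for output
            print "%s=%s" % (key, quote(value))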

Prints:

date=2012-11-20
l=user
x=0
id=1
page=http%3A//domain.com/page.html
unique=123456
refer=http%3A//domain2.net/results.aspx%3Fq%3Dbob%20test%201.21%20some%26file%3Dname
text=
l=adm
y=5
id=2
page=http%3A//support.domain.com/downloads/index.asp
unique=12345
view=month
date=2011-12-10
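
For readers on Python 3, the same approach can be sketched roughly as follows. This is only an illustrative port, not part of the original answer: the helper name iter_query_pairs is hypothetical, and the in-memory sample list stands in for reading links.txt.

```python
from urllib.parse import parse_qsl, quote, urlparse


def iter_query_pairs(urls):
    """Yield 'key=value' strings for every query parameter in each URL.

    Hypothetical helper name, used here only for illustration.
    """
    for url in urls:
        # urlparse() still splits off the query when the URL has no scheme
        query = urlparse(url.strip()).query
        # keep_blank_values=True keeps parameters with no value, such as "text="
        for key, value in parse_qsl(query, keep_blank_values=True):
            # parse_qsl() decodes the value; quote() re-escapes it for output
            yield "%s=%s" % (key, quote(value))


# Two of the sample lines from the question, instead of a file on disk
sample = [
    "www.domain.com/folder/page.php?date=2012-11-20",
    "blog.news.org/news/calendar.php?view=month&date=2011-12-10",
]
for pair in iter_query_pairs(sample):
    print(pair)
```

Because a file object iterates over its lines, passing open('links.txt') to the helper instead of the list should behave like the loop in the answer. Note that a "+" in the input is decoded to a space and comes back out as %20, as in the refer= line of the output above.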

Hope that helps.
