URLs and ampersands

Problem description

I'm using urllib.urlretrieve() to download HTML pages, and I've hit a
snag with URLs containing ampersands:

http://www.example.com/parrot.php?x=1&y=2

Somewhere in the process, URLs like the above are escaped to:

http://www.example.com/parrot.php?x=1&amp;y=2

which naturally fails to exist.

I could just do a string replace, but is there a "right" way to escape
and unescape URLs? I've looked through the standard lib, but I can't find
anything helpful.
--
Steven

Recommended answer

En Mon, 04 Aug 2008 20:43:45 -0300, Steven D'Aprano
<st***@REMOVE-THIS-cybersource.com.au> escribió:

> I'm using urllib.urlretrieve() to download HTML pages, and I've hit a
> snag with URLs containing ampersands:
>
> http://www.example.com/parrot.php?x=1&y=2
>
> Somewhere in the process, urls like the above are escaped to:
>
> http://www.example.com/parrot.php?x=1&amp;y=2
>
> which naturally fails to exist.
>
> I could just do a string replace, but is there a "right" way to escape
> and unescape URLs? I've looked through the standard lib, but I can't find
> anything helpful.



This works fine for me:

py> import urllib
py> fn = urllib.urlretrieve("http://c7.amazingcounters.com/counter.php?i=1516903&c=4551022")[0]
py> open(fn, "rb").read()
'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00...

So it's not urlretrieve escaping the URL, but something else in your
code...

--
Gabriel Genellina


On Mon, 04 Aug 2008 23:16:46 -0300, Gabriel Genellina wrote:

> En Mon, 04 Aug 2008 20:43:45 -0300, Steven D'Aprano
> <st***@REMOVE-THIS-cybersource.com.au> escribió:

>> I'm using urllib.urlretrieve() to download HTML pages, and I've hit a
>> snag with URLs containing ampersands:
>>
>> http://www.example.com/parrot.php?x=1&y=2
>>
>> Somewhere in the process, urls like the above are escaped to:
>>
>> http://www.example.com/parrot.php?x=1&amp;y=2
>>
>> which naturally fails to exist.
>>
>> I could just do a string replace, but is there a "right" way to escape
>> and unescape URLs? I've looked through the standard lib, but I can't
>> find anything helpful.



> This works fine for me:
>
> py> import urllib
> py> fn = urllib.urlretrieve("http://c7.amazingcounters.com/counter.php?i=1516903&c=4551022")[0]
> py> open(fn, "rb").read()
> '\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00...
>
> So it's not urlretrieve escaping the url, but something else in your
> code...



I didn't say urlretrieve was escaping the URL. I actually think the
URLs are pre-escaped when I scrape them from an HTML file. I have searched
for, but been unable to find, standard library functions that escape or
unescape URLs. Are there any such functions?

--
Steven
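
For readers following along: the "&amp;" form is an HTML character reference, so it is likely HTML unescaping rather than URL unescaping that is needed for hrefs scraped out of page source. A minimal sketch, assuming Python 3.4 or later (the thread itself uses Python 2, where html.unescape() is not available); the variable names and the sample href are illustrative only:

```python
import html
from urllib.parse import urlsplit, parse_qsl

# Hypothetical href value as it might appear in raw HTML source: the "&"
# between query parameters is written as the entity "&amp;".
scraped_href = "http://www.example.com/parrot.php?x=1&amp;y=2"

# html.unescape() converts HTML character references back to plain text.
url = html.unescape(scraped_href)
print(url)

# The query string of the unescaped URL now splits into the two
# parameters from the original example.
params = parse_qsl(urlsplit(url).query)
print(params)
```

Percent-encoding (urllib.parse.quote() and unquote() in Python 3) is a separate mechanism and does not touch "&amp;", which may be why a search for URL escaping functions turned up nothing useful here.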


Dnia 05 Aug 2008 09:59:20 GMT, Steven D'Aprano napisał(a):

> I didn't say urlretrieve was escaping the URL. I actually think the
> URLs are pre-escaped when I scrape them from an HTML file. I have searched
> for, but been unable to find, standard library functions that escape or
> unescape URLs. Are there any such functions?

