How do I avoid double URL encoding when rendering URLs in my website?
Question
Users provide both properly escaped URLs and raw URLs to my website in a text input; for example, I consider these two URLs equivalent:
https://www.cool.com/cool%20beans
https://www.cool.com/cool beans
Now I want to render these as <a> tags later, when viewing this data. I am stuck between encoding the given text and getting these links:
<a href="https://www.cool.com/cool%2520beans"> <!-- This one is broken! -->
<a href="https://www.cool.com/cool%20beans">
Or not encoding it and getting this:
<a href="https://www.cool.com/cool%20beans">
<a href="https://www.cool.com/cool beans"> <!-- This one is broken! -->
What's the best way out from a user-experience standpoint with modern browsers? I'm torn between doing a decoding pass over their input and the second option listed above, where we don't encode the href attribute.
Recommended answer
If you want to avoid double-encoding the links, you can call urldecode() on both forms and then urlencode() the result. Decoding an unescaped URL such as "https://www.cool.com/cool beans" returns it unchanged, whereas decoding "https://www.cool.com/cool%20beans" returns the version with a literal space. Either way you end up with the same raw string, which can then be encoded exactly once. Note that urlencode() applied to a whole URL also escapes ":" and "/" and turns spaces into "+", so in practice the encoding step should be applied per path segment, for example with rawurlencode().
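The decode-then-encode idea can be sketched roughly as follows. This is a minimal illustration, not a complete URL normalizer: the helper name normalizeUrlPath() is hypothetical, it only handles the scheme, host, and path (query strings, ports, and fragments are ignored), and it assumes that any "%XX" sequence in the input is an escape rather than literal text.

```php
<?php
// Hypothetical helper: make sure the path of a URL is percent-encoded exactly once.
// Assumption: only scheme, host and path matter; query, port and fragment are dropped.
function normalizeUrlPath(string $url): string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return $url; // leave anything we cannot parse untouched
    }

    // Decode each path segment first, then encode it once.
    // Working per segment keeps the "/" separators intact.
    $segments = explode('/', $parts['path'] ?? '');
    $encoded = array_map(
        function (string $segment): string {
            return rawurlencode(rawurldecode($segment));
        },
        $segments
    );

    return $parts['scheme'] . '://' . $parts['host'] . implode('/', $encoded);
}
```

With this sketch, both "https://www.cool.com/cool beans" and "https://www.cool.com/cool%20beans" should come out as the same encoded URL. When writing the result into an href attribute, it is still worth passing it through htmlspecialchars() so that characters such as "&" and quotes are escaped for HTML.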
Alternatively, you could scan the input for encoded characters with the strpos() function, e.g.
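// strpos() returns false when there is no match, and may return 0 for a
// match at the start of the string, so compare strictly against false.
if (strpos($url, "%20") !== false) {
    // Encoded character found
}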
Ideally, you would scan for an array of common encoded sequences rather than just "%20".
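A rough sketch of that array-based scan is shown below. The function name containsEncodedChar() and the particular list of sequences are assumptions made for illustration; the list is not exhaustive, and a raw URL that happens to contain one of these sequences as literal text would be misdetected as already encoded.

```php
<?php
// Hypothetical helper: report whether a URL contains any of a few common
// percent-encoded sequences. The default list is illustrative, not complete.
function containsEncodedChar(string $url, array $needles = ['%20', '%22', '%25', '%2F', '%3A', '%3F']): bool
{
    foreach ($needles as $needle) {
        // stripos() is case-insensitive, so "%2f" and "%2F" both match.
        if (stripos($url, $needle) !== false) {
            return true;
        }
    }
    return false;
}
```

If this returns false, the input can be treated as raw and encoded; if it returns true, it can be left as-is. A single regular expression such as /%[0-9A-Fa-f]{2}/ passed to preg_match() would catch every percent-encoded byte rather than a fixed list, at the cost of the same ambiguity for literal "%" characters.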