Urlencoded forward slash is breaking URL

Question

About the system

I have URLs of this format in my project:

http://project_name/browse_by_exam/type/tutor_search/keyword/class/new_search/1/search_exam/0/search_subject/0

Here the keyword/class pair means: search with the keyword "class".

I have a common index.php file that is executed for every module in the project. The only rewrite rule is the one that removes index.php from the URL:

RewriteCond $1 !^(index.php|resources|robots.txt)
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php [L,QSA]

I use urlencode() when building the search URL and urldecode() when reading it.

The problem

Only the forward slash character breaks the URL, causing a 404 Page Not Found error. For example, if I search for one/two, the URL is

http://project_name/browse_by_exam/type/tutor_search/keyword/one%2Ftwo/new_search/1/search_exam/0/search_subject/0/page_sort/

How do I fix this? I need to keep index.php hidden in the URL. If that were not required, the forward slash would not be a problem and I could use this URL:

http://project_name/index.php?browse_by_exam/type/tutor_search/keyword/one%2Ftwo/new_search/1/search_exam/0/search_subject/0

Recommended answer

By default, Apache rejects any URL with %2F in the path part, for security reasons: scripts normally (i.e. without rewriting) cannot tell the difference between %2F and /, because the PATH_INFO environment variable is automatically URL-decoded. (This is stupid, but it is a long-standing part of the CGI specification, so nothing can be done about it.)
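To make the ambiguity concrete, here is a minimal sketch, assuming the request actually reaches the script and that the raw, still percent-encoded path is available (for example from $_SERVER['REQUEST_URI']). The helper name splitRawPath() and the shortened example path are made up for illustration; they are not part of the original question or answer.

```php
<?php
// Hypothetical helper: split a raw (still percent-encoded) request path into
// segments first, and only then decode each segment.
function splitRawPath(string $rawUri): array
{
    // Drop the query string, if any; the path is everything before the first "?".
    $path = explode('?', $rawUri, 2)[0];
    $segments = explode('/', trim($path, '/'));
    return array_map('rawurldecode', $segments);
}

// Made-up, shortened path in the style of the question's URLs.
$raw = '/keyword/one%2Ftwo/new_search/1';

// What a script sees when it is only given an already-decoded path
// (the PATH_INFO situation): the keyword has been split in two.
$decodedFirst = explode('/', trim(rawurldecode($raw), '/'));
// ['keyword', 'one', 'two', 'new_search', '1']

// Splitting the raw path first keeps the keyword intact.
$splitFirst = splitRawPath($raw);
// ['keyword', 'one/two', 'new_search', '1']
```

Once the path has been decoded as a whole, the information about which slashes were separators and which were data is gone, so the order of "split" and "decode" is what matters here.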

You can lift this restriction with the AllowEncodedSlashes directive, but note that other web servers will still disallow %2F (with no option to turn that off), that other characters may also be taboo (e.g. %5C), and that %00 in particular will always be blocked by both Apache and IIS. So if your application relies on being able to have %2F or other such characters in a path part, you are limiting your compatibility and deployment options.
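As a rough sketch of what that looks like in practice, the directive is a single line in the Apache configuration. The choice of NoDecode here is an assumption made for illustration, not something the answer prescribes; which value suits your application depends on how it reads the path.

```apache
# Goes in the server config or a <VirtualHost> block; it is not allowed in .htaccess.
# NoDecode: accept %2F in the path and leave it encoded.
# On:       accept %2F in the path, but decode it to "/" like any other escape.
# Off:      the default; such URLs are refused.
AllowEncodedSlashes NoDecode
```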

"I am using urlencode() while preparing the search URL"

You should use rawurlencode(), not urlencode(), for escaping path parts. urlencode() is misnamed: it is actually meant for application/x-www-form-urlencoded data, such as the query string or the body of a POST request, and not for other parts of the URL.

The difference is that + does not mean a space in path parts. rawurlencode() correctly produces %20 instead, which works both in form-encoded data and in other parts of the URL.
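A short sketch of the difference, and of building a path one segment at a time. The helper name buildPath(), the sample keyword, and the shortened segment list are hypothetical and only serve the illustration.

```php
<?php
// Hypothetical helper: build a URL path from individual segments, encoding
// each segment with rawurlencode() before joining them with "/".
function buildPath(array $segments): string
{
    return '/' . implode('/', array_map('rawurlencode', $segments));
}

// Made-up search keyword containing both a space and a slash.
$keyword = 'one two/three';

echo urlencode($keyword), "\n";     // one+two%2Fthree   (space becomes "+")
echo rawurlencode($keyword), "\n";  // one%20two%2Fthree (space becomes "%20")

echo buildPath(['browse_by_exam', 'keyword', $keyword, 'new_search', '1']), "\n";
// /browse_by_exam/keyword/one%20two%2Fthree/new_search/1
```

Both functions encode the slash as %2F, so switching to rawurlencode() fixes how spaces are written in the path but does not by itself get an encoded slash past Apache's default restriction described above.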
