Trouble with UTF-8 chars & Apache2 rewrite rules
Question
I saw the post http://stackoverflow.com/questions/2565864/validating-utf-8-in-htaccess-rewrite-rule and I think it is great, but I am hitting a more fundamental problem first:
I need to handle UTF-8 characters in query string parameters, in the names of directories and files, in text displayed to users, and so on.
I configured Apache with AddDefaultCharset utf-8, and PHP as well, if that matters. My original rewrite rule rejected everything except plain A-Za-z, underscore and hyphen, and it worked: anything else gave a 404, which is what I want. Now, however, everything seems to match, including things I don't want. And even though the URL seems to match, the captured value does not end up in the query string unless it is a plain A-Za-z_- string.
I find this confusing, because the rule says to put whatever was matched into the query string.
Here is the original rule:
RewriteRule ^/puzzle/([A-Za-z_-]+)$ /puzzle.php?g=$1 [NC]
And here is the revised rule:
RewriteRule ^/puzzle/(\w+)$ /puzzle.php?g=$1 [NC]
I made the change because I read somewhere that \w matches ALL the alphabetic characters, whereas A-Z etc. only matches the ones without accents.
It doesn't seem to matter which of those rules I use. Here is what happens:
In the application I have this:
echo $_GET['g'];
If I feed it a URL like http://mydomain.com/puzzle/USA it echoes "USA" and works fine.
If I feed it a URL like http://mydomain.com/puzzle/México it echoes nothing, warns me that index g is not defined, and of course doesn't load the resources for Mexico.
If I feed it a URL like http://mydomain.com/puzzle/fuzzle/buzzle/j.qle it does the same thing.
This last case should be a 404!
It does this no matter which of the above rules I use. I configured a rewrite log:
RewriteLogLevel 5
RewriteLog /opt/local/apache2/logs/puzzles.httpd.rewrite
but it is empty.
Here is the regular access log (it reports a status of 200):
[26/May/2010:11:21:42 -0700] "GET /puzzle/M%C3%A9xico HTTP/1.1" 200 342
[26/May/2010:11:21:54 -0700] "GET /puzzle/M/l.foo HTTP/1.1" 200 342
What can I do to get these $%#$@(*#@!!! characters, but not slashes, dots or other non-alphabetic characters, into my program? And once they are there, will they be decoded correctly? Would POSIX character classes work any better? Is there anything else I need to configure?
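On the decoding part of the question, here is a minimal sketch of what the `M%C3%A9xico` from the access log above looks like once the percent escapes are decoded. It uses only `rawurldecode()`, `strlen()` and `preg_match()` with the `u` modifier; the variable names are made up for illustration, and it assumes the browser sent the path as UTF-8, which is what the `%C3%A9` in the log suggests.

```php
<?php
// The path segment exactly as it appears in the access log.
$encoded = 'M%C3%A9xico';

// rawurldecode() turns each %XX escape back into a single byte.
$decoded = rawurldecode($encoded);

// "é" is stored as the two bytes C3 A9, so 6 characters occupy 7 bytes.
$byteLength = strlen($decoded);

// With the "u" modifier, preg_match() returns 1 only for valid UTF-8 input.
$isUtf8 = preg_match('//u', $decoded) === 1;

echo $decoded, "\n";
```

So the bytes that arrive are already UTF-8 after decoding; the open question is whether the rewrite rule lets them through.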
Recommended answer
This solution is based on: http://www.dracos.co.uk/code/apache-rewrite-problem/
Try these rewrite rules:
AddDefaultCharset UTF-8
RewriteEngine On
RewriteCond %{THE_REQUEST} /puzzle/([^?\ /]+)
RewriteRule ^puzzle/(.*)$ puzzle.php/%1 [L]
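To see what the RewriteCond pattern picks up, the sketch below applies the same regular expression to two request lines from the question's access log, using PHP's `preg_match()`. This is only an approximation of what mod_rewrite does, and `captureSegment` is a hypothetical helper name, not part of Apache or PHP.

```php
<?php
// Hypothetical helper: applies the RewriteCond pattern to a raw request line
// and returns the first capture group (what Apache exposes as %1).
function captureSegment(string $requestLine): ?string
{
    // Same pattern as the RewriteCond, with '#' as the PHP regex delimiter.
    if (preg_match('#/puzzle/([^?\ /]+)#', $requestLine, $m) === 1) {
        return $m[1];
    }
    return null;
}

var_dump(captureSegment('GET /puzzle/M%C3%A9xico HTTP/1.1'));
var_dump(captureSegment('GET /puzzle/M/l.foo HTTP/1.1'));
```

The first call yields the still-encoded `M%C3%A9xico`, because the request line has not been unescaped. The second yields just `M`: the capture stops at the next slash, so the rule alone does not reject deeper paths, which is why the slash check has to happen in PHP.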
How to get the query param:
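<?php
// The rewritten URL is puzzle.php/<name>, so PATH_INFO is "/<name>";
// drop the leading slash to get the query param.
$pathInfo = isset($_SERVER['PATH_INFO']) ? $_SERVER['PATH_INFO'] : '';
$g = substr($pathInfo, 1);

// Everything after the 8-character "/puzzle/" prefix of the original URL;
// a '/' in here means extra path segments, which should be a 404.
$g2 = substr($_SERVER['REQUEST_URI'], 8);
if (strpos($g2, '/') === false) {
    // do stuff
    echo "<p>g: " . htmlspecialchars($g, ENT_QUOTES, 'UTF-8') . "</p>";
} else {
    // Send the 404 status before any output
    http_response_code(404);
    echo "<p>404</p>";
}
?>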
With this solution you have to send the 404 from PHP.
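Since the 404 now has to come from PHP, the validation the original rule did (letters, underscore and hyphen only) can move there too. The following is a rough sketch under some assumptions: `puzzleNameOrNull` is a hypothetical helper, the URL prefix is `/puzzle/` as in the question, and "letters" is taken to mean the Unicode property `\p{L}`, which PHP's PCRE supports with the `u` modifier.

```php
<?php
// Hypothetical helper: returns the decoded puzzle name,
// or null when the request should be answered with a 404.
function puzzleNameOrNull(string $requestUri): ?string
{
    $prefix = '/puzzle/';
    if (strncmp($requestUri, $prefix, strlen($prefix)) !== 0) {
        return null;
    }

    // Drop the prefix and any query string, then decode the %XX escapes.
    $rest = (string) substr($requestUri, strlen($prefix));
    $segment = rawurldecode(explode('?', $rest, 2)[0]);

    // Letters in any script, underscore and hyphen only. Slashes, dots and
    // invalid UTF-8 all fail this match.
    return preg_match('/^[\p{L}_-]+\z/u', $segment) === 1 ? $segment : null;
}

$name = puzzleNameOrNull($_SERVER['REQUEST_URI'] ?? '');
if ($name === null) {
    http_response_code(404); // must run before any output is sent
    echo "<p>404</p>";
} else {
    echo "<p>g: " . htmlspecialchars($name, ENT_QUOTES, 'UTF-8') . "</p>";
}
```

Decoding before validating means an encoded slash such as `%2F` is rejected as well, not just a literal one.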