REQUEST_URI not matching explicit path and filename

Problem description

Really stumped, because form and syntax seem fine.

RewriteCond for REQUEST_URI is not matching the explicit path and filename. When isolated, RewriteCond for REQUEST_FILENAME matches just fine. I have verified using phpinfo() that REQUEST_URI contains the leading slash, and have tested without the leading slash, also.

The goal here is to know that the request is for this file and, if it doesn't exist, then throw a 410.

RewriteCond %{REQUEST_URI} ^/dir1/dir2/dir3/v_9991_0726dd5b5e8dd67a214c0c243436d131_all.css$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ - [R=410,L]

I don't want to omit the first Cond, because I only want to do this for a handful of files similar to this one.

UPDATE I

Trying to get a definitive test. Test setup:

  • testmee.txt does not exist
  • request is for testmee.txt in the root
  • verified the request_uri is matching, by redirecting to google
  • cannot get 410 when using only first Cond
  • (when using only first Cond, server serves 404, not 410)
  • (using both Conds, server serves 404, not 410)
  • CAN get 410 when using only second Cond

RewriteCond %{REQUEST_URI} ^/testmee.txt$
#RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ - [R=410,L]

versus

#RewriteCond %{REQUEST_URI} ^/testmee.txt$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ - [R=410,L]

UPDATE II

Response for MrWhite:

Ugh, same symptom. Might have to live with Googlebot hitting 404s instead of the desired 410 for outdated css/js. No biggie in the long run, probably.

Thank you for that request_uri test redirect. Everything is working normally in those tests. Page names, etc. are returned as expected, in the var= rewrite URL.

At this point, I think it must be some internal handling of 404s related to the file type extensions. See clue below. I have Prestashop shopping cart software, and it must be forcing 404s on file types.

This will redirect to google (to affirm pattern match):

RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^testmee.txt$ http://www.google.com/ [L]
(L flag is needed or else other Rules further down will interfere.)

This will continue to return 404 instead of 410:

RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^testmee.txt$ - [NC,R=410]

And as a control test, this will return a 410:

RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^.*$ - [NC,R=410]

If the file type is css in the above failed test, then my custom 404 controller does not get invoked. I just get a plain 404 response, without the custom 404 page that is wrapped in all my site templating.

For example:

RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^testmee.css$ - [NC,R=410]

I'm afraid I've wasted some of your time. My apologies. I never imagined that Prestashop's code would be forcing 404 based on file type, but I can't see any other explanation. I could dig into it and maybe find the spot in the Controllers that is doing it. Gotta take a break, though.

Solution

This isn't really a solid answer, more a list of things to try to help debug this and to quash some myths...

> I have verified using phpinfo() that REQUEST_URI contains the leading slash

Yes, the REQUEST_URI Apache server variable does indeed contain the leading slash. It contains the full URL-path.

However, the REQUEST_URI Apache server variable is not necessarily the same as the $_SERVER['REQUEST_URI'] PHP superglobal - in fact, they aren't really the same thing at all. There are some significant differences between these variables (in some ways it's perhaps a bit unfortunate they share the same name). Notably, the PHP superglobal contains the initial URL from the request and includes the query string (if any) and is not %-decoded. Whereas the Apache server variable of the same name contains the rewritten URL (not necessarily the requested URL) and does not contain the query string and is %-decoded.

So, that's why I was asking whether you have other mod_rewrite directives. You could very well have had a conflict. If another directive rewrites the URL, then the condition will never match (despite the PHP superglobal suggesting that it should).
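
One way to take earlier rewrites out of the picture while debugging is to test THE_REQUEST instead, which holds the raw request line and does not change when other directives rewrite the URL. The following is only a sketch, not something from the original exchange: it assumes the .htaccess file sits in the document root and reuses the testmee.txt name from the question. Note that REQUEST_FILENAME can still reflect an earlier rewrite, so the block is meant to go near the top of the file.

```apache
RewriteEngine On

# THE_REQUEST holds the first line of the HTTP request, e.g.
#   GET /testmee.txt?foo=1 HTTP/1.1
# It is not %-decoded and is not affected by earlier rewrites.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/testmee\.txt[?\s]
# Only act when the requested file does not exist on disk.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^ - [R=410]
```

If this version returns a 410 while the REQUEST_URI version does not, that would point towards another directive rewriting the URL first.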

> It seemed that if I put this at the top, the Last flag would end processing for that trip through, return the 410

This directive should certainly go at the top of the .htaccess file, to avoid the URL being rewritten earlier. The L flag is actually superfluous when used with R=410 (or any status other than a 3xx) - it is implied in this case.

> Then I change the result to be "throw a 410" and it throws a 404.

That can certainly be caused by a server-side override. But you are able to throw a 410 in other situations, so that would seem to rule that out. However, you can reset the error document in .htaccess if in doubt (unless you are already using a custom error document):

ErrorDocument 410 default

> RewriteCond %{REQUEST_URI} ^/dir1/dir2/dir3/v_9991_0726dd5b5e8dd67a214c0c243436d131_all.css$
> RewriteCond %{REQUEST_FILENAME} !-f
> RewriteRule ^(.*)$ - [R=410,L]

Whilst this doesn't really make a difference to how the rule behaves, you don't need the first RewriteCond directive that checks against the REQUEST_URI. You should be doing this check in the RewriteRule pattern instead (which will be more efficient, since this is processed first). For example:

RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^dir1/dir2/dir3/v_9991_0726dd5b5e8dd67a214c0c243436d131_all.css$ - [NC,R=410]

The NC flag should be superfluous.
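
Since the goal is to cover only a handful of files, the same rule could list them with alternation in the RewriteRule pattern. This is a sketch under a few assumptions: the .htaccess file is in the document root, old_theme.js is a hypothetical second filename added purely for illustration, and the dots are escaped so they match literally.

```apache
RewriteEngine On

# Return 410 Gone for specific, known-obsolete assets that no longer exist on disk.
# old_theme.js is a made-up example name; replace it with the real obsolete files.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^dir1/dir2/dir3/(v_9991_0726dd5b5e8dd67a214c0c243436d131_all\.css|old_theme\.js)$ - [R=410]
```

The G flag is shorthand for R=410, so `[G]` should behave the same way here.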

Still, a conflict with existing directives is the most probable cause. Remove all other directives. Do you still see the same behaviour?


You can test the value of the REQUEST_URI server variable. You could either issue a redirect and pass the REQUEST_URI as a URL parameter, or set environment variables (but you will need to look out for REDIRECT_<var> for each rewrite).

For example, at the top of your .htaccess (or wherever you are trying this):

RewriteCond %{QUERY_STRING} ^$
RewriteRule ^ /test.php?var=%{REQUEST_URI} [NE,R,L]

Create a dummy test.php file first, to avoid an internal subrequest to an error document.
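
The environment-variable alternative mentioned above could look something like this. It is a minimal sketch, with ORIG_URI as an arbitrary, hypothetical variable name:

```apache
RewriteEngine On

# Record what mod_rewrite sees as REQUEST_URI at this point in the file.
# ORIG_URI is an illustrative name, not a built-in variable.
RewriteRule ^ - [E=ORIG_URI:%{REQUEST_URI}]
```

After an internal rewrite or an error-document subrequest, the value is exposed with a REDIRECT_ prefix (REDIRECT_ORIG_URI), so look for both names, for instance in the phpinfo() output.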
