NGINX remove .html extension
Question
So, I found an answer to removing the .html extension on my page, that works fine with this code:
server {
    listen 80;
    server_name _;
    root /var/www/html/;
    index index.html;
    if (!-f "${request_filename}index.html") {
        rewrite ^/(.*)/$ /$1 permanent;
    }
    if ($request_uri ~* "/index.html") {
        rewrite (?i)^(.*)index.html$ $1 permanent;
    }
    if ($request_uri ~* ".html") {
        rewrite (?i)^(.*)/(.*).html $1/$2 permanent;
    }
    location / {
        try_files $uri.html $uri $uri/ /index.html;
    }
}
But if I open mypage.com, it redirects me to mypage.com/index.
Wouldn't declaring index.html as the index fix this? Any help is appreciated.
Recommended answer
The "Holy Grail" Solution for Removing ".html" in NGINX:
UPDATED ANSWER: This question piqued my curiosity, and I went on another, more in-depth search for a "holy grail" solution for .html redirects in NGINX. Here is the link to the answer I found, since I didn't come up with it myself: https://stackoverflow.com/a/32966347/4175718
However, I'll give an example and explain how it works. Here is the code:
location / {
    # Match on the original client request; the dot must be escaped.
    if ($request_uri ~ ^/(.*)\.html) {
        return 302 /$1;
    }
    try_files $uri $uri.html $uri/ =404;
}
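To show how this fits the configuration from the question, here is a minimal sketch of the whole server block. It assumes the three server-level if/rewrite blocks from the question are dropped entirely and that listen, server_name, root and index stay as the asker had them; it has not been tested against the asker's site.

```nginx
server {
    listen 80;
    server_name _;
    root /var/www/html/;
    index index.html;

    location / {
        # Redirect only when the URI the client actually sent contains
        # the .html extension; internal redirects leave $request_uri unchanged.
        if ($request_uri ~ ^/(.*)\.html) {
            return 302 /$1;
        }

        # Try the URI as-is, then with .html appended, then as a directory.
        try_files $uri $uri.html $uri/ =404;
    }
}
```

With this layout, a request for /about.html should be redirected to /about, a request for /about should be served from about.html, and a request for / should serve index.html without a redirect to /index.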
What's happening here is a pretty ingenious use of the if directive. NGINX runs a regex on the $request_uri portion of incoming requests. The regex checks if the URI has an .html extension and then stores the extension-less portion of the URI in the built-in variable $1.
From the docs, since it took me a while to figure out where the $1 came from:
Regular expressions can contain captures that are made available for later reuse in the $1..$9 variables.
The regex both checks for the existence of unwanted .html requests and effectively sanitizes the URI so that it does not include the extension. Then, using a simple return statement, the request is redirected to the sanitized URI that is now stored in $1.
The best part about this, as original author cnst explains, is that
Due to the fact that $request_uri is always constant per request, and is not affected by other rewrites, it won't, in fact, form any infinite loops.
Unlike the rewrites, which operate on any .html request (including the invisible internal redirect to /index.html), this solution only operates on external URIs that are visible to the user.
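One way to observe the difference between the two variables is to echo them back in response headers while testing. The sketch below is only a debugging aid: the header names X-Debug-Request-Uri and X-Debug-Uri are made up for this example, and the lines should be removed once the behaviour is confirmed.

```nginx
location / {
    # Hypothetical debug headers (names invented for this example).
    # $request_uri is the URI exactly as the client sent it;
    # $uri is the current URI after internal redirects and try_files.
    add_header X-Debug-Request-Uri $request_uri always;
    add_header X-Debug-Uri $uri always;

    if ($request_uri ~ ^/(.*)\.html) {
        return 302 /$1;
    }
    try_files $uri $uri.html $uri/ =404;
}
```

For a request to /, the first header should still read / while the second should read /index.html, which is why a condition on $request_uri does not fire for the internal redirect to the index file.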
You will still need the try_files directive, as otherwise NGINX will have no idea what to do with the newly sanitized extension-less URIs. The try_files directive shown above will first try the new URL by itself, then try it with the ".html" extension, then try it as a directory name.
The NGINX docs also explain how the default try_files directive works. The default try_files directive is ordered differently than the example above, so the explanation below does not perfectly line up:
NGINX will first append .html to the end of the URI and try to serve it. If it finds an appropriate .html file, it will return that file and will maintain the extension-less URI. If it cannot find an appropriate .html file, it will try the URI without any extension, then the URI as a directory, and then finally return a 404 error.
UPDATE: What does the regex do?
The above answer touches on the use of regular expressions, but here is a more specific explanation for those who are still curious. The following regular expression (regex) is used:
^/(.*)\.html
This breaks down as:
^ : indicates the beginning of the line.
/ : match the character "/" literally. Forward slashes do NOT need to be escaped in NGINX.
(.*) : capturing group: match any character an unlimited number of times.
\. : match the character "." literally. This must be escaped with a backslash.
html : match the string "html" literally.
The capturing group (.*) is what contains the non-".html" portion of the URL. This can later be referenced with the variable $1. NGINX then redirects the client to that extension-less URL (return 302 /$1;), and when the new request arrives, the try_files directive internally re-appends the ".html" extension so the file can be located.
To retain query strings and arguments passed to a .html page, the return statement can be changed to:
return 302 /$1$is_args$args;
This should allow requests such as /index.html?test to redirect to /index?test instead of just /index.
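Put together, the query-string-preserving version might look like the sketch below. Note that the pattern here is a variation and not the one from the answer: because (.*) is greedy, the original pattern can also swallow a ".html" that appears inside the query string (for example /a.html?next=b.html would capture "a.html?next=b"), so this sketch restricts the capture to the path with [^?]* and requires the extension to be followed by "?" or the end of the URI.

```nginx
location / {
    # Variation (an assumption, not the original pattern):
    # [^?]*    captures the path only, stopping before any query string
    # (\?|$)   requires .html to end the path
    if ($request_uri ~ ^/([^?]*)\.html(\?|$)) {
        # $is_args is "?" when the request has arguments, empty otherwise.
        return 302 /$1$is_args$args;
    }
    try_files $uri $uri.html $uri/ =404;
}
```

With this sketch, /index.html?test should redirect to /index?test, while /index.html alone should redirect to /index with no trailing "?".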
From the NGINX page If Is Evil:
The only 100% safe things which may be done inside if in a location context are:
return ...;
rewrite ... last;
Also, note that you may swap out the '302' redirect for a '301'. A 301 redirect is permanent, and is cached by web browsers and search engines. If your goal is to permanently remove the .html extension from pages that are already indexed by a search engine, you will want to use a 301 redirect. However, if you are testing on a live site, it is best practice to start with a 302 and only move to a 301 when you are absolutely confident your configuration is working correctly.