NGINX remove .html extension


Question

So, I found an answer to removing the .html extension on my page, that works fine with this code:

server {
    listen 80;
    server_name _;
    root /var/www/html/;
    index index.html;

    if (!-f "${request_filename}index.html") {
        rewrite ^/(.*)/$ /$1 permanent;
    }

    if ($request_uri ~* "/index.html") {
        rewrite (?i)^(.*)index.html$ $1 permanent;
    }   

    if ($request_uri ~* ".html") {
        rewrite (?i)^(.*)/(.*).html $1/$2 permanent;
    }

    location / {
        try_files $uri.html $uri $uri/ /index.html;
    }
}

But if I open mypage.com it redirects me to mypage.com/index
Wouldn't this be fixed by declaring index.html as index? Any help is appreciated.

Recommended answer

The "Holy Grail" Solution for Removing ".html" in NGINX:

UPDATED ANSWER: This question piqued my curiosity, and I went on another, more in-depth search for a "holy grail" solution for .html redirects in NGINX. Here is the link to the answer I found, since I didn't come up with it myself: https://stackoverflow.com/a/32966347/4175718

However, I'll give an example and explain how it works. Here is the code:

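location / {
    # Match the original client request; the dot is escaped so it matches a literal "."
    if ($request_uri ~ ^/(.*)\.html) {
        return 302 /$1;
    }
    # Serve the extension-less URI: the exact file, then file + ".html", then a directory
    try_files $uri $uri.html $uri/ =404;
}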

What's happening here is a pretty ingenious use of the if directive. NGINX runs a regex on the $request_uri portion of incoming requests. The regex checks if the URI has an .html extension and then stores the extension-less portion of the URI in the built-in variable $1.

From the docs, since it took me a while to figure out where the $1 came from:

Regular expressions can contain captures that are made available for later reuse in the $1..$9 variables.

The regex both checks for the existence of unwanted .html requests and effectively sanitizes the URI so that it does not include the extension. Then, using a simple return statement, the request is redirected to the sanitized URI that is now stored in $1.

The best part about this, as original author cnst explains, is that

Due to the fact that $request_uri is always constant per request, and is not affected by other rewrites, it won't, in fact, form any infinite loops.

Unlike the rewrites, which operate on any .html request (including the invisible internal redirect to /index.html), this solution only operates on external URIs that are visible to the user.
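For contrast, here is a minimal sketch of the rewrite-based behaviour described above. It is a hypothetical illustration rather than a configuration taken from the question or the linked answer, and it assumes the root directory contains an index.html picked up by the index directive:

```
# Hypothetical contrast only; not a recommended configuration.
location / {
    # rewrite matches the current URI, which may already have been changed
    # internally. A request for "/" is internally redirected to "/index.html"
    # by the index directive, so this rule also fires for it and the client
    # is sent to "/index".
    rewrite ^/(.*)\.html$ /$1 redirect;
    try_files $uri $uri.html $uri/ =404;
}
```

Testing $request_uri instead avoids this, because that variable still holds "/" after the internal redirect.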

You will still need the try_files directive, as otherwise NGINX will have no idea what to do with the newly sanitized extension-less URIs. The try_files directive shown above will first try the new URL by itself, then try it with the ".html" extension, then try it as a directory name.

The NGINX docs also explain how the default try_files directive works. The default try_files directive is ordered differently than the example above so the explanation below does not perfectly line up:

NGINX will first append .html to the end of the URI and try to serve it. If it finds an appropriate .html file, it will return that file and will maintain the extension-less URI. If it cannot find an appropriate .html file, it will try the URI without any extension, then the URI as a directory, and then finally return a 404 error.
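The ordering described in that explanation would correspond to a directive along these lines. This is a sketch reconstructed from the description above, not quoted from the NGINX docs:

```
location / {
    # Try the ".html" file first, then the bare URI, then a directory, else 404.
    try_files $uri.html $uri $uri/ =404;
}
```

The main example in this answer checks $uri before $uri.html. The ordering only matters when both an extension-less file and an .html file with the same name exist.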

UPDATE: What does the regex do?

The above answer touches on the use of regular expressions, but here is a more specific explanation for those who are still curious. The following regular expression (regex) is used:

^/(.*)\.html

This breaks down into:

^: indicates the start of the line.

/: match the character "/" literally. Forward slashes do NOT need to be escaped in NGINX.

(.*): capturing group: match any character an unlimited number of times

\.: match the character "." literally. This must be escaped with a backslash.

html: match the string "html" literally.

The capturing group (.*) contains the non-".html" portion of the URL, which can later be referenced with the variable $1. NGINX then redirects the client to that extension-less URL (return 302 /$1;), and on the follow-up request the try_files directive internally re-appends the ".html" extension so the file can be located.

To retain query strings and arguments passed to a .html page, the return statement can be changed to:

return 302 /$1$is_args$args;

This should allow requests such as /index.html?test to redirect to /index?test instead of just /index.
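Put together, a location block that keeps the query string might look like the sketch below. The tighter pattern ([^?]* and the trailing (\?|$)) is an assumption added here, not part of the answer above. It is meant to stop the capture from running into the query string, since $request_uri includes it:

```
location / {
    # Assumed variant: capture only the path part before ".html", and require
    # ".html" to be followed by "?" or the end of the request URI.
    if ($request_uri ~ ^/([^?]*)\.html(\?|$)) {
        # $is_args is "?" when a query string is present, otherwise empty.
        return 302 /$1$is_args$args;
    }
    try_files $uri $uri.html $uri/ =404;
}
```

With this sketch, /index.html?test should redirect to /index?test, and a request such as /page.htmlx is left alone.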

From the NGINX page If Is Evil:

The only 100% safe things which may be done inside if in a location context are:

return ...;

rewrite ... last;


Also, note that you may swap out the '302' redirect for a '301'.

A 301 redirect is permanent, and is cached by web browsers and search engines. If your goal is to permanently remove the .html extension from pages that are already indexed by a search engine, you will want to use a 301 redirect. However, if you are testing on a live site, it is best practice to start with a 302 and only move to a 301 when you are absolutely confident your configuration is working correctly.
