NGINX Reverse Proxy Causes 502 Errors On Some Pages

Question

I have a Node.js/Express application running on an Ubuntu server. It sits behind an NGINX reverse proxy that passes traffic on port 80 (or 443 for SSL) to the application's port.

I've recently had an issue where, for no identifiable reason, requests to / eventually time out with a 504 error. As a test, I increased the timeout, and now I get a 502 error instead. I can access some other routes on my application, /login for example, with no problems.

When I restart my Express application, it runs fine with no issues, usually for a few days, until this happens again. In the logs for my Express app, a good request looks something like:

GET / 200 15.786 ms - 1214

Whereas requests that aren't responding properly look like this:

GET / - - ms - -

This application ran properly for about 13 months with no issues, and this problem appeared without any apparent trigger. I haven't pushed any updates during the period in which it started happening.

Here is my NGINX config (modified a bit for security, e.g. example.com):

upstream site_upstream {
    server 127.0.0.1:3000;
}

server {
    listen 80;
    listen 443 ssl;

    server_name example.com;
    ssl_certificate /etc/nginx/ssl/nginx.crt;
    ssl_certificate_key /etc/nginx/ssl/nginx.key;

    location / {
        proxy_pass http://site_upstream;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;
        proxy_redirect http://rpa_upstream https://example.com;
    }
}

I am unsure whether this is an issue with my NGINX config or with my application itself, as neither configuration has changed.

Recommended answer

This sounds like a memory leak in either nginx or your Node application. If the site starts working again after you restart the Node application without restarting nginx, the problem is most likely in your Node app.

Also try accessing your app directly, without the proxy, to see what happens in that case. You can sometimes get more detailed information that way from your browser's developer tools, from command-line tools like curl, or from benchmarking tools like Apache ab. Running a heavy benchmark with ab can help you reproduce the problem quickly instead of waiting days for it to recur.
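As a rough sketch of probing the app directly from Node, bypassing nginx: the helper name probe and the 5-second default timeout are assumptions for illustration, and 127.0.0.1:3000 is taken from the upstream block in the question.

```javascript
const http = require('http');

// Hypothetical helper: requests a URL directly and reports the status code
// and elapsed time, or rejects if no response arrives within timeoutMs.
function probe(url, timeoutMs = 5000) {
  return new Promise((resolve, reject) => {
    const started = Date.now();
    // agent: false uses a one-off connection instead of the shared pool.
    const req = http.get(url, { timeout: timeoutMs, agent: false }, (res) => {
      res.resume(); // discard the body; only status and timing matter here
      res.on('end', () => {
        resolve({ status: res.statusCode, ms: Date.now() - started });
      });
    });
    req.on('timeout', () => {
      req.destroy(new Error(`no response within ${timeoutMs} ms`));
    });
    req.on('error', reject);
  });
}
```

For example, probe('http://127.0.0.1:3000/') and probe('http://127.0.0.1:3000/login') could be compared side by side. If / hangs here as well, nginx is probably not the cause.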

Of course, it's hard to say exactly what the problem is when you don't show any code.

If it was working fine before, and you didn't upgrade anything (your app, any Node modules, or Node itself) during that time, then maybe your traffic increased slightly and you are now seeing problems that did not manifest before. Or maybe your system now uses more RAM for other tasks, so the memory leak becomes a problem sooner than it used to.

You can start logging the data returned by process.memoryUsage() at regular intervals and see if anything looks problematic.
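A minimal sketch of such periodic logging might look like the following. The helper names formatMemoryUsage and startMemoryLogging, the MiB formatting, and the one-minute default interval are all assumptions for illustration; only process.memoryUsage() comes from the answer itself.

```javascript
// Hypothetical helper: formats a process.memoryUsage() snapshot in MiB.
function formatMemoryUsage(usage) {
  const toMiB = (bytes) => (bytes / 1024 / 1024).toFixed(1);
  return `rss=${toMiB(usage.rss)}MiB heapTotal=${toMiB(usage.heapTotal)}MiB ` +
    `heapUsed=${toMiB(usage.heapUsed)}MiB external=${toMiB(usage.external)}MiB`;
}

// Hypothetical helper: logs one timestamped line every intervalMs milliseconds.
function startMemoryLogging(intervalMs = 60000, log = console.log) {
  const timer = setInterval(() => {
    const line = formatMemoryUsage(process.memoryUsage());
    log(`${new Date().toISOString()} memory ${line}`);
  }, intervalMs);
  timer.unref(); // do not keep the process alive just for this timer
  return timer;
}
```

Calling startMemoryLogging() once at application startup would be enough. An rss or heapUsed value that keeps climbing over several days without leveling off would be consistent with a leak.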

Also monitor your Node processes with ps, top, htop or similar commands, or check their memory usage in /proc/PID/status.

You can also monitor /proc/meminfo at regular intervals and see whether the total memory used on your system correlates with your application becoming unresponsive.
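One possible way to sample /proc/meminfo from the same Node process is sketched below. The helper names parseMeminfo and readSystemMemory are assumptions, and the sketch assumes a Linux kernel recent enough to report the MemAvailable field.

```javascript
const fs = require('fs');

// Hypothetical helper: parses /proc/meminfo text into { FieldName: valueInKb }.
function parseMeminfo(text) {
  const result = {};
  for (const line of text.split('\n')) {
    const match = /^([^:]+):\s+(\d+)/.exec(line);
    if (match) {
      result[match[1]] = Number(match[2]);
    }
  }
  return result;
}

// Hypothetical helper: reads total and available system memory (Linux only).
// Assumes the kernel reports MemAvailable; older kernels may not.
function readSystemMemory(path = '/proc/meminfo') {
  const info = parseMeminfo(fs.readFileSync(path, 'utf8'));
  return { totalKb: info.MemTotal, availableKb: info.MemAvailable };
}
```

Logging readSystemMemory() with a timestamp next to the application's request log makes it easier to see whether low available memory lines up with the unresponsive periods.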

Another possible cause is connections to your database responding slowly or not at all, if you are not handling errors and timeouts inside your application. Adding more extensive logging (a line on entering every route handler, a line before every I/O operation starts, and a line after every I/O operation succeeds, fails, or times out) should give you more insight into it.
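The logging described above could be sketched as a small wrapper around promise-returning I/O calls. The names withTimeout and loggedIo, the 5-second default, and the log format are assumptions for illustration, not part of Express or any database driver.

```javascript
// Hypothetical helper: rejects if the promise does not settle within ms.
// Note: this does not cancel the underlying operation, it only stops waiting.
function withTimeout(promise, ms, label) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => {
      reject(new Error(`${label} timed out after ${ms} ms`));
    }, ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Hypothetical helper: logs before the operation starts and after it
// succeeds, fails, or times out. operation is a function returning a promise.
async function loggedIo(label, operation, ms = 5000, log = console.log) {
  const started = Date.now();
  log(`${label}: start`);
  try {
    const result = await withTimeout(operation(), ms, label);
    log(`${label}: ok after ${Date.now() - started} ms`);
    return result;
  } catch (err) {
    log(`${label}: failed after ${Date.now() - started} ms: ${err.message}`);
    throw err;
  }
}
```

Inside a route handler, a database call could then be wrapped as await loggedIo('load home page data', () => someQuery()), where someQuery stands in for whatever the handler actually calls. A "start" line with no matching "ok" or "failed" line points at the operation that is hanging.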
