502 Gateway Errors under High Load (nginx/php-fpm)


Problem description

I work for a rather busy internet site that often gets very large spikes of traffic. During these spikes, hundreds of pages per second are requested, and this produces random 502 gateway errors.

We currently run Nginx (1.0.10) and PHP-FPM on a machine with 4x SAS 15k drives (RAID 10), a 16-core CPU, and 24GB of DDR3 RAM. We also use the latest XCache version. The DB is located on another machine, but that machine's load is very low and it has no issues.

Under normal load everything runs perfectly: system load is below 1, and the PHP-FPM status report never really shows more than 10 active processes at one time. There is always about 10GB of RAM still available. Under normal load the machine handles about 100 pageviews per second.

The problem arises when huge spikes of traffic arrive and hundreds of pageviews per second are requested from the machine. I notice that FPM's status report then shows up to 50 active processes, but that is still way below the limit of 300 (pm.max_children) that we have configured. During these spikes the Nginx status page reports up to 5000 active connections instead of the normal average of 1000.

OS Info: CentOS release 5.7 (Final)

CPU: Intel(R) Xeon(R) CPU E5620 @ 2.40GHz (16 cores)

php-fpm.conf

daemonize = yes
listen = /tmp/fpm.sock
pm = static
pm.max_children = 300
pm.max_requests = 1000

I have not set rlimit_files, because as far as I know PHP-FPM uses the system default when it is not set.

fastcgi_params (only added values to standard file)

fastcgi_connect_timeout 60;
fastcgi_send_timeout 180;
fastcgi_read_timeout 180;
fastcgi_buffer_size 128k;
fastcgi_buffers 4 256k;
fastcgi_busy_buffers_size 256k;
fastcgi_temp_file_write_size 256k;
fastcgi_intercept_errors on;

fastcgi_pass            unix:/tmp/fpm.sock;

nginx.conf

worker_processes        8;
worker_connections      16384;
sendfile                on;
tcp_nopush              on;
keepalive_timeout       4;

Nginx connects to FPM via Unix Socket.

sysctl.conf

net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 1
kernel.core_uses_pid = 1
net.ipv4.tcp_syncookies = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.shmmax = 68719476736
kernel.shmall = 4294967296
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0
net.ipv4.tcp_max_syn_backlog = 2048
net.ipv4.icmp_echo_ignore_broadcasts = 1
net.ipv4.conf.all.accept_source_route = 0
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.all.secure_redirects = 0
net.ipv4.conf.all.log_martians = 1
net.ipv4.conf.default.accept_redirects = 0
net.ipv4.conf.default.secure_redirects = 0
net.ipv4.icmp_echo_ignore_broadcasts = 1
net.ipv4.icmp_ignore_bogus_error_responses = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.tcp_timestamps = 0
net.ipv4.conf.all.rp_filter=1
net.ipv4.conf.default.rp_filter=1
net.ipv4.conf.eth0.rp_filter=1
net.ipv4.conf.lo.rp_filter=1
net.ipv4.ip_conntrack_max = 100000

limits.conf

* soft nofile 65536
* hard nofile 65536

These are the results for the following commands:

ulimit -n
65536

ulimit -Sn
65536

ulimit -Hn
65536

cat /proc/sys/fs/file-max
2390143

Question: If PHP-FPM is not running out of connections, the load is still low, and there is plenty of RAM available, what bottleneck could be causing these random 502 gateway errors during high traffic?

Note: by default this machine's ulimits were 1024. Since changing them to 65536 I have not fully rebooted the machine, as it is a production machine and a reboot would mean too much downtime.

Solution

This should fix it...

You have: fastcgi_buffers 4 256k;

Change it to: fastcgi_buffers 256 16k; (4096k total, up from 1024k)

Also set fastcgi_max_temp_file_size 0; that will disable buffering to disk if replies start to exceed your fastcgi buffers.
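
Put together, the buffer-related directives might look roughly like the fragment below. This is only a sketch: it assumes the other values from the question stay unchanged and that the directives sit wherever the existing fastcgi settings are included (http, server, or location level). The comments are annotations, not part of the original configuration.

```nginx
# Sketch, assuming the remaining fastcgi_* values from the question are kept.
fastcgi_buffer_size         128k;      # unchanged: buffer for the first part of the response
fastcgi_buffers             256 16k;   # was "4 256k"; 256 x 16k = 4096k per connection
fastcgi_busy_buffers_size   256k;      # unchanged
fastcgi_max_temp_file_size  0;         # never spill upstream responses to temp files on disk
```

After changing the values, `nginx -t` checks the configuration for errors before the reload.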
