调试独角兽postgres连接泄漏 [英] Debugging unicorn postgres connection leak

查看:287
本文介绍了调试独角兽postgres连接泄漏的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我们的新应用程式昨天流量很大,而且开始崩溃。错误是达到PG最大连接数。根据我们的配置,这很奇怪 -

Our new app had a lot of traffic yesterday and it started crashing. The error was PG max connections reached. This was very weird given our configuration -

PG max_connections = 100
Unicorn workers = 8
Sidekiq Processes = 1
Sidekiq Concurrency = 25

PG max_connections = 100 Unicorn workers = 8 Sidekiq Processes = 1 Sidekiq Concurrency = 25

那么技术上应该只有34个活动连接?我不认为我们在应用程序中有任何多线程。

So technically there should be only 34 active connections right? I dont think we have any multithreading in the app going on.

这是重新启动db服务器和应用程序后解决的。今天,我开始看到这些连接弹出窗口。

This was resolved once I restarted the db server and apps. Today, I am starting to see these connections popup.

检查 pg_stats_activity

prod_db=# select datid, datname, pid, usesysid, usename, application_name, state from pg_stat_activity;                                                                             
 datid |        datname        |  pid  | usesysid | usename  |                        application_name                         | state  
-------+-----------------------+-------+----------+----------+-----------------------------------------------------------------+--------
 16384 | prod_db | 30104 |       10 | postgres | unicorn worker[1] -c /u/apps/e...ig/unicorn.rb -E deployment -D | idle
 16384 | prod_db | 30094 |       10 | postgres | unicorn worker[0] -c /u/apps/e...ig/unicorn.rb -E deployment -D | idle
 16384 | prod_db | 30110 |       10 | postgres | unicorn worker[2] -c /u/apps/e...ig/unicorn.rb -E deployment -D | idle
 16384 | prod_db | 30116 |       10 | postgres | unicorn worker[3] -c /u/apps/e...ig/unicorn.rb -E deployment -D | idle
 16384 | prod_db | 30123 |       10 | postgres | unicorn worker[4] -c /u/apps/e...ig/unicorn.rb -E deployment -D | idle
 16384 | prod_db | 30129 |       10 | postgres | unicorn worker[5] -c /u/apps/e...ig/unicorn.rb -E deployment -D | idle
 16384 | prod_db | 30135 |       10 | postgres | unicorn worker[6] -c /u/apps/e...ig/unicorn.rb -E deployment -D | idle
 16384 | prod_db | 30157 |       10 | postgres | unicorn worker[7] -c /u/apps/e...ig/unicorn.rb -E deployment -D | idle
 16384 | prod_db | 32161 |       10 | postgres | unicorn worker[5] -c /u/apps/e...ig/unicorn.rb -E deployment -D | idle
 16384 | prod_db | 32183 |       10 | postgres | unicorn worker[7] -c /u/apps/e...ig/unicorn.rb -E deployment -D | idle
 16384 | prod_db | 32273 |       10 | postgres | unicorn worker[5] -c /u/apps/e...ig/unicorn.rb -E deployment -D | idle
 16384 | prod_db | 32296 |       10 | postgres | unicorn worker[2] -c /u/apps/e...ig/unicorn.rb -E deployment -D | idle
 16384 | prod_db |   374 |       10 | postgres | unicorn worker[1] -c /u/apps/e...ig/unicorn.rb -E deployment -D | idle
 16384 | prod_db |   491 |       10 | postgres | sidekiq 3.4.2 app_production [0 of 25 busy]                     | idle
 16384 | prod_db |   498 |       10 | postgres | unicorn worker[7] -c /u/apps/e...ig/unicorn.rb -E deployment -D | idle
 16384 | prod_db |   581 |       10 | postgres | unicorn worker[3] -c /u/apps/e...ig/unicorn.rb -E deployment -D | idle
 16384 | prod_db |  1337 |       10 | postgres | psql                                                            | active
(17 rows)

很奇怪,我看到多个连接归属于单独的独角兽工作进程。

Its weird that I am seeing multiple connections attributed to single unicorn worker processes.

我正在读这个错误吗?我的假设是,这是postgres连接的致命事件的构建块,当有重大负载时。

Am I reading this correct? My hypothesis is that, this is the building block to the fatal event of postgres connections running out when there is a major load.

如果是真的,一个调试这个?任何指针?谢谢! :)

If that is true, how can one debug this? Any pointers? Thanks! :)

希望在需要时分享更多详情。

Would love to share more details if needed.

postgresql.conf

data_directory = '/var/lib/postgresql/9.4/main'
datestyle = 'iso, mdy'
default_text_search_config = 'pg_catalog.english'
external_pid_file = '/var/run/postgresql/9.4-main.pid'
hba_file = '/etc/postgresql/9.4/main/pg_hba.conf'
ident_file = '/etc/postgresql/9.4/main/pg_ident.conf'
listen_addresses = 'localhost'
log_line_prefix = '%t '
max_connections = 100
port = 5432
shared_buffers = '24MB'
ssl = on
ssl_cert_file = '/etc/ssl/certs/ssl-cert-snakeoil.pem'
ssl_key_file = '/etc/ssl/private/ssl-cert-snakeoil.key'
unix_socket_directories = '/var/run/postgresql'

unicorn.rb - https://gist.github.com/steverob/b83e41bb49d78f9aa32f79136df5af5f

database.yml -

production:
  adapter: postgresql
  host: localhost
  username: postgres
  password: app_name
  pool: 40
  timeout: 5000
  database: app_production
  encoding: utf8


推荐答案

通常,应用程序中的任何线程在DB中执行查询当它以任何方式使用 ActiveRecord :: Base.connection 时,它将获得与数据库的连接,与其他线程分离 ConnectionPool class 负责将数据库连接分配给Rails中的线程。请阅读类文档以获取更多信息。

In general, any thread in the application, when it executes a query in the DB (i.e. when it uses ActiveRecord::Base.connection in any way), it will get a connection to the DB, separate from other threads. The ConnectionPool class takes care of allocating DB connections to threads in Rails. Please read the class documentation for more info.

池最多可以 pool database.yml中的配置选项)大。因此,如果您将 pool 选项设置为40,则每个进程(独角兽worker)最多可以打开40个连接,如果它使用40 ) ,所以在您的情况下,独角兽可以吃多达40 * 8 = 200个连接

The pool can be up to pool (the configuration option in the database.yml) big. Thus, if you set the pool option to 40, each process (unicorn worker) may open up to 40 connections if it used 40 (or more) threads, so in your case, unicorn alone may eat up to 40 * 8 = 200 connections.

所以,总的来说,我认为你必须在你的应用程序的某个地方(或许在某个地方的宝石)。如果这与您的其他问题中的应用程序相同,那么我猜它肯定与多个生成线程相关。

So, overall I think you must have some threading somewhere in your app (perhaps in a gem somewhere). If this is the same application as in your other question, then I'd guess it's definitely related to the multiple spawning threads.

这篇关于调试独角兽postgres连接泄漏的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆