Node calling postgres function with temp tables causing "memory leak"


Problem Description

I have a node.js program calling a Postgres (Amazon RDS micro instance) function, get_jobs, within a transaction, 18 times a second, using the node-postgres package by brianc.

The node code is just an enhanced version of brianc's basic client pooling example, roughly like...

var pg = require('pg');
var conString = "postgres://username:password@server/database";

function getJobs(cb) {
  pg.connect(conString, function(err, client, done) {
    if (err) return console.error('error fetching client from pool', err);
    client.query("BEGIN;");
    client.query('select * from get_jobs()', [], function(err, result) {
      client.query("COMMIT;");
      done(); //call `done()` to release the client back to the pool
      if (err) console.error('error running query', err);
      cb(err, result);
    });
  });
}

function poll() {
  getJobs(function(err, jobs) {
    // process the jobs
  });
  setTimeout(poll, 55);
}

poll(); // start polling

So Postgres is getting:

2016-04-20 12:04:33 UTC:172.31.9.180(38446):XXX@XXX:[5778]:LOG:  statement: BEGIN;
2016-04-20 12:04:33 UTC:172.31.9.180(38446):XXX@XXX:[5778]:LOG:  execute <unnamed>: select * from get_jobs();
2016-04-20 12:04:33 UTC:172.31.9.180(38446):XXX@XXX:[5778]:LOG:  statement: COMMIT;

... repeated every 55ms.

get_jobs is written with temp tables, something like this:

CREATE OR REPLACE FUNCTION get_jobs (
) RETURNS TABLE (
  ...
) AS 
$BODY$
DECLARE 
  _nowstamp bigint; 
BEGIN

  -- take the current unix server time in ms
  _nowstamp := (select extract(epoch from now()) * 1000)::bigint;  

  --  1. get the jobs that are due
  CREATE TEMP TABLE jobs ON COMMIT DROP AS
  select ...
  from really_big_table_1 
  where job_time < _nowstamp;

  --  2. get other stuff attached to those jobs
  CREATE TEMP TABLE jobs_extra ON COMMIT DROP AS
  select ...
  from really_big_table_2 r
    inner join jobs j on r.id = j.some_id;

  ALTER TABLE jobs_extra ADD PRIMARY KEY (id);

  -- 3. return the final result with a join to a third big table
  RETURN query (

    select je.id, ...
    from jobs_extra je
      left join really_big_table_3 r on je.id = r.id
    group by je.id

  );

END
$BODY$ LANGUAGE plpgsql VOLATILE;

I've used the temp table pattern because I know that jobs will always be a small extract of rows from really_big_table_1, in hopes that this will scale better than a single query with multiple joins and multiple where conditions. (I used this to great effect with SQL Server and I don't trust any query optimiser now, but please tell me if this is the wrong approach for Postgres!)

The query runs in 8ms on small tables (as measured from node), ample time to complete one job "poll" before the next one starts.

Problem: After about 3 hours of polling at this rate, the Postgres server runs out of memory and crashes.

What I tried already...

  • If I re-write the function without temp tables, Postgres doesn't run out of memory, but I use the temp table pattern a lot, so this isn't a solution.

  • If I stop the node program (which kills the 10 connections it uses to run the queries) the memory frees up. Merely making node wait a minute between polling sessions doesn't have the same effect, so there are obviously resources that the Postgres backend associated with the pooled connection is keeping.

  • If I run a VACUUM while polling is going on, it has no effect on memory consumption and the server continues on its way to death.

  • Reducing the polling frequency only changes the amount of time before the server dies.

  • Adding DISCARD ALL; after each COMMIT; has no effect.

  • Explicitly calling DROP TABLE jobs; DROP TABLE jobs_extra; after RETURN query () instead of ON COMMIT DROPs on the CREATE TABLEs. Server still crashes.

  • Per CFrei's suggestion, added pg.defaults.poolSize = 0 to the node code in an attempt to disable pooling. The server still crashed, but took much longer and swap went much higher (second spike) than all the previous tests which looked like the first spike below. I found out later that pg.defaults.poolSize = 0 may not disable pooling as expected.

  • On the basis of this: "Temporary tables cannot be accessed by autovacuum. Therefore, appropriate vacuum and analyze operations should be performed via session SQL commands.", I tried to run a VACUUM from the node server (as an attempt to make VACUUM an "in session" command). I couldn't actually get this test working. I have many objects in my database, and VACUUM, operating on all objects, was taking too long to execute on each job iteration. Restricting VACUUM just to the temp tables was impossible - (a) you can't run VACUUM in a transaction and (b) outside the transaction the temp tables don't exist. :P EDIT: Later, on the Postgres IRC channel, a helpful chap explained that VACUUM isn't relevant for the temp tables themselves, but can be useful to clean up the rows that TEMP TABLES create and delete in pg_attribute (see the diagnostic sketch after this list). In any case, VACUUMing "in session" wasn't the answer.

  • DROP TABLE ... IF EXISTS before the CREATE TABLE, instead of ON COMMIT DROP. Server still dies.

  • CREATE TEMP TABLE (...) and insert into ... (select...) instead of CREATE TEMP TABLE ... AS, instead of ON COMMIT DROP. Server dies.
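
For reference, that kind of pg_attribute churn can be watched from a separate session with something like the query below (a sketch added purely for illustration, using the standard pg_stat_sys_tables statistics view; it was not one of the tests above):

-- dead vs live tuples in the catalogs that creating and dropping temp tables churns
SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_sys_tables
WHERE relname IN ('pg_attribute', 'pg_class', 'pg_type');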

So is ON COMMIT DROP not releasing all the associated resources? What else could be holding memory? How do I release it?

Solution

Use CTEs to create partial result sets instead of temp tables.

CREATE OR REPLACE FUNCTION get_jobs (
) RETURNS TABLE (
  ...
) AS 
$BODY$
DECLARE 
  _nowstamp bigint; 
BEGIN

  -- take the current unix server time in ms
  _nowstamp := (select extract(epoch from now()) * 1000)::bigint;  

  RETURN query (

    --  1. get the jobs that are due
    WITH jobs AS (

      select ...
      from really_big_table_1 
      where job_time < _nowstamp

    --  2. get other stuff attached to those jobs
    ), jobs_extra AS (

      select ...
      from really_big_table_2 r
        inner join jobs j on r.id = j.some_id

    ) 

    -- 3. return the final result with a join to a third big table
    select je.id, ...
    from jobs_extra je
      left join really_big_table_3 r on je.id = r.id
    group by je.id

  );

END
$BODY$ LANGUAGE plpgsql VOLATILE;

The planner will evaluate each block in sequence, which is exactly what I wanted to achieve with the temp tables.
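
One caveat worth noting for newer versions (this answer predates PostgreSQL 12): from PostgreSQL 12 onwards a CTE can be inlined into the outer query rather than acting as an optimization fence, so to keep the same block-by-block evaluation you would mark each CTE as MATERIALIZED, roughly:

WITH jobs AS MATERIALIZED (
  select ...
  from really_big_table_1
  where job_time < _nowstamp
), jobs_extra AS MATERIALIZED (
  select ...
  from really_big_table_2 r
    inner join jobs j on r.id = j.some_id
)
select je.id, ...
from jobs_extra je
  left join really_big_table_3 r on je.id = r.id
group by je.id;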

I know this doesn't directly solve the memory leak issue (I'm pretty sure there's something wrong with Postgres' implementation of them, at least the way they manifest on the RDS configuration).

However, the query works, it is planned the way I was intending, and after 3 days of running the job the memory usage is now stable and my server doesn't crash.
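
If you want to confirm how the statements inside the function are actually being planned, one option is the auto_explain module (a sketch, assuming auto_explain is available in your session; on RDS it typically has to be enabled through the parameter group rather than with LOAD):

-- log the plan of every statement, including those run inside plpgsql functions
LOAD 'auto_explain';                          -- needs superuser on stock Postgres
SET auto_explain.log_min_duration = 0;        -- log plans for all statements
SET auto_explain.log_nested_statements = on;  -- include statements inside functions
SELECT * FROM get_jobs();                     -- the plans appear in the server log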

I didn't change the node code at all.
