PostgreSQL 9.4 - prevent app from always selecting the latest updated rows

Question

I have a (Ruby on Rails 4) app that uses a PostgreSQL query which basically looks into an Opportunity table, searches randomly among the rows where 'available = true', and updates the selected rows with 'available = false'. Each time a user clicks on 'try an opportunity', the app runs the query below.

UPDATE opportunities s
SET    opportunity_available = false
FROM  (
   SELECT id
   FROM   opportunities
   WHERE  deal_id = #{@deal.id}
   AND    opportunity_available
   AND    pg_try_advisory_xact_lock(id)
   LIMIT  1
   FOR    UPDATE
   ) sub
WHERE     s.id = sub.id
RETURNING s.prize_id, s.id;
-- inspired by https://stackoverflow.com/questions/33128531/put-pg-try-advisory-xact-lock-in-a-nested-subquery

I've been struggling with this for 5 days, but I now understand roughly how it's behaving (badly). What I need to know is how to change it.

Indeed, the app "goes a little crazy" and selects (and updates) exactly the rows I updated most recently.

Let me be clear about how I create these opportunities:

  • As an admin, I create 20 opportunities (rows) via my app.

Then, in my admin panel, I create a prize with a quantity, let's say prize id = 45 with quantity = 4. The app goes into the Opportunity table and randomly fills (this part works perfectly) 4 rows with prize_id = 45.

Now a user runs the app and, as explained above, the app does not choose a row randomly but always begins with the latest updated ones: it seems to take one of the rows with a prize, then another with a prize, then another, and never those without prizes (with an empty prize_id)...

Now I did some manual experiments: I tried manually changing values in my table (via pgAdmin), and the weirdest thing happened. If I modify row 32, then row 45, then row 67, then when the user plays again, guess what, the rows that are picked "randomly" are exactly the ones I updated, in reverse order: it picks row 67, then row 45, then row 32. It won't even consider the other rows that could be chosen (all the other opportunities where available = true).

I also tried removing FOR UPDATE or the 'pg_try_advisory_xact_lock(id)' line, and it still seems to have the same issue.

Since, as an admin, I first create all 20 rows and THEN the 4 winning rows, the winning rows are the last to be updated (even if on my pgAdmin screen they stay in the same positions... maybe in the background PostgreSQL orders them as the most recently updated and selects them?). That's probably why, as soon as one of these 4 winning rows is selected, all the others follow.

To be clear, I would be OK with the app going row by row to select each available opportunity (e.g. row 3, then row 4, then row 5, in the order I see in pgAdmin, since the prizes are already assigned to rows completely at random). The problem is that it's not doing this: it often takes all the winning rows consecutively...

I'm speechless and have no clue how to break this pattern.

Note: this pattern does not happen 100% of the time, but very often. For example, if I have more than 4 winning rows and keep clicking as a user, it behaves as described here for about 2 minutes, then it stops and seems (I might be wrong) to behave normally, and then about a minute later it again selects only winning rows...

EDIT 1

Here is how prizes are injected into the Opportunity table (for example, when I create a prize whose id is 21 with quantity = 3): it assigns them randomly (as far as I understand), but only to rows that do not already have a prize_id (i.e. if an opportunity has an empty prize_id, the prize can be put there).

  SQL (2.4ms)  
  UPDATE "opportunities"
  SET "prize_id" = 21
  WHERE "opportunities"."id" 
  IN (
    SELECT "opportunities"."id"
    FROM "opportunities"
    WHERE (deal_id = 341 AND prize_id IS NULL)
    ORDER BY RANDOM()
    LIMIT 3)
   (0.9ms)  COMMIT

This SQL query is generated by a Rails gem called Randumb (github.com/spilliton/randumb).
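For illustration, the prize-injection statement can be reproduced in a small self-contained sketch. This is not the Rails code: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, and the function name `assign_prize` is hypothetical. The table and column names follow the question.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE opportunities ("
    " id INTEGER PRIMARY KEY,"
    " deal_id INTEGER NOT NULL,"
    " prize_id INTEGER,"
    " opportunity_available INTEGER NOT NULL DEFAULT 1)"
)
# 20 opportunities for deal 341, none with a prize yet.
conn.executemany("INSERT INTO opportunities (deal_id) VALUES (?)", [(341,)] * 20)


def assign_prize(conn, deal_id, prize_id, quantity):
    """Give prize_id to `quantity` randomly chosen rows of the deal that have no prize yet."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "UPDATE opportunities SET prize_id = ? "
            "WHERE id IN ("
            "  SELECT id FROM opportunities"
            "  WHERE deal_id = ? AND prize_id IS NULL"
            "  ORDER BY random() LIMIT ?)",
            (prize_id, deal_id, quantity),
        )


assign_prize(conn, 341, 21, 3)
winners = [row[0] for row in conn.execute(
    "SELECT id FROM opportunities WHERE prize_id = 21 ORDER BY id")]
print(winners)  # three ids, different on each run
```

Note that the prized rows are the last ones written to the table, which matches the observation that the winning rows are the most recently updated.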

EDIT 2

I moved to another, more precise question: Postgresql 9.4 - FASTEST query to select and update on large dataset (>30M rows) with heavy writes/reads and locks

Indeed, I think the real problem is that I need a truly RANDOM pick rather than an ARBITRARY pick.

Erwin already said this in "Advisory locks or NOWAIT to avoid waiting for locked rows?", but now I understand what he meant: "'Arbitrary' also matters because it implies that Postgresql will typically pick the same row for the same query arbitrarily, which makes lock contention a much bigger issue than truly random picks would." PostgreSQL is free to choose the fastest way to produce the 'LIMIT 1' result, and it always picks the same rows: those that were updated last. But I can't live with this, as the latest updated rows are all winning opportunities.
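As a rough sketch of the difference, the draw can be made explicitly random by adding ORDER BY random() before the LIMIT 1, instead of leaving the choice of row to the database. The example below again uses Python's sqlite3 as a stand-in for PostgreSQL, omits FOR UPDATE and pg_try_advisory_xact_lock (SQLite has neither), and uses a hypothetical helper name `draw_random`.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; row locking is left out.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE opportunities ("
    " id INTEGER PRIMARY KEY,"
    " deal_id INTEGER NOT NULL,"
    " opportunity_available INTEGER NOT NULL DEFAULT 1)"
)
conn.executemany("INSERT INTO opportunities (deal_id) VALUES (?)", [(341,)] * 20)


def draw_random(conn, deal_id):
    """Pick one available row at random, mark it unavailable, return its id (or None)."""
    with conn:  # one transaction per draw
        row = conn.execute(
            "SELECT id FROM opportunities"
            " WHERE deal_id = ? AND opportunity_available = 1"
            " ORDER BY random() LIMIT 1",  # explicit random order, not an arbitrary one
            (deal_id,),
        ).fetchone()
        if row is None:
            return None  # nothing left to draw
        conn.execute(
            "UPDATE opportunities SET opportunity_available = 0 WHERE id = ?",
            (row[0],),
        )
        return row[0]


drawn = [draw_random(conn, 341) for _ in range(20)]
print(drawn)  # all 20 ids, in a different order on each run
```

The trade-off is cost: ORDER BY random() has to assign a random value to every candidate row and sort them on each draw, which is likely to be too slow for the large tables mentioned in the follow-up question.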

Recommended answer

Just an idea: instead of calling random() in the query, use it as the default value for a column (which can be indexed), and order by that column. A similar approach could use a serial with an increment of about 0.7 * INT_MAX.

\i tmp.sql

CREATE TABLE opportunities
    ( id SERIAL NOT NULL PRIMARY KEY
    , deal_id INTEGER NOT NULL DEFAULT 0
    , prize_id INTEGER
    , opportunity_available boolean NOT NULL DEFAULT False
            -- ----------------------------------------
            -- precomputed random() , (could be indexed)
    , magic DOUBLE precision NOT NULL default RANDOM()
    );

INSERT INTO opportunities(deal_id)
SELECT 341
FROM generate_series(1,20) gs
    ;
VACUUM ANALYZE opportunities;

PREPARE add_three (integer) AS (
WITH zzz AS (
  UPDATE opportunities
  SET prize_id = 21
    , opportunity_available = True
    -- updating magic is not *really* needed here ...
    , magic = random()
  WHERE opportunities.id
  IN (
    SELECT opportunities.id
    FROM opportunities
    WHERE (deal_id = $1 AND prize_id IS NULL)
    -- ORDER BY RANDOM()
    ORDER BY magic
    LIMIT 3)
RETURNING id, magic
    ) -- 
SELECT * FROM zzz
    );

PREPARE draw_one (integer) AS (
  WITH upd AS (
  UPDATE opportunities s
  SET    opportunity_available = false
  FROM  (
     SELECT id
     FROM   opportunities
     WHERE  deal_id = $1
     AND    opportunity_available
     AND    pg_try_advisory_xact_lock(id)
     ORDER BY magic
     LIMIT  1

     FOR    UPDATE
     ) sub
  WHERE     s.id = sub.id
  RETURNING s.prize_id, s.id, magic
    )
SELECT * FROM upd
    );

SELECT * FROM opportunities;

\echo add3
EXECUTE add_three(341);
SELECT * FROM opportunities;

\echo add3 more
EXECUTE add_three(341);
SELECT * FROM opportunities;

\echo draw1
EXECUTE draw_one(341);
SELECT * FROM opportunities;

\echo draw2
EXECUTE draw_one(341);
SELECT * FROM opportunities;

VACUUM ANALYZE opportunities;

\echo draw3
EXECUTE draw_one(341);
SELECT * FROM opportunities;

\echo draw4
EXECUTE draw_one(341);
SELECT * FROM opportunities;
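
The same idea can be tried without a PostgreSQL server. The following is only a sketch under stated assumptions: Python's sqlite3 stands in for PostgreSQL, the random key is generated in Python at insert time rather than by a column default, locking is omitted, and the index name is made up for the example.

```python
import random
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; FOR UPDATE and advisory locks are omitted.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE opportunities ("
    " id INTEGER PRIMARY KEY,"
    " deal_id INTEGER NOT NULL,"
    " opportunity_available INTEGER NOT NULL DEFAULT 1,"
    " magic REAL NOT NULL)"  # precomputed random key, as in the answer
)
# Hypothetical index name; lets ORDER BY magic be served from an index.
conn.execute("CREATE INDEX opportunities_deal_magic ON opportunities (deal_id, magic)")
conn.executemany(
    "INSERT INTO opportunities (deal_id, magic) VALUES (?, ?)",
    [(341, random.random()) for _ in range(20)],
)


def draw_one(conn, deal_id):
    """Take the available row with the smallest magic value; return (id, magic) or None."""
    with conn:  # one transaction per draw
        row = conn.execute(
            "SELECT id, magic FROM opportunities"
            " WHERE deal_id = ? AND opportunity_available = 1"
            " ORDER BY magic LIMIT 1",
            (deal_id,),
        ).fetchone()
        if row is None:
            return None
        conn.execute(
            "UPDATE opportunities SET opportunity_available = 0 WHERE id = ?",
            (row[0],),
        )
        return row


draws = [draw_one(conn, 341) for _ in range(20)]
print([row_id for row_id, _ in draws])  # ids in the order fixed by their magic values
```

Because the random order is fixed when the rows are written, each draw only needs the smallest remaining magic value instead of a fresh sort, and the draw order no longer depends on which rows were updated last.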
