postgresql 9.4 - prevent app selecting always the latest updated rows


Question


I have a Ruby on Rails 4 app that uses a PostgreSQL query which looks into an opportunities table, is meant to pick a row at random among those where opportunity_available = true, and updates the selected row to opportunity_available = false. Each time a user clicks on 'try an opportunity', the app runs the query below.

UPDATE opportunities s
SET    opportunity_available = false
FROM  (
   SELECT id
   FROM   opportunities
   WHERE  deal_id = #{@deal.id}
   AND    opportunity_available
   AND    pg_try_advisory_xact_lock(id)
   LIMIT  1
   FOR    UPDATE
   ) sub
WHERE     s.id = sub.id
RETURNING s.prize_id, s.id;
-- inspired by http://stackoverflow.com/questions/33128531/put-pg-try-advisory-xact-lock-in-a-nested-subquery

I've been struggling with this for 5 days. I now roughly understand how it is (mis)behaving; what I need to know is how to change it.

The app "goes a little crazy": it selects (and updates) rows in exactly the order of the most recent updates I made.

Let me explain how I create these opportunities:

  • As admin, I create 20 opportunities (rows) via my app.

  • Then, in my admin panel, I create a prize with a quantity, say prize id = 45 with quantity = 4. The app goes into the opportunities table and randomly fills 4 rows with prize_id = 45 (this part works perfectly).

Now a user runs the app and, as explained above, the app does not choose a row at random but always starts with the latest updated rows: it takes one of the rows with a prize, then another with a prize, then another, and never those without prizes (prize_id empty).

I then ran some manual experiments: I changed values in my table by hand (via pgAdmin), and the weirdest thing happened. If I modify row 32, then row 45, then row 67, then when the user plays again, the rows that get picked are exactly the ones I updated, in reverse order: row 67, then row 45, then row 32. It doesn't even consider the other rows that could be chosen (all the other opportunities where opportunity_available = true).

I also tried dropping FOR UPDATE or the 'pg_try_advisory_xact_lock(id)' line, and it still seems to have the same issue.

Since, as admin, I first create the 20 rows and THEN the 4 winning rows, the winning rows are the last to be updated (even though on my pgAdmin screen they stay in the same positions; maybe in the background PostgreSQL orders them as the most recently updated and selects them first?). That would explain why, as soon as one of these 4 winning rows is selected, all the others follow.
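
One way to test that guess, as a sketch rather than a diagnosis: PostgreSQL writes a new row version on every UPDATE, and a query without ORDER BY returns rows in whatever order the chosen plan produces them. The system column ctid shows where each row version currently sits in the table, and EXPLAIN shows which scan feeds the LIMIT 1. The deal_id value 341 is borrowed from the log in EDIT 1; the column names are the ones used in the query above.

```sql
-- Physical location (block, offset) of the current version of each row.
-- Re-run after updating a row by hand to see whether its ctid changes.
SELECT ctid, id, prize_id, opportunity_available
FROM   opportunities
WHERE  deal_id = 341
ORDER  BY ctid;

-- Plan used to find the single candidate row. The advisory lock call is
-- left out so that inspecting the plan takes no locks.
EXPLAIN
SELECT id
FROM   opportunities
WHERE  deal_id = 341
AND    opportunity_available
LIMIT  1;
```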

To be clear, I would be OK with the app going row by row through the available opportunities (e.g. row 3, then row 4, then row 5, as displayed in pgAdmin), since prizes are already assigned to rows totally at random. The problem is that it's not doing this: it often takes all the winning rows consecutively.

I'm speechless and have no clue how to break this pattern.

Note: this pattern does not happen 100% of the time, but very often. For example, if I have more than 4 winning rows and keep clicking as a user, it behaves as described here for about 2 minutes, then it stops and seems (I might be wrong) to behave normally, then after a minute it again selects only winning rows.

EDIT 1

Here is how prizes are injected into the opportunities table (example: I create a prize with id 21 and quantity 3). The app assigns them randomly (as far as I understand), but only to rows that do not already have a prize_id (i.e. if an opportunity has prize_id empty, the prize can go there).

  SQL (2.4ms)  
  UPDATE "opportunities"
  SET "prize_id" = 21
  WHERE "opportunities"."id" 
  IN (
    SELECT "opportunities"."id"
    FROM "opportunities"
    WHERE (deal_id = 341 AND prize_id IS NULL)
    ORDER BY RANDOM()
    LIMIT 3) //
   (0.9ms)  COMMIT

This SQL query is generated by a Rails gem (called Randumb: github.com/spilliton/randumb)

EDIT 2

I moved to another more precise question: Postgresql 9.4 - FASTEST query to select and update on large dataset (>30M rows) with heavy writes/reads and locks

Indeed, I think the real problem is that I need a truly RANDOM pick and must move away from an ARBITRARY pick.

Erwin already said this in "Advisory locks or NOWAIT to avoid waiting for locked rows?", but only now do I understand what he meant: '"Arbitrary" also matters because it implies that Postgresql will typically pick the same row for the same query arbitrarily, which makes lock contention a much bigger issue than truly random picks would.' PostgreSQL is free to choose the fastest way to produce the 'LIMIT 1' result, and it always picks the same rows: those that were updated last. I can't live with this, as the latest updated rows are all winning opportunities.
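
The difference can be made concrete with a minimal sketch, assuming the same table and the deal_id 341 from EDIT 1. Adding ORDER BY random() to the subquery turns the arbitrary pick into a random one. It is not a drop-in fix: it sorts every available row of the deal on each call, which gets expensive on large tables, and pg_try_advisory_xact_lock(id) is left out on purpose, because in the WHERE clause it could be evaluated for every candidate row before the sort, not just for the one that is kept.

```sql
-- Random instead of arbitrary pick: the subquery sorts all available rows
-- of the deal by a fresh random value and keeps the first one.
UPDATE opportunities s
SET    opportunity_available = false
FROM  (
   SELECT id
   FROM   opportunities
   WHERE  deal_id = 341            -- example value; the app interpolates @deal.id
   AND    opportunity_available
   ORDER  BY random()
   LIMIT  1
   FOR    UPDATE
   ) sub
WHERE     s.id = sub.id
RETURNING s.prize_id, s.id;
```

This sketch makes no attempt to avoid waiting on rows locked by another session, which is what the advisory lock in the original query is for.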

Solution

Just an idea: instead of calling random() in the query, use it as the default value for a column (which can be indexed). A similar approach could use a serial with an increment of about 0.7 * INT_MAX.

\i tmp.sql

CREATE TABLE opportunities
    ( id SERIAL NOT NULL PRIMARY KEY
    , deal_id INTEGER NOT NULL DEFAULT 0
    , prize_id INTEGER
    , opportunity_available boolean NOT NULL DEFAULT False
            -- ----------------------------------------
            -- precomputed random() , (could be indexed)
    , magic DOUBLE precision NOT NULL default RANDOM()
    );

INSERT INTO opportunities(deal_id)
SELECT 341
FROM generate_series(1,20) gs
    ;
VACUUM ANALYZE opportunities;

PREPARE add_three (integer) AS (
WITH zzz AS (
  UPDATE opportunities
  SET prize_id = 21
    , opportunity_available = True
    -- updating magic is not *really* needed here ...
    , magic = random()
  WHERE opportunities.id
  IN (
    SELECT opportunities.id
    FROM opportunities
    WHERE (deal_id = $1 AND prize_id IS NULL)
    -- ORDER BY RANDOM()
    ORDER BY magic
    LIMIT 3)
RETURNING id, magic
    ) -- 
SELECT * FROM zzz
    );

PREPARE draw_one (integer) AS (
  WITH upd AS (
  UPDATE opportunities s
  SET    opportunity_available = false
  FROM  (
     SELECT id
     FROM   opportunities
     WHERE  deal_id = $1
     AND    opportunity_available
     AND    pg_try_advisory_xact_lock(id)
     ORDER BY magic
     LIMIT  1

     FOR    UPDATE
     ) sub
  WHERE     s.id = sub.id
  RETURNING s.prize_id, s.id, magic
    )
SELECT * FROM upd
    );

SELECT * FROM opportunities;

\echo add3
EXECUTE add_three(341);
SELECT * FROM opportunities;

\echo add3 more
EXECUTE add_three(341);
SELECT * FROM opportunities;

\echo draw1
EXECUTE draw_one(341);
SELECT * FROM opportunities;

\echo draw2
EXECUTE draw_one(341);
SELECT * FROM opportunities;

VACUUM ANALYZE opportunities;

\echo draw3
EXECUTE draw_one(341);
SELECT * FROM opportunities;

\echo draw4
EXECUTE draw_one(341);
SELECT * FROM opportunities;
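
The "(which can be indexed)" remark could be realized along these lines. This is only a sketch: the index name is made up, and the partial-index predicate assumes draws always filter on opportunity_available, as draw_one does.

```sql
-- Hypothetical index name; columns come from the CREATE TABLE above.
-- Only rows that are still available are kept in the index.
CREATE INDEX opportunities_draw_idx
    ON opportunities (deal_id, magic)
    WHERE opportunity_available;

-- Check whether the planner uses it for the candidate lookup.
EXPLAIN
SELECT id
FROM   opportunities
WHERE  deal_id = 341
AND    opportunity_available
ORDER  BY magic
LIMIT  1;
```

On the 20-row demo table the planner will most likely still prefer a sequential scan; an index like this only starts to pay off on larger tables.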
