postgres: get random entries from table - too slow


Question

In my postgres database, I have the following tables (simplified for the sake of this question):

Objects (currently has about 250,000 records)
-------
n_id
n_store_object_id (references store.n_id, 1-to-1 relationship, some objects don't have store records)
n_media_id (references media.n_id, 1-to-1 relationship, some objects don't have media records)

Store (currently has about 100,000 records)
-----
n_id
t_name
t_description
n_status
t_tag

Media
-----
n_id
t_media_path

So far, so good. When I need to query the data, I run this (note the limit 2 at the end, which is part of the requirement):

select
    o.n_id,
    s.t_name,
    s.t_description,
    me.t_media_path
from
    objects o
    join store s on (o.n_store_object_id = s.n_id and s.n_status > 0 and s.t_tag is not null)
    join media me on o.n_media_id = me.n_id
limit
    2

This works fine and gives me two entries back, as expected. The execution time is about 20 ms, which is fine.

Now I need to get 2 random entries every time the query runs. I thought I'd add order by random(), like so:

select
    o.n_id,
    s.t_name,
    s.t_description,
    me.t_media_path
from
    objects o
    join store s on (o.n_store_object_id = s.n_id and s.n_status > 0 and s.t_tag is not null)
    join media me on o.n_media_id = me.n_id
order by
    random()
limit
    2

While this gives the right results, the execution time is now about 2,500 ms (over 2 seconds). This is clearly not acceptable, as it's one of several queries that run to build a single page in a web app.

So, the question is: how can I get random entries, as above, while keeping the execution time reasonable (under 100 ms is acceptable for my purpose)?

Recommended answer

I think you'll be better off selecting random objects first, then joining only those objects after they're selected. That is, query once to select random objects, then query again to join just the objects that were selected.
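
The reason this should help: with order by random() on the joined result, Postgres has to produce every joined row before it can pick two, whereas the plain limit 2 version can stop as soon as two rows are found. A minimal sketch of the idea follows, written as a single statement rather than two separate queries. The CTE name candidates and the oversample size of 50 are illustrative assumptions, not part of the original schema, and the query has not been timed against the data described in the question.

```sql
-- Sketch only: pick candidate objects at random first, then join just those rows.
-- "candidates" and the limit of 50 are hypothetical choices for illustration.
with candidates as (
    select
        o.n_id,
        o.n_store_object_id,
        o.n_media_id
    from
        objects o
    where
        o.n_store_object_id is not null   -- skip objects with no store record
        and o.n_media_id is not null      -- skip objects with no media record
    order by
        random()
    limit
        50   -- oversample: some candidates will fail the store filter below
)
select
    c.n_id,
    s.t_name,
    s.t_description,
    me.t_media_path
from
    candidates c
    join store s on (c.n_store_object_id = s.n_id and s.n_status > 0 and s.t_tag is not null)
    join media me on c.n_media_id = me.n_id
order by
    random()   -- cheap here: at most 50 rows reach this sort
limit
    2
```

Because the store filter is applied after the random pick, this can return fewer than two rows if too few candidates pass it; raising the oversample size makes that less likely. The inner order by random() still has to visit every qualifying row in objects, so whether this meets the 100 ms target would need to be checked with explain analyze on the real data.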
