Good Idea/Bad Idea? Using MySQL RAND() outside of a small set of subquery results?

Question

I've read that in MySQL, using ORDER BY RAND() on large tables with lots of rows is a bad idea (supposedly even on tables with ~500 rows): it's slow and inefficient, and it scans a lot of rows.

How does the following look as an alternative?

SELECT * FROM (...subquery that generally returns a set of fewer than 20 rows...) AS t ORDER BY RAND() LIMIT 8

Instead of using RAND() on a large set of data, I'd select a small subset first and only then apply RAND() to the rows it returns. In 99.9% of all cases, the subquery above should select fewer than 20 rows (in fact, generally fewer than 8).

Curious to hear what people think.

(Just for reference, I'm doing my MySQL stuff with PHP.)

Thanks!

Recommended answer

Actually...I ended up running a test and I might have answered my own question. I thought I'd post this information here in case it was useful for anyone else. (If I've done anything wrong here, please let me know!)

This is kind of surprising...

I created a table called TestData with 1 million rows and ran the following query (the result contradicts everything I've read):

SELECT * FROM TestData WHERE number = 41 ORDER BY RAND() LIMIT 8

...and it returned the rows in an average of 0.0070 seconds. I don't really see why RAND() has such a bad reputation. It seems pretty usable to me, at least in this particular situation.

My table has three columns:

id [BIGINT(20)] | textfield [tinytext] | number [BIGINT(20)]

Primary Key on id, index on number.

I guess MySQL is smart enough to apply RAND() only to the 20 rows returned by "WHERE number = 41"? (I specifically added only 20 rows that had the value 41 for 'number'.) With the index on number, it presumably finds those rows first and only then sorts them randomly, so the full million rows never get sorted.
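One way to sanity-check that guess is to look at the query plan. Here is a minimal sketch, not my original test: it assumes Python's built-in sqlite3 module as a stand-in for MySQL (so RANDOM() replaces RAND()) and a much smaller table, and the names build_test_table, idx_number and query_plan are made up for illustration. On MySQL itself, the equivalent check would be to prefix the query with EXPLAIN.

```python
import sqlite3


def build_test_table(total_rows=10_000, matching_rows=20):
    """Create an in-memory TestData table where only `matching_rows` rows have number = 41."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE TestData ("
        "id INTEGER PRIMARY KEY, textfield TEXT, number INTEGER)"
    )
    # Hypothetical index name; the post only says there is an index on number.
    conn.execute("CREATE INDEX idx_number ON TestData (number)")
    rows = (
        (i, f"row {i}", 41 if i <= matching_rows else 1000 + i)
        for i in range(1, total_rows + 1)
    )
    conn.executemany("INSERT INTO TestData VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


# Assumption: SQLite spells the function RANDOM(); the MySQL query uses RAND().
QUERY = "SELECT * FROM TestData WHERE number = 41 ORDER BY RANDOM() LIMIT 8"


def query_plan(conn):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY)]


if __name__ == "__main__":
    conn = build_test_table()
    for step in query_plan(conn):
        print(step)
    for row in conn.execute(QUERY):
        print(row)
```

If the plan shows a search using idx_number, the filter ran first and only the matching rows were put in random order, which is the behaviour I was guessing at for MySQL.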

The alternate subquery method returns results in an average of around 0.0080 seconds, which is slightly slower than the non-subquery method.

Subquery method: SELECT * FROM (SELECT * FROM TestData WHERE number = 41) as t ORDER BY RAND() LIMIT 8
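For anyone who wants to repeat the comparison, here is a rough timing sketch. It is not the setup I used: it assumes Python's built-in sqlite3 module instead of MySQL and PHP (so RANDOM() stands in for RAND()), a 100,000-row table, and hypothetical helper names make_conn and average_seconds. Absolute numbers will differ from MySQL's, so treat it only as a way to compare the two query shapes side by side.

```python
import sqlite3
import timeit

# Assumption: RANDOM() is SQLite's counterpart of MySQL's RAND().
DIRECT = "SELECT * FROM TestData WHERE number = 41 ORDER BY RANDOM() LIMIT 8"
SUBQUERY = (
    "SELECT * FROM (SELECT * FROM TestData WHERE number = 41) AS t "
    "ORDER BY RANDOM() LIMIT 8"
)


def make_conn(total_rows=100_000, matching_rows=20):
    """Build an in-memory TestData table with an index on number."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE TestData ("
        "id INTEGER PRIMARY KEY, textfield TEXT, number INTEGER)"
    )
    conn.execute("CREATE INDEX idx_number ON TestData (number)")
    conn.executemany(
        "INSERT INTO TestData VALUES (?, ?, ?)",
        (
            (i, f"row {i}", 41 if i <= matching_rows else 1000 + i)
            for i in range(1, total_rows + 1)
        ),
    )
    conn.commit()
    return conn


def average_seconds(conn, sql, runs=200):
    """Average wall-clock time of one execution of `sql` over `runs` runs."""
    total = timeit.timeit(lambda: conn.execute(sql).fetchall(), number=runs)
    return total / runs


if __name__ == "__main__":
    conn = make_conn()
    for label, sql in (("direct", DIRECT), ("subquery", SUBQUERY)):
        print(f"{label}: {average_seconds(conn, sql):.6f} s")
```

Both queries pick 8 of the same 20 matching rows, so any difference in the averages comes from how the engine handles the extra derived table, not from the amount of data being shuffled.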
