为什么与MariaDB 10.2 RAND()函数发生如此多的冲突? [英] Why so many collisions with the MariaDB 10.2 RAND() function?

查看:53
本文介绍了为什么与MariaDB 10.2 RAND()函数发生如此多的冲突?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

注意:在Windows Server 2012 R2 Standard上运行MariaDB 10.2.27.

Note: Running MariaDB 10.2.27 on Windows Server 2012 R2 Standard.

我想生成用于MariaDB的随机整数,因此我一直在尝试使用MariaDB RAND()函数.无论是我的期望还是理解是不合理的(绝对有可能!),或者MariaDB RAND()函数不是很随机.

I wanted to generate random integers for use in MariaDB so I have been experimenting with the MariaDB RAND() function. Either my expectations & understanding are way off base (definitely possible!) or the MariaDB RAND() function is not very random.

使用BIGINT(20)列,我想生成最多16位数字的随机整数,因此我使用了以下SQL: FLOOR(RAND()* 9999999999999999)+1).我在循环中使用的确切SQL是:

Using a BIGINT(20) column I wanted to generate random integers up to 16 digits in length, so I used this SQL: FLOOR(RAND()*9999999999999999)+1). The exact SQL I use, in a loop is:

INSERT INTO rnd_test VALUES(FLOOR(RAND()* 9999999999999999)+1);

表rnd_test有一个列,该列是BIGINT(20)并且是主要ID.

Table rnd_test has a single column which is a BIGINT(20) and is the primary ID.

有10 ^ 16个数字的池,并考虑到生日悖论,我希望在生成10 ^ 8个数字后发生碰撞的几率约为50%.显然,这有一些差异,但是每次我运行插入循环时,我几乎都立即看到冲突,然后每2000或3000个生成的数字(有时更频繁)重复一次.生成约50,000个随机数后,我发现每隔几百个数字就会发生冲突.

With a pool of 10^16 numbers and considering the Birthday Paradox I would expect about a 50% chance of a collision after 10^8 numbers generated. Obviously there is some variance to this but every time I run the insert loop I start seeing collisions almost immediately and then repeating every 2000 or 3000 generated numbers, sometimes more often. After ~50,000 randoms have been generated I'm seeing collisions every few hundred numbers.

考虑到我的理解可能是完全错误的,我调整了循环vb.net代码以在本地生成随机数,然后将其插入到MariaDB表中.我在例程的顶部定义了一个新的 System.Random,然后使用它来生成随机数:

Thinking that maybe my understanding was wildly incorrect I adjusted my looping vb.net code to generate the random locally and then insert that into the MariaDB table. I define a new System.Random at the top of the routine and then use this to generate random numbers:

Dim r As Long = CLng(Math.Floor(rNum.NextDouble()* 9999999999999999))+ 1

通常,这种方法效果更好,但仍然不如我预期的好.在发生冲突之前,它通常会运行约100,000次迭代,然后在此之后,每10,000个随机数似乎会发生一两次碰撞.有时会有10,000个批次完全没有碰撞.

Generally this works much better but still not quite as well as I would expect. It will usually run for about 100,000 iterations before a collision occurs, and then after that there seems to be one or two collisions per 10,000 randoms generated. Sometimes a batch of 10,000 will go by without any collisions at all.

那么,为什么与vb.net函数相比,MariaDB RAND()函数的性能如此差?

So, why is the MariaDB RAND() function performing so poorly compared to the vb.net function?

推荐答案

使用5.6的RAND进行的进一步实验表明,只有30位是好的.那就是它只有大约十亿个不同的值.

Further experiments with 5.6's RAND indicate that it is good to only 30 bits. That is it has only about one billion distinct values.

通过bugs.mysql.com和/或MariaDB提交错误.

File a bug with bugs.mysql.com and/or MariaDB.

30位对于大多数应用程序已经足够了.在那些它还不够出色的应用程序中,大多数人不会注意到它的缺点.

30 bits is good enough for most applications. Of those applications for which it is not good enough, most people won't notice its wimpiness.

FLOAT 具有24位精度; DOUBLE 有53.所以30个需要 DOUBLE ,但无法填写.

FLOAT has 24 bits of precision; DOUBLE has 53. So 30 needs DOUBLE, but fails to fill it up.

这篇关于为什么与MariaDB 10.2 RAND()函数发生如此多的冲突?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆