Generate unique random numbers in Postgresql with fixed length

Problem description

I need to generate unique random numbers in PostgreSQL with a fixed length of 13 digits. I found a similar thread that used a sequence encrypted with "pseudo_encrypt", but the returned number did not have a fixed length.

So what I need is an encrypted random sequence with a fixed length of 13 digits, where the minimum value is 0000000000001 and the maximum value is 9999999999999.

Is this possible? If the leading zeros cannot be produced, that is not a big problem (I think): I can add them programmatically when reading from the database, but it would be great if PostgreSQL could do it by itself.

-- EDIT --

After realizing a few useful things, I have to change the question to better explain what I need:

I need to generate unique random numbers (bigint) in PostgreSQL with a fixed maximum length of 13 digits. Currently I'm trying to use the pseudo_encrypt function (64 bit), but the returned number obviously does not have a fixed maximum length of 13: in the 32-bit case the maximum length is 10 digits (int), and in the 64-bit case it is 19 digits (bigint).

So, how can I get an encrypted random sequence with a fixed maximum length of 13 digits, where the minimum value is 1 and the maximum value is 9999999999999?

Is it possible to modify the 64-bit pseudo_encrypt function to get this result? Or, if that is not possible, are there other methods to get a unique sequence that meets these requirements?

Pseudo Encrypt function (64bit)

CREATE OR REPLACE FUNCTION pseudo_encrypt(VALUE bigint) returns bigint AS $$
DECLARE
    l1 bigint;
    l2 bigint;
    r1 bigint;
    r2 bigint;
    i int := 0;
BEGIN
    -- Split the 64-bit input into two 32-bit halves.
    l1 := (VALUE >> 32) & 4294967295::bigint;
    r1 := VALUE & 4294967295;
    -- Three Feistel rounds: swap the halves and XOR with the round function.
    WHILE i < 3 LOOP
        l2 := r1;
        r2 := l1 # ((((1366.0 * r1 + 150889) % 714025) / 714025.0) * 32767*32767)::int;
        l1 := l2;
        r1 := r2;
        i := i + 1;
    END LOOP;
    -- Recombine the two halves into a single bigint.
    RETURN ((l1::bigint << 32) + r1);
END;
$$ LANGUAGE plpgsql strict immutable;

Solution

Tweaking the existing function for N < 64 bits values

It's relatively simple to tweak the bigint variant to reduce the output to 2^N values, where N is even, and less than 64.

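To get 13 decimal digits, consider the largest even N for which 2^N has at most 13 digits. That's N=42, with 2^42=4398046511104. (2^43=8796093022208 also fits in 13 digits, but 43 is odd and cannot be split into two equal halves.)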
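The digit counts are easy to confirm in PostgreSQL itself. The query below is only a small check using built-in operators; the range 40 to 44 is an arbitrary window around the 13-digit boundary.

```sql
-- List powers of two around the 13-digit boundary with their decimal length.
SELECT n,
       1::bigint << n                 AS pow2,
       length((1::bigint << n)::text) AS digits
FROM generate_series(40, 44) AS n;
```

The row for n = 44 is the first one with 14 digits, so 42 is the largest even exponent that stays within 13 digits.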

The algorithm works by breaking the input value into two halves with an equal number of bits and making them flow through the Feistel network, essentially XORing one half with the result of the round function and swapping the halves at each iteration.

If at every stage of the process each half is limited to 21 bits, then the result of combining both halves is guaranteed not to exceed 42 bits.
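To make the split concrete, here is a small sketch that cuts a few sample values into 21-bit halves and puts them back together. The sample values and the column names are arbitrary choices for illustration.

```sql
-- Split sample values into two 21-bit halves and recombine them.
-- b21 is a 21-bit mask: (1 << 21) - 1 = 2097151.
WITH params AS (SELECT (1 << 21) - 1 AS b21),
     samples(v) AS (VALUES (1::bigint), (1234567890123), (4398046511103))
SELECT v,
       v >> 21                       AS left_half,
       v & b21                       AS right_half,
       ((v >> 21) << 21) + (v & b21) AS recombined
FROM samples, params;
```

For any input below 2^42 both halves stay within 21 bits, and `recombined` equals `v`.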

So here's my proposed variant:

CREATE OR REPLACE FUNCTION pseudo_encrypt42(VALUE bigint) returns bigint
 AS $$
DECLARE
  l1 bigint;
  l2 bigint;
  r1 bigint;
  r2 bigint;
  i int:=0;
  b21 int:=(1<<21)-1; -- 21 bits mask for a half-number => 42 bits total
BEGIN
  l1:= VALUE >> 21;
  r1:= VALUE & b21;
  WHILE i < 3 LOOP
    l2 := r1;
    r2 := l1 # (((((1366*r1+150889)%714025)/714025.0)*32767*32767)::int & b21);
    l1 := l2;
    r1 := r2;
    i := i + 1;
  END LOOP;
  RETURN ((l1::bigint << 21) + r1);
END;
$$ LANGUAGE plpgsql strict immutable;
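A possible way to use it, sketched under the assumption that the inputs come from a sequence: `code_seq` is a hypothetical name, and its bounds are chosen to keep the input inside the 42-bit domain. The `lpad` call produces the zero-padded 13-character form asked for in the question.

```sql
-- Hypothetical sequence feeding the function; its bounds keep the input
-- inside the 42-bit domain 0 .. 2^42 - 1.
CREATE SEQUENCE IF NOT EXISTS code_seq
    MINVALUE 0 MAXVALUE 4398046511103 START WITH 0 NO CYCLE;

-- Numeric result (at most 13 digits) and a zero-padded 13-character text form.
SELECT code,
       lpad(code::text, 13, '0') AS code_padded
FROM (SELECT pseudo_encrypt42(nextval('code_seq')) AS code) AS s;
```

Since the outputs cover 0 .. 2^42 - 1, exactly one input maps to 0; if a minimum of 1 is required, that single value would have to be skipped by the caller.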

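The input must be less than 2^42 (that is, between 0 and 2^42 - 1). For larger inputs the result is no longer guaranteed to be unique or to fit in 42 bits, because the upper half is not masked on entry and may carry more than 21 bits.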
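A quick sanity check over a sample of valid inputs can be run as follows. This is only a spot check, not a proof; the range 0 to 99999 is arbitrary.

```sql
-- Spot check on inputs 0..99999: every output should be distinct
-- and should fit in 42 bits.
SELECT count(*)                                     AS inputs,
       count(DISTINCT pseudo_encrypt42(i))          AS distinct_outputs,
       max(pseudo_encrypt42(i)) < (1::bigint << 42) AS fits_in_42_bits
FROM generate_series(0::bigint, 99999) AS i;
```

The two counts should match and the last column should be true.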

What can be done about the missing numbers between 2^42 and 10^13?

10^13 - 2^42 = 5601953488896, so that's quite a lot of missing numbers. I don't know how to help with that in a single pass with the Feistel network. One workaround that might be acceptable, though, is to generate another set of unique values in 0..M and add 2^42 to them, so there's no risk of collision.

This other set could be obtained with the same function, just with the offset added. 4398046511104 + pseudo_encrypt42(x) is guaranteed to produce unique values from 4398046511104 up to, but not including, 2*4398046511104 = 8796093022208, so that's closer to the goal. The same technique could be applied with several other ranges, not even necessarily of the same size.
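The offset idea could be wrapped in a helper along these lines. The name `pseudo_encrypt42_two_ranges` and the choice of exactly two partitions are assumptions made for this sketch; it expects inputs in 0 .. 2^43 - 1.

```sql
-- Hypothetical wrapper covering inputs 0 .. 2^43 - 1 with two partitions.
CREATE OR REPLACE FUNCTION pseudo_encrypt42_two_ranges(x bigint)
RETURNS bigint AS $$
    SELECT CASE
        WHEN x < 4398046511104                 -- first partition: 0 .. 2^42 - 1
            THEN pseudo_encrypt42(x)
        ELSE 4398046511104                     -- second partition, shifted by 2^42
             + pseudo_encrypt42(x - 4398046511104)
    END;
$$ LANGUAGE sql STRICT IMMUTABLE;
```

With this wrapper the largest possible output is 2^43 - 1 = 8796093022207, which still has 13 digits. The caller remains responsible for keeping `x` inside the expected range.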

However, this workaround degrades the random-looking behavior: instead of a single output range where every number can be between 0 and X, you get N distinct output ranges of X/N numbers each. With several distinct partitions like that, it's easy to guess which partition an output will fall in, just not which value inside the partition.
