How to generate a random, unique, alphanumeric ID of length N in Postgres 9.6+?


Problem Description


I've seen a bunch of different solutions on StackOverflow that span many years and many Postgres versions, but with some of the newer features like gen_random_bytes I want to ask again to see if there is a simpler solution in newer versions.

Given IDs which contain a-zA-Z0-9, and vary in size depending on where they're used, like...

bTFTxFDPPq
tcgHAdW3BD
IIo11r9J0D
FUW5I8iCiS

uXolWvg49Co5EfCo
LOscuAZu37yV84Sa
YyrbwLTRDb01TmyE
HoQk3a6atGWRMCSA

HwHSZgGRStDMwnNXHk3FmLDEbWAHE1Q9
qgpDcrNSMg87ngwcXTaZ9iImoUmXhSAv
RVZjqdKvtoafLi1O5HlvlpJoKzGeKJYS
3Rls4DjWxJaLfIJyXIEpcjWuh51aHHtK

(Like the IDs that Stripe uses; see https://stripe.com/docs/api#charges)

How can you generate them randomly and safely (as far as reducing collisions and reducing predictability goes) with an easy way to specify different lengths for different use cases, in Postgres 9.6+?

I'm thinking that ideally the solution has a signature similar to:

generate_uid(size integer) returns text

Where size is customizable depending on your own tradeoffs for lowering the chance of collisions vs. reducing the string size for usability.

From what I can tell, it must use gen_random_bytes() (from the pgcrypto extension) instead of random(), since random() is not cryptographically strong, to reduce the chance that the IDs can be guessed.

Thanks!


I know there's gen_random_uuid() for UUIDs, but I don't want to use them in this case. I'm looking for something that gives me IDs similar to what Stripe (or others) use, such as "id": "ch_19iRv22eZvKYlo2CAxkjuHxZ", and that are as short as possible while still containing only alphanumeric characters.

This requirement is also why encode(gen_random_bytes(), 'hex') isn't quite right for this case, since it reduces the character set and thus forces me to increase the length of the strings to avoid collisions.
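
To make the length tradeoff concrete, here is a minimal sketch that counts how many characters are needed to cover a given number of random bits for a given alphabet size. The helper name `charsNeeded` is hypothetical and only used for illustration; it uses BigInt so the comparison is exact.

```javascript
// Sketch: smallest ID length whose number of possible values
// (alphabetSize ^ length) covers 2 ^ bits distinct values.
// `charsNeeded` is a hypothetical helper, not part of any library.
function charsNeeded(bits, alphabetSize) {
  var target = 2n ** BigInt(bits);
  var combos = 1n;
  var length = 0;
  while (combos < target) {
    combos *= BigInt(alphabetSize);
    length++;
  }
  return length;
}

console.log(charsNeeded(128, 16)); // 32 (hex)
console.log(charsNeeded(128, 62)); // 22 (a-zA-Z0-9)
```

So for the same amount of randomness, a hex string has to be roughly half again as long as an a-zA-Z0-9 string.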

I'm currently doing this in the application layer, but I'm looking to move it into the database layer to reduce interdependencies. Here's what the Node.js code for doing it in the application layer might look like:

var crypto = require('crypto');
var set = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789';

function generate(length) {
  var bytes = crypto.randomBytes(length);
  var chars = [];

  for (var i = 0; i < bytes.length; i++) {
    chars.push(set[bytes[i] % set.length]);
  }

  return chars.join('');
}

Solution

Figured this out, here's a function that does it:

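-- gen_random_bytes() is provided by the pgcrypto extension.
CREATE EXTENSION IF NOT EXISTS pgcrypto;

CREATE OR REPLACE FUNCTION generate_uid(size INT) RETURNS TEXT AS $$
DECLARE
  -- Alphabet the ID characters are drawn from (62 symbols).
  characters TEXT := 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789';
  -- One random byte per output character (pgcrypto returns at most 1024 bytes per call).
  bytes BYTEA := gen_random_bytes(size);
  l INT := length(characters);
  i INT := 0;
  output TEXT := '';
BEGIN
  WHILE i < size LOOP
    -- Map each byte onto the alphabet; substr() is 1-based, hence the + 1.
    output := output || substr(characters, get_byte(bytes, i) % l + 1, 1);
    i := i + 1;
  END LOOP;
  RETURN output;
END;
$$ LANGUAGE plpgsql VOLATILE;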

And then to run it simply do:

SELECT generate_uid(10);
-- '3Rls4DjWxJ'
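
One caveat worth noting: mapping a random byte onto 62 characters with `%` is slightly uneven, because 256 is not a multiple of 62 (256 % 62 = 8), so the first eight characters (A to H) come up 5 times in 256 instead of 4. If that matters for your use case, one common workaround is rejection sampling: throw away bytes that fall in the uneven tail. The following is a sketch of that idea in Node.js, not the function from this answer; `generateUnbiased` is a hypothetical name, and the same approach could presumably be ported to PL/pgSQL.

```javascript
var crypto = require('crypto');
var set = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789';

// Largest multiple of set.length that fits in one byte (248 for 62 characters).
// Bytes at or above this value would skew the distribution, so they are skipped.
var limit = 256 - (256 % set.length);

// Hypothetical helper: like generate(), but discards bytes from the uneven tail.
function generateUnbiased(length) {
  var chars = [];

  while (chars.length < length) {
    // Ask for a few spare bytes, since some of them may be rejected.
    var bytes = crypto.randomBytes(length - chars.length + 8);

    for (var i = 0; i < bytes.length && chars.length < length; i++) {
      if (bytes[i] < limit) {
        chars.push(set[bytes[i] % set.length]);
      }
    }
  }

  return chars.join('');
}

console.log(generateUnbiased(10));
```

The skew in the simple version is small, so whether the extra loop is worth it depends on how strict your requirements are.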


Warning

When doing this, make sure the IDs you create are long enough to avoid collisions as the number of objects grows. How long is long enough can be counter-intuitive because of the Birthday Paradox: collisions become likely once the number of IDs approaches the square root of the number of possible values. You will likely want a length greater (or much greater) than 10 for any commonly created object; I just used 10 as a simple example.
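
To get a feel for the numbers, here is a rough sketch that estimates the chance of at least one collision using the standard birthday approximation p ≈ 1 - exp(-n(n-1) / 2N), where n is the number of IDs and N the number of possible IDs. The function name `collisionProbability` is hypothetical, and the estimate assumes the IDs are uniformly random.

```javascript
// Sketch: approximate probability of at least one collision among
// `count` random IDs of `length` characters from an alphabet of
// `alphabetSize` symbols (birthday approximation, uniform IDs assumed).
function collisionProbability(count, length, alphabetSize) {
  var space = Math.pow(alphabetSize, length);
  // -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
  return -Math.expm1(-(count * (count - 1)) / (2 * space));
}

console.log(collisionProbability(1e6, 10, 62)); // a million IDs, length 10
console.log(collisionProbability(1e9, 10, 62)); // a billion IDs, length 10
console.log(collisionProbability(1e9, 16, 62)); // a billion IDs, length 16
```

With 62 characters and length 10 there are about 8.4 × 10^17 possible IDs, so the estimate for a billion rows comes out at roughly 45%, while length 16 makes it negligible. Note that when the ID column is a PRIMARY KEY, a collision causes the INSERT to fail with a unique violation rather than silently creating a duplicate.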


Usage

With the function defined, you can use it in a table definition, like so:

CREATE TABLE collections (
  id TEXT PRIMARY KEY DEFAULT generate_uid(10),
  name TEXT NOT NULL,
  ...
);

And then when inserting data, like so:

INSERT INTO collections (name) VALUES ('ian');
INSERT INTO collections (name) VALUES ('victor');
SELECT * FROM collections;

It will automatically generate the id values:

    id     |  name  | ...
-----------+--------+-----
owmCAx552Q | ian    |
ZIofD6l3X9 | victor |


Usage with a Prefix

Or maybe you want to add a prefix for convenience when looking at a single ID in the logs or in your debugger (similar to how Stripe does it), like so:

CREATE TABLE collections (
  id TEXT PRIMARY KEY DEFAULT ('col_' || generate_uid(10)),
  name TEXT NOT NULL,
  ...
);

INSERT INTO collections (name) VALUES ('ian');
INSERT INTO collections (name) VALUES ('victor');
SELECT * FROM collections;

      id       |  name  | ...
---------------+--------+-----
col_wABNZRD5Zk | ian    |
col_ISzGcTVj8f | victor |
