Generate unique random strings


Question

I am writing a very small URL shortener with Dancer. It uses the REST plugin to store a posted URL in a database together with a six-character string, which the user then uses to access the shortened URL.

Here is what I have right now:

sub generate_random_string{
    my $length_of_randomstring = shift; # the length of 
                                        # the random string to generate

    my @chars=('a'..'z','A'..'Z','0'..'9','_');
    my $random_string;
    for(1..$length_of_randomstring){
        # rand @chars returns a fractional number in [0, scalar @chars);
        # using it as an array index truncates it to an integer
        $random_string.=$chars[rand @chars];
    }

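    # Start over if the string is already in the database; return the
    # result of the retry, otherwise the colliding string is returned
    return generate_random_string($length_of_randomstring)
        if database->quick_select('urls', { shortcut => $random_string });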

    return $random_string;
}

This generates a six-character string and calls the function recursively if the generated string is already in the database. I know there are 63^6 possible strings, but this will take longer as the database gathers more entries. It might even turn into a nearly infinite recursion, which I want to prevent.

Are there ways to generate unique random strings that avoid this recursion?

Thanks in advance.

Recommended answer

We don't really need to be hand-wavy about how many iterations (or recursions) of your function there will be. I believe that at every invocation the number of iterations is geometrically distributed (i.e. the number of trials up to and including the first success is governed by the geometric distribution), which has mean 1/p, where p is the probability of finding an unused string on a single try. I believe that p is just 1 - n/63^6, where n is the number of currently stored strings. Therefore, I think you would need to have stored roughly 30 billion strings (~63^6/2) in your database before your function needs, on average, more than 2 attempts per call (p = .5).
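To make that estimate concrete, here is a minimal sketch that evaluates the mean 1/p for a few database sizes. The helper name `expected_attempts` and the sample values of n are illustrative assumptions, not part of the original code.

```perl
use strict;
use warnings;

# Mean of a geometric distribution with success probability
# p = 1 - n/total, where total = alphabet_size ** length.
# (expected_attempts is a hypothetical helper for illustration.)
sub expected_attempts {
    my ($n, $alphabet_size, $length) = @_;
    my $total = $alphabet_size ** $length;
    my $p     = 1 - $n / $total;
    die "no free strings left\n" if $p <= 0;
    return 1 / $p;
}

# Sample sizes chosen arbitrarily; the last one is half the key space.
for my $n (1_000, 1_000_000, 1_000_000_000, 63**6 / 2) {
    printf "%.0f stored strings: %.6f expected attempts\n",
        $n, expected_attempts($n, 63, 6);
}
```

With half of the key space in use, p is 0.5 and the mean is 2 attempts, which matches the figure quoted above.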

Furthermore, the variance of the geometric distribution is (1-p)/p^2, so even at 30 billion entries one standard deviation is just sqrt(2). Therefore I expect the loop to finish within 2 + 2*sqrt(2), or about 5, iterations almost all of the time (for p = .5, the chance of needing more than 5 tries is 0.5^5, about 3%). In other words, I would just not worry too much about it.
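If the recursion itself is still a concern, the retry can be written as a bounded loop. The following is only a sketch under stated assumptions: `generate_unique_string`, the `$is_taken` callback, and the cap of 100 attempts are hypothetical names and values introduced here. In the Dancer app the callback could wrap the `database->quick_select('urls', { shortcut => ... })` lookup from the question.

```perl
use strict;
use warnings;

my @chars = ('a'..'z', 'A'..'Z', '0'..'9', '_');

# $is_taken is a hypothetical callback returning true when the
# candidate is already stored; $max_attempts is an assumed cap.
sub generate_unique_string {
    my ($length, $is_taken, $max_attempts) = @_;
    $max_attempts //= 100;    # far above the handful of tries expected

    for my $attempt (1 .. $max_attempts) {
        # Build one candidate by picking $length random characters.
        my $candidate = join '', map { $chars[rand @chars] } 1 .. $length;
        return $candidate unless $is_taken->($candidate);
    }
    die "no unused string found after $max_attempts attempts\n";
}

# Usage with an in-memory hash standing in for the database.
my %stored   = (abc123 => 1);
my $shortcut = generate_unique_string(6, sub { exists $stored{ $_[0] } });
print "$shortcut\n";
```

The loop fails loudly instead of recursing without limit. Note that a check followed by an insert can still race between concurrent requests, so a unique constraint on the shortcut column would probably be needed as well.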
