创建自己的TinyURL的风格UID [英] Creating your own Tinyurl style uid

查看:131
本文介绍了创建自己的TinyURL的风格UID的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在写一个小文章在力所能及可读替代的GUID / UID的,例如那些用在TinyURL的为URL哈希(这往往是印在杂志上,所以需要很短)。

I'm writing a small article on humanly readable alternatives to Guids/UIDs, for example those used on TinyURL for the url hashes (which are often printed in magazines, so need to be short).

简单的UID,我生是 - 6个字符:无论是小写字母(az)或0-9。

The simple uid I'm generating is - 6 characters: either a lowercase letter (a-z) or 0-9.

根据我的计算船长,这是6互斥事件,虽然计算冲突的概率变得比P(A或B)= P(A)+ P(B),因为很明显它包括更难一点数字从code以下,就可以看到它的作品了是否使用50/50使用数字或字母。

"According to my calculations captain", that's 6 mutually exclusive events, although calculating the probability of a clash gets a little harder than P(A or B) = P(A) + P(B), as obviously it includes numbers and from the code below, you can see it works out whether to use a number or letter using 50/50.

我感兴趣的冲突率,若跌破code是预期的冲突率,你会得到从生成散列的逼真模拟。平均而言,我得到每百万40-50冲突,心中却暴露了UID不会立刻产生一百万次,但可能只有大约10到1000倍一分钟。

I'm interested in the clash rate and if the code below is a realistic simulation of anticipated clash rate you'd get from generating a hash. On average I get 40-50 clashes per million, however bare in mind the uid wouldn't be generated a million times at once, but probably only around 10-1000 times a minute.

每次有什么冲突的概率,任何人都可以提出这样做​​的更好的办法?

What is the probability of a clash each time, and can anyone suggest a better way of doing it?

static Random _random = new Random();

public static void main()
{
    // Size of the key, 6
    HashSet<string> set = new HashSet<string>();
    int clashes = 0;
    for (int n=0;n < 1000000;n++)
    {
        StringBuilder builder = new StringBuilder();

        for (int i =0;i < 7;i++)
        {
            if (_random.NextDouble() > 0.5)
            {
                builder.Append((char)_random.Next(97,123));
            }
            else
            {
                builder.Append(_random.Next(0,9).ToString());
            }
        }

        if (set.Contains(builder.ToString()))
        {
            clashes++;
            Console.WriteLine("clash: (" +n+ ")" +builder.ToString());
        }

        set.Add(builder.ToString());
        _random.Next();
        //Console.Write(builder.ToString());
    }

    Console.WriteLine("Clashes: " +clashes);
    Console.ReadLine();
}

更新: 这里所得到的制品从这个问题

我真的问了两个问题,在这里,所以我是作弊。答案我是后RCAR的,但是Sklivvz的也是答案的第二部分(可选)。是否有可能使一个自定义的唯一ID生成的数据库,或者会是客户端(这将是2可能读取第一)?

I really asked two questions here so I was cheating. The answer I was after was rcar's, however Sklivvz's is also the answer to the 2nd part (an alternative). Is it possible to make a custom unique id generator in a database, or would it be client side (which would be 2 possible reads first)?

这可以用于通过电话或印刷材料,我后在数据库或其他存储使用IDS总的想法,而不是一个巨16字节GUID

The general idea I was after was using Ids in databases or other stores that can be used by phone or printed material, not a giant 16 byte guid.

更新2:我把公式有两个以上的,而不是2个独立的人互斥事件(如得到一个'A'第一次并不意味着你不能得到一个'一第二次)。如果已经P(A和B)= P(A)×P(B)

UPDATE 2: I put the formula for two mutually exclusive events above instead of 2 independent ones (as getting an 'a' the first time doesn't mean you can't get an 'a' the second time). Should've been P(A and B) = P(A) x P(B)

推荐答案

针对一个特定ID碰撞的概率为:

The probability of a collision against one specific ID is:

p = ( 0.5 * ( (0.5*1/10) + (0.5*1/26) ) )^6

它是约1.7×10 ^ -9。

which is around 1.7×10^-9.

碰撞产生n个编号为1-P的n次方,所以你有大约0.17%,为每一个新的插入碰撞的机会后百万的ID已被插入,约1.7%10千万美元后后的概率身份证,1亿后,在16%左右。

The probability of a collision after generating n IDs is 1-p^n, so you'll have roughly a 0.17% chance of a collision for each new insertion after 1 million IDs have been inserted, around 1.7% after 10 million IDs, and around 16% after 100 million.

1000 IDS /分钟工程以约43万片/月,从而Sklivvz指出,使用一些递增ID很可能会是一个更好的方式去在这种情况下。

1000 IDs/minute works out to about 43 million/month, so as Sklivvz pointed out, using some incrementing ID is probably going to be a better way to go in this case.

编辑:

要解释的数学,他基本上是抛硬币,然后选择一个数字或字母的6倍。有一个0.5的概率硬币翻转匹配,然后50%的时间有1/10的机会匹配和的1/26的机会匹配50%的机会。这发生6次独立,所以你乘的概率在一起。

To explain the math, he's essentially flipping a coin and then picking a number or letter 6 times. There's a 0.5 probability that the coin flip matches, and then 50% of the time there's a 1/10 chance of matching and a 50% chance of a 1/26 chance of matching. That happens 6 times independently, so you multiply those probabilities together.

这篇关于创建自己的TinyURL的风格UID的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆