创建自己的TinyURL的风格UID [英] Creating your own Tinyurl style uid
问题描述
我正在写一个小文章在力所能及可读替代的GUID / UID的,例如那些用在TinyURL的为URL哈希(这往往是印在杂志上,所以需要很短)。
I'm writing a small article on humanly readable alternatives to Guids/UIDs, for example those used on TinyURL for the url hashes (which are often printed in magazines, so need to be short).
简单的UID,我生是 - 6个字符:无论是小写字母(az)或0-9。
The simple uid I'm generating is - 6 characters: either a lowercase letter (a-z) or 0-9.
根据我的计算船长,这是6互斥事件,虽然计算冲突的概率变得比P(A或B)= P(A)+ P(B),因为很明显它包括更难一点数字从code以下,就可以看到它的作品了是否使用50/50使用数字或字母。
"According to my calculations captain", that's 6 mutually exclusive events, although calculating the probability of a clash gets a little harder than P(A or B) = P(A) + P(B), as obviously it includes numbers and from the code below, you can see it works out whether to use a number or letter using 50/50.
我感兴趣的冲突率,若跌破code是预期的冲突率,你会得到从生成散列的逼真模拟。平均而言,我得到每百万40-50冲突,心中却暴露了UID不会立刻产生一百万次,但可能只有大约10到1000倍一分钟。
I'm interested in the clash rate and if the code below is a realistic simulation of anticipated clash rate you'd get from generating a hash. On average I get 40-50 clashes per million, however bare in mind the uid wouldn't be generated a million times at once, but probably only around 10-1000 times a minute.
每次有什么冲突的概率,任何人都可以提出这样做的更好的办法?
What is the probability of a clash each time, and can anyone suggest a better way of doing it?
static Random _random = new Random();
public static void main()
{
// Size of the key, 6
HashSet<string> set = new HashSet<string>();
int clashes = 0;
for (int n=0;n < 1000000;n++)
{
StringBuilder builder = new StringBuilder();
for (int i =0;i < 7;i++)
{
if (_random.NextDouble() > 0.5)
{
builder.Append((char)_random.Next(97,123));
}
else
{
builder.Append(_random.Next(0,9).ToString());
}
}
if (set.Contains(builder.ToString()))
{
clashes++;
Console.WriteLine("clash: (" +n+ ")" +builder.ToString());
}
set.Add(builder.ToString());
_random.Next();
//Console.Write(builder.ToString());
}
Console.WriteLine("Clashes: " +clashes);
Console.ReadLine();
}
更新: 这里所得到的制品从这个问题
我真的问了两个问题,在这里,所以我是作弊。答案我是后RCAR的,但是Sklivvz的也是答案的第二部分(可选)。是否有可能使一个自定义的唯一ID生成的数据库,或者会是客户端(这将是2可能读取第一)?
I really asked two questions here so I was cheating. The answer I was after was rcar's, however Sklivvz's is also the answer to the 2nd part (an alternative). Is it possible to make a custom unique id generator in a database, or would it be client side (which would be 2 possible reads first)?
这可以用于通过电话或印刷材料,我后在数据库或其他存储使用IDS总的想法,而不是一个巨16字节GUID
The general idea I was after was using Ids in databases or other stores that can be used by phone or printed material, not a giant 16 byte guid.
更新2:我把公式有两个以上的,而不是2个独立的人互斥事件(如得到一个'A'第一次并不意味着你不能得到一个'一第二次)。如果已经P(A和B)= P(A)×P(B)
UPDATE 2: I put the formula for two mutually exclusive events above instead of 2 independent ones (as getting an 'a' the first time doesn't mean you can't get an 'a' the second time). Should've been P(A and B) = P(A) x P(B)
推荐答案
针对一个特定ID碰撞的概率为:
The probability of a collision against one specific ID is:
p = ( 0.5 * ( (0.5*1/10) + (0.5*1/26) ) )^6
它是约1.7×10 ^ -9。
which is around 1.7×10^-9.
碰撞产生n个编号为1-P的n次方,所以你有大约0.17%,为每一个新的插入碰撞的机会后百万的ID已被插入,约1.7%10千万美元后后的概率身份证,1亿后,在16%左右。
The probability of a collision after generating n IDs is 1-p^n, so you'll have roughly a 0.17% chance of a collision for each new insertion after 1 million IDs have been inserted, around 1.7% after 10 million IDs, and around 16% after 100 million.
1000 IDS /分钟工程以约43万片/月,从而Sklivvz指出,使用一些递增ID很可能会是一个更好的方式去在这种情况下。
1000 IDs/minute works out to about 43 million/month, so as Sklivvz pointed out, using some incrementing ID is probably going to be a better way to go in this case.
编辑:
要解释的数学,他基本上是抛硬币,然后选择一个数字或字母的6倍。有一个0.5的概率硬币翻转匹配,然后50%的时间有1/10的机会匹配和的1/26的机会匹配50%的机会。这发生6次独立,所以你乘的概率在一起。
To explain the math, he's essentially flipping a coin and then picking a number or letter 6 times. There's a 0.5 probability that the coin flip matches, and then 50% of the time there's a 1/10 chance of matching and a 50% chance of a 1/26 chance of matching. That happens 6 times independently, so you multiply those probabilities together.
这篇关于创建自己的TinyURL的风格UID的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!