Unique random string with alphanumeric characters required in Ruby


Problem description


I'm using the following code to generate a unique 10-character random string of [A-Z a-z 0-9] in Ruby:

random_code = [*('a'..'z'),*('0'..'9'),*('A'..'Z')].shuffle[0, 10].join

However, this random string sometimes contains no digit or no uppercase character. Could you help me write a method that generates a unique random string containing at least one digit, one uppercase character and one lowercase character?

Solution

down   = ('a'..'z').to_a
up     = ('A'..'Z').to_a
digits = ('0'..'9').to_a
all    = down + up + digits
[down.sample, up.sample, digits.sample].
  concat(7.times.map { all.sample }).
  shuffle.
  join
  #=> "TioS8TYw0F"
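A quick way to confirm that a generated string meets the requirement is to test it against one regular expression per character class. The sketch below is only an illustration: the helper name valid_code? is hypothetical and not part of the original answer.

```ruby
# Hypothetical helper (not from the original answer): true when str has
# at least one lowercase letter, one uppercase letter and one digit.
def valid_code?(str)
  str.match?(/[a-z]/) && str.match?(/[A-Z]/) && str.match?(/[0-9]/)
end

down   = ('a'..'z').to_a
up     = ('A'..'Z').to_a
digits = ('0'..'9').to_a
all    = down + up + digits

# Same construction as above: one character from each class, plus seven more.
code = [down.sample, up.sample, digits.sample]
         .concat(7.times.map { all.sample })
         .shuffle
         .join

valid_code?(code)          # true for any string built this way
valid_code?("abcdefghij")  # false: no uppercase letter and no digit
```

Because one character is drawn from each class before the remaining seven are added, the check should never fail for strings built this way; it is mainly useful for testing other generators.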

[Edit: The above reflects a misunderstanding of the question. I'll leave it, however.] To have no character appear more than once:

def rnd_str
  down   = ('a'..'z').to_a
  up     = ('A'..'Z').to_a
  digits = ('0'..'9').to_a
  [extract1(down), extract1(up), extract1(digits)].
    concat(((down+up+digits).sample(7))).shuffle.join
end

def extract1(arr)
  # Remove and return one randomly chosen element (mutates arr).
  arr.delete_at(rand(arr.size))
end

rnd_str #=> "YTLe0WGoa1" 
rnd_str #=> "NrBmAnE9bT"

Something like down.shuffle!.shift (etc.) would have been more compact than extract1, but shuffling a whole array just to take one element was too inefficient to bear.

If you do not want to repeat random strings, simply keep a list of the ones you have generated. If you generate one that is already in the list, discard it and generate another. It's pretty unlikely you'll have to generate any extra ones, however. If, for example, you generate 100 random strings (satisfying the requirement of at least one lowercase letter, one uppercase letter and one digit), and each is treated as one of t equally likely outcomes, the chance that there will be one or more duplicate strings is only about one in 20 million:

t = 107_518_933_731
n = t+1
t = t.to_f
(1.0 - 100.times.reduce(1.0) { |prod,_| prod * (n -= 1)/t }).round(10)
  #=> 4.6e-08

where t = C(62,10) and C(62,10) is defined below.
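The "keep a list and discard repeats" idea can be sketched with a Set, which makes the membership test cheap. This is a minimal illustration under assumptions: the method names random_candidate and unique_code are hypothetical, and the candidate generator is a simple rejection loop rather than the code above.

```ruby
require 'set'

# Hypothetical generator: 10 distinct characters containing at least one
# lowercase letter, one uppercase letter and one digit.
def random_candidate
  pool = ('a'..'z').to_a + ('A'..'Z').to_a + ('0'..'9').to_a
  loop do
    str = pool.sample(10).join
    return str if str.match?(/[a-z]/) && str.match?(/[A-Z]/) && str.match?(/[0-9]/)
  end
end

# Keep drawing until the candidate has not been seen before.
# Set#add? returns nil when the element is already in the set.
def unique_code(seen)
  loop do
    code = random_candidate
    return code if seen.add?(code)
  end
end

seen  = Set.new
codes = Array.new(100) { unique_code(seen) }
```

In a real application the set of issued strings would presumably live in persistent storage rather than in memory, but the discard-and-retry loop stays the same.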

An alternative

There is a really simple way to do this that turns out to be pretty efficient: just sample without replacement until a sample is found that meets the requirement of at least one lowercase letter, one uppercase letter and one digit. We can do that as follows:

DOWN   = ('a'..'z').to_a
UP     = ('A'..'Z').to_a
DIGITS = ('0'..'9').to_a
ALL    = DOWN + UP + DIGITS

def rnd_str
  loop do
    arr = ALL.sample(10)
    # Array#& is set intersection: reject the sample if any class is missing.
    break arr.shuffle.join unless (DOWN & arr).empty? ||
                                  (UP & arr).empty? ||
                                  (DIGITS & arr).empty?
  end
end

rnd_str #=> "3jRkHcP7Ge" 
rnd_str #=> "B0s81x4Jto"

How many samples must we reject, on average, before finding a "good" one? It turns out (see below if you are really, really interested) that the probability of getting a "bad" string (i.e., a sample of 10 characters, drawn at random without replacement from the 62 elements of ALL, that has no lowercase letter, no uppercase letter or no digit) is only about 0.15 (15%). That means that 85% of the time no bad samples will be rejected before a good one is found.

It turns out that the expected number of bad strings that will be sampled, before a good string is sampled, is:

0.15/0.85 ≈ 0.18
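That ratio is the mean of a geometric distribution: if each sample is rejected with probability p, the expected number of rejections before the first acceptance is p/(1-p). A rough simulation can be used to sanity-check it. The sketch below is illustrative only; the names pool, good, trials and mean_rejections are assumptions, not part of the original answer.

```ruby
pool = ('a'..'z').to_a + ('A'..'Z').to_a + ('0'..'9').to_a

# A sample is "good" when it has a lowercase letter, an uppercase letter and a digit.
good = lambda do |arr|
  s = arr.join
  s.match?(/[a-z]/) && s.match?(/[A-Z]/) && s.match?(/[0-9]/)
end

trials   = 20_000
rejected = 0

trials.times do
  # Count one rejection for every bad sample drawn before a good one turns up.
  rejected += 1 until good.call(pool.sample(10))
end

mean_rejections = rejected.to_f / trials
puts mean_rejections
```

Since the draws are random, the printed value varies from run to run, but with this many trials it should land close to 0.15/0.85.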

The following shows how the above probability was derived, should anyone be interested.

Let n_down be the number of ways a sample of 10 can be drawn that has no lowercase letters:

n_down = C(36,10) = 36!/(10!*(36-10)!)

where (the binomial coefficient) C(36,10) equals the number of combinations of 36 "things" that can be "taken" 10 at a time, and equals:

C(36,10) = 36!/(10!*(36-10)!) #=> 254_186_856

Similarly,

n_up = n_down #=> 254_186_856

and

n_digits = C(52,10) #=> 15_820_024_220

We can add these three numbers together to obtain:

n_down + n_up + n_digits #=> 16_328_397_932

This is almost, but not quite, the number of ways to draw 10 characters, without replacement, such that the sample contains no lowercase letters, no uppercase letters or no digits. "Not quite" because there is a bit of double-counting going on: samples made up only of uppercase letters are counted in both n_down and n_digits, samples made up only of lowercase letters are counted in both n_up and n_digits, and the single sample made up of all ten digits is counted in both n_down and n_up. The necessary adjustment is as follows:

n_down + n_up + n_digits - 2*C(26,10) - 1
  #=> 16_317_774_461

To obtain the probability of drawing a sample of 10 from a population of 62, without replacement, that has no lowercase letter, no uppercase letter or no digit, we divide this number by the total number of ways 10 characters can be drawn from 62 without replacement:

(16_317_774_461.0/C(62,10)).round(2)
  #=> 0.15
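The counting argument can be reproduced with exact integer arithmetic. The sketch below is one possible way to do it; the method name choose and the variable names are hypothetical. Samples that miss two character classes at once are counted twice in the first sum, so each such group is subtracted once.

```ruby
# Binomial coefficient C(n, k) using exact integer arithmetic.
# After step i the accumulator equals C(n - k + i, i), so each division is exact.
def choose(n, k)
  (1..k).reduce(1) { |acc, i| acc * (n - k + i) / i }
end

no_lower = choose(36, 10)  # drawn from 26 uppercase letters + 10 digits
no_upper = choose(36, 10)  # drawn from 26 lowercase letters + 10 digits
no_digit = choose(52, 10)  # drawn from the 52 letters

# Samples missing two classes at once were counted twice above.
only_upper  = choose(26, 10)  # no lowercase and no digit
only_lower  = choose(26, 10)  # no uppercase and no digit
only_digits = choose(10, 10)  # no letters at all: the ten digits

bad   = no_lower + no_upper + no_digit - only_upper - only_lower - only_digits
total = choose(62, 10)
p_bad = bad.to_f / total

puts p_bad.round(2)
```

No sample can miss all three classes, so there is no further correction term.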
