How to generate a random alpha-numeric string?


Question

I've been looking for a simple Java algorithm to generate a pseudo-random alpha-numeric string. In my situation it would be used as a unique session/key identifier that would "likely" be unique over 500K+ generations (my needs don't really require anything much more sophisticated). Ideally, I would be able to specify a length depending on my uniqueness needs. For example, a generated string of length 12 might look something like "AEYGF7K0DM1X".

Recommended answer

Algorithm

To generate a random string, concatenate characters drawn randomly from the set of acceptable symbols until the string reaches the desired length.

Here's some fairly simple and very flexible code for generating random identifiers. Read the information that follows for important application notes.

import java.security.SecureRandom;
import java.util.Locale;
import java.util.Objects;
import java.util.Random;

public class RandomString {

    /**
     * Generate a random string.
     */
    public String nextString() {
        for (int idx = 0; idx < buf.length; ++idx)
            buf[idx] = symbols[random.nextInt(symbols.length)];
        return new String(buf);
    }

    public static final String upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    public static final String lower = upper.toLowerCase(Locale.ROOT);

    public static final String digits = "0123456789";

    public static final String alphanum = upper + lower + digits;

    private final Random random;

    private final char[] symbols;

    private final char[] buf;

    public RandomString(int length, Random random, String symbols) {
        if (length < 1) throw new IllegalArgumentException();
        if (symbols.length() < 2) throw new IllegalArgumentException();
        this.random = Objects.requireNonNull(random);
        this.symbols = symbols.toCharArray();
        this.buf = new char[length];
    }

    /**
     * Create an alphanumeric string generator.
     */
    public RandomString(int length, Random random) {
        this(length, random, alphanum);
    }

    /**
     * Create an alphanumeric string generator backed by a secure generator.
     */
    public RandomString(int length) {
        this(length, new SecureRandom());
    }

    /**
     * Create session identifiers.
     */
    public RandomString() {
        this(21);
    }

}



Usage examples

Create an insecure generator for 8-character identifiers:

// Requires: import java.util.concurrent.ThreadLocalRandom;
RandomString gen = new RandomString(8, ThreadLocalRandom.current());

Create a secure generator for session identifiers:

RandomString session = new RandomString();

Create a generator with easy-to-read codes for printing. The strings are longer than full alphanumeric strings to compensate for using fewer symbols:

String easy = RandomString.digits + "ACEFGHJKLMNPQRUVWXYabcdefhijkprstuvwx";
RandomString tickets = new RandomString(23, new SecureRandom(), easy);



Use as session identifiers

Generating session identifiers that are likely to be unique is not good enough, or you could just use a simple counter. Attackers hijack sessions when predictable identifiers are used.

There is tension between length and security. Shorter identifiers are easier to guess, because there are fewer possibilities. But longer identifiers consume more storage and bandwidth. A larger set of symbols helps, but might cause encoding problems if identifiers are included in URLs or re-entered by hand.
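The trade-off between alphabet size and length can be made concrete. The sketch below is not part of the original answer; the class name `IdLength` and the method `minLength` are hypothetical. It finds the shortest identifier whose space of possible values covers a chosen number of bits, using exact integer arithmetic.

```java
import java.math.BigInteger;

/** Hypothetical helper (an assumption, not from the answer above). */
final class IdLength {

    /**
     * Returns the smallest x such that alphabetSize^x >= 2^targetBits.
     * BigInteger keeps the comparison exact, so there is no floating-point
     * rounding trouble when a power lands exactly on the target.
     */
    static int minLength(int alphabetSize, int targetBits) {
        if (alphabetSize < 2 || targetBits < 1) throw new IllegalArgumentException();
        BigInteger target = BigInteger.ONE.shiftLeft(targetBits); // 2^targetBits
        BigInteger q = BigInteger.valueOf(alphabetSize);
        BigInteger space = BigInteger.ONE;                        // q^0
        int length = 0;
        while (space.compareTo(target) < 0) {
            space = space.multiply(q);                            // q^(length + 1)
            length++;
        }
        return length;
    }

    public static void main(String[] args) {
        System.out.println(minLength(62, 128)); // full alphanumeric alphabet
        System.out.println(minLength(16, 128)); // hexadecimal digits
    }
}
```

Under these assumptions, a 125-bit target needs 21 characters from the 62-symbol alphanumeric alphabet but 23 characters from the 47-symbol easy-to-read alphabet in the usage examples, which matches the lengths chosen there.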

The underlying source of randomness, or entropy, for session identifiers should come from a random number generator designed for cryptography. However, initializing these generators can sometimes be computationally expensive or slow, so effort should be made to re-use them when possible.
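One way to follow the re-use advice is to create a single `SecureRandom` and share it. The following is a minimal sketch under that assumption; the class `SharedSecureIds` and its method `next` are illustrative names, not part of the `RandomString` class above.

```java
import java.security.SecureRandom;

/** Illustrative sketch: one SecureRandom shared by every caller. */
final class SharedSecureIds {

    // Created once when the class loads, so the seeding cost is paid a single time.
    // SecureRandom is documented as safe for use by multiple threads.
    private static final SecureRandom RANDOM = new SecureRandom();

    private static final char[] SYMBOLS =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789".toCharArray();

    private SharedSecureIds() { }

    /** Returns a fresh identifier of the requested length. */
    static String next(int length) {
        if (length < 1) throw new IllegalArgumentException();
        char[] buf = new char[length];
        for (int idx = 0; idx < buf.length; ++idx)
            buf[idx] = SYMBOLS[RANDOM.nextInt(SYMBOLS.length)];
        return new String(buf);
    }

    public static void main(String[] args) {
        System.out.println(next(21));
        System.out.println(next(21));
    }
}
```

Unlike `RandomString`, this sketch allocates a new buffer per call, which makes the static method safe to call from several threads at once.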

Not every application requires security. Random assignment can be an efficient way for multiple entities to generate identifiers in a shared space without any coordination or partitioning. Coordination can be slow, especially in a clustered or distributed environment, and splitting up a space causes problems when entities end up with shares that are too small or too big.
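As a rough illustration of coordination-free assignment, the sketch below lets several threads draw identifiers independently and then counts the distinct results. Everything here (`UncoordinatedIds`, `generate`, the worker counts) is an assumption made for the example, and it uses the insecure `ThreadLocalRandom`, so it is not suitable for session identifiers.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadLocalRandom;

/** Illustrative sketch: workers pick identifiers without talking to each other. */
final class UncoordinatedIds {

    private static final char[] SYMBOLS =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789".toCharArray();

    /** Draws one identifier using the calling thread's own generator. */
    static String next(int length) {
        ThreadLocalRandom random = ThreadLocalRandom.current();
        char[] buf = new char[length];
        for (int idx = 0; idx < buf.length; ++idx)
            buf[idx] = SYMBOLS[random.nextInt(SYMBOLS.length)];
        return new String(buf);
    }

    /** Runs the workers and returns the set of distinct identifiers they produced. */
    static Set<String> generate(int workers, int perWorker, int length) {
        Set<String> ids = ConcurrentHashMap.newKeySet();
        Thread[] threads = new Thread[workers];
        for (int w = 0; w < workers; ++w) {
            threads[w] = new Thread(() -> {
                for (int i = 0; i < perWorker; ++i)
                    ids.add(next(length)); // no locking or partitioning of the id space
            });
            threads[w].start();
        }
        try {
            for (Thread t : threads) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return ids;
    }

    public static void main(String[] args) {
        // If no two workers collided, the set holds workers * perWorker entries.
        System.out.println(generate(4, 10_000, 12).size());
    }
}
```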

Identifiers generated without taking measures to make them unpredictable should be protected by other means if an attacker might be able to view and manipulate them, as happens in most web applications. There should be a separate authorization system that protects objects whose identifier can be guessed by an attacker without access permission.

Care must also be taken to use identifiers that are long enough to make collisions unlikely given the anticipated total number of identifiers. This is referred to as "the birthday paradox." The probability of a collision, $p$, is approximately $n^2 / (2q^x)$, where $n$ is the number of identifiers actually generated, $q$ is the number of distinct symbols in the alphabet, and $x$ is the length of the identifiers. This should be a very small number, like $2^{-50}$ or less.

Working this out shows that the chance of collision among 500k 15-character identifiers is about $2^{-52}$, which is probably less likely than undetected errors from cosmic rays, etc.
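The arithmetic can be checked with a few lines of Java. This sketch applies the approximation p ≈ n²/(2qˣ) in log space, so the result is the exponent of 2; the class and method names are hypothetical, and the 62-symbol alphabet is assumed from the `alphanum` constant above.

```java
/** Hypothetical helper: birthday-bound estimate expressed as a power of two. */
final class BirthdayBound {

    static double log2(double value) {
        return Math.log(value) / Math.log(2);
    }

    /**
     * Returns log2(p) for p = n^2 / (2 * q^x), computed as
     * 2*log2(n) - 1 - x*log2(q) to avoid overflowing q^x.
     */
    static double log2CollisionProbability(long n, int q, int x) {
        return 2 * log2(n) - 1 - x * log2(q);
    }

    public static void main(String[] args) {
        // 500k identifiers, 62 symbols, 15 characters: roughly -52.4
        System.out.println(log2CollisionProbability(500_000L, 62, 15));
    }
}
```

By the same estimate, a 12-character identifier drawn from only uppercase letters and digits (36 symbols), like the example in the question, comes out near $2^{-25}$ for 500k identifiers, well above the $2^{-50}$ guideline.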

According to their specification, UUIDs are not designed to be unpredictable, and should not be used as session identifiers.

UUIDs in their standard format take a lot of space: 36 characters for only 122 bits of entropy. (Not all bits of a "random" UUID are selected randomly.) A randomly chosen alphanumeric string packs more entropy in just 21 characters.
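The size comparison is easy to reproduce. The sketch below (names are illustrative) prints the length of a standard UUID string and the entropy of a 21-character string drawn uniformly from 62 symbols, assuming each character contributes log2(62) bits.

```java
import java.util.UUID;

/** Illustrative comparison of UUID text size with an alphanumeric identifier. */
final class UuidComparison {

    /** Entropy in bits of a string of the given length over 62 equally likely symbols. */
    static double alphanumericBits(int length) {
        return length * (Math.log(62) / Math.log(2));
    }

    public static void main(String[] args) {
        UUID uuid = UUID.randomUUID();
        System.out.println(uuid.toString().length()); // 32 hex digits plus 4 hyphens
        System.out.println(uuid.version());           // type 4 is the "random" variety
        System.out.println(alphanumericBits(21));     // a little over 125 bits
    }
}
```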

UUIDs are not flexible; they have a standardized structure and layout. This is their chief virtue as well as their main weakness. When collaborating with an outside party, the standardization offered by UUIDs may be helpful. For purely internal use, they can be inefficient.
