SQL Server Generate UNIQUE random string


Question

With the help of Zohar's answer, I got a SQL function that generates random strings, but I am running into a problem with duplicates.

Query

Create FUNCTION [dbo].[MaskGenerator]
(    
    @Prefix nvarchar(4000), -- use null or an empty string for no prefix    
    @suffix nvarchar(4000), -- use null or an empty string for no suffix    
    @MinLength int, -- the minimum length of the random part    
    @MaxLength int, -- the maximum length of the random part    
    @Count int, -- the maximum number of rows to return. Note: up to 1,000,000 rows           
    @CharType tinyint -- 1, 2 and 4 stand for lower-case, upper-case and digits. 
                      -- a bitwise combination of these values can be used to generate all possible combinations: 
                      -- 3: lower and upper, 5: lower and digits, 6: upper and digits, 7: lower, upper and digits
)
RETURNS TABLE
AS 
RETURN 

-- An inline tally table with 1,000,000 rows
WITH E1(N) AS (SELECT N FROM (VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10)) V(N)), -- 10
     E2(N) AS (SELECT 1 FROM E1 a, E1 b), --100
     E3(N) AS (SELECT 1 FROM E2 a, E2 b), --10,000
     Tally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY @@SPID) FROM E3 a, E2 b) --1,000,000 

SELECT TOP(@Count)  N As Number, 
        CONCAT(@Prefix, (
        SELECT  TOP (Length) 
                -- choose what char combination to use for the random part
                CASE @CharType 
                    WHEN 1 THEN LOWER
                    WHEN 2 THEN UPPER
                    WHEN 3 THEN IIF(Rnd % 2 = 0, LOWER, UPPER)
                    WHEN 4 THEN Digit
                    WHEN 5 THEN IIF(Rnd % 2 = 0, LOWER, Digit)
                    WHEN 6 THEN IIF(Rnd % 2 = 0, UPPER, Digit)
                    WHEN 7 THEN 
                        CASE Rnd % 3
                            WHEN 0 THEN LOWER
                            WHEN 1 THEN UPPER
                            ELSE Digit
                        END
                END
        FROM Tally As T0  
        -- create a random number from the guid using the GuidGenerator view
        CROSS APPLY (SELECT ABS(CHECKSUM(NewGuid)) As Rnd FROM GuidGenerator) AS RAND
        CROSS APPLY
        (
            -- generate a random lower-case char, upper-case char and digit
            SELECT  CHAR(97 + Rnd % 26) As LOWER, -- Random lower case letter
                    CHAR(65 + Rnd % 26) As UPPER,-- Random upper case letter
                    CHAR(48 + Rnd % 10) As Digit -- Random digit
        ) AS Chars
        WHERE  T0.N <> -T1.N -- Needed for the subquery to get re-evaluated for each row
        FOR XML PATH('') 
        ), @Suffix) As RandomString
FROM Tally As T1 
CROSS APPLY
(
    -- Select a random length between @MinLength and @MaxLength (inclusive)
    SELECT TOP 1 N As Length
    FROM Tally As T2
    CROSS JOIN GuidGenerator 
    WHERE T2.N >= @MinLength
    AND T2.N <= @MaxLength
    AND T2.N <> t1.N
    ORDER BY NewGuid
) As Lengths;
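
The function reads from a GuidGenerator view that the question does not show. A minimal sketch of what it might look like, assuming it only wraps NEWID() in a column named NewGuid (the name the function uses); a view is a common workaround because NEWID() cannot be called directly inside a user-defined function:

```sql
-- Assumed definition, not taken from the question:
-- exposes a fresh GUID to code that cannot call NEWID() itself.
CREATE VIEW dbo.GuidGenerator
AS
SELECT NEWID() AS NewGuid;
```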

The function above returns random strings based on its parameters. For example, the query below generates 100 random strings of the form Test_Product_ followed by one to four random digits. The result set contains duplicate values, which need to be eliminated. I tried applying ROW_NUMBER, but it slows the query down, and the requested number of rows is no longer returned.

SELECT * FROM dbo.MaskGenerator('Test_Product_',null,1,4,100,4) ORDER BY 2

I have made a fiddle demo here: SQL Fiddle, and my attempt is also here.

Recommended answer

Basically, this is an effect of the birthday problem.
The best solution I can offer as of now is to generate twice as many random strings as you need, then select the top 100 distinct values from them:

SELECT TOP 100 RandomString, ROW_NUMBER() OVER(ORDER BY @@SPID) As Number
FROM 
(
  SELECT DISTINCT RandomString 
  FROM dbo.MaskGenerator('Test_Product_',null,1,4,200,4)
) As Rnd
ORDER BY RandomString
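
To see why duplicates appear and why doubling the count is enough in this case, here is a rough estimate of the expected number of distinct strings. It assumes the length is picked roughly uniformly from 1 to 4, so that about a quarter of the n generated strings share each length; this is an approximation for illustration, not a measurement.

```latex
% Assumption: each length k in {1,2,3,4} is picked about equally often,
% so m = n/4 strings share each length and are drawn from 10^k values.
\[
  \mathbb{E}[\text{distinct}] \approx
  \sum_{k=1}^{4} 10^{k}\left(1 - \left(1 - 10^{-k}\right)^{m}\right)
\]
\[
  n = 100,\ m = 25:\quad 9.3 + 22.2 + 24.7 + 25.0 \approx 81
\]
\[
  n = 200,\ m = 50:\quad 9.9 + 39.5 + 48.8 + 49.9 \approx 148
\]
```

Under these assumptions, asking for 100 strings yields only about 81 distinct values on average, while asking for 200 leaves comfortable headroom. The one-digit strings alone make collisions practically certain, since only 10 of them exist.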

This might seem like a waste, since you're generating twice as many random strings as you need. However:


  1. I'm not sure that's actually the case. The query optimizer might just stop execution once you have 100 distinct values.

  2. Performance tests I've done for this function (on a relatively strong SQL Server 2016) show it is lightning-fast, at least with a small number of strings:


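  • Generating 200 strings averages around 23 milliseconds.

  • Generating 2,000 strings averages around 55 milliseconds.

  • Generating 100,000 strings averages around 2.8 seconds.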

Generating 1 million strings, however, averages around 30 seconds.
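
For anyone who wants to take comparable measurements, one option is SET STATISTICS TIME. This is only a sketch and not necessarily how the figures above were obtained. The @Discard variable is a hypothetical sink: assigning to it forces every string to be computed without sending the rows to the client.

```sql
DECLARE @Discard nvarchar(max); -- hypothetical sink variable

SET STATISTICS TIME ON;

-- Each row overwrites the variable, so the strings are generated but not returned.
SELECT @Discard = RandomString
FROM dbo.MaskGenerator('Test_Product_', NULL, 1, 4, 200, 4);

SET STATISTICS TIME OFF;
```

The CPU and elapsed times then appear in the messages output of the session.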
