从 T-SQL 中的文本中删除所有注释 [英] Remove all comments from a text in T-SQL

查看:32
本文介绍了从 T-SQL 中的文本中删除所有注释的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试从 NVARCHAR 值中删除所有注释.

I am trying to remove all the comments from a NVARCHAR value.

我不知道我将获得 NVARCHAR 变量的哪个值,我需要删除所有以 -- 开头的注释,直到行尾.

I don't know which value I will get to the NVARCHAR variable and I need to remove all the comments that start with -- until the end of the line.

例如:

-- Some Comments
SET NOCOUNT ON;

-- Some Comments

SELECT FirstName FROM dbo.Users WHERE Id = @Id;

删除评论后应该是这样的:

After removing the comments it should look like this:

SET NOCOUNT ON;



SELECT FirstName FROM dbo.Users WHERE Id = @Id;

在 T-SQL 中有什么简单的方法吗?

Is there any easy way doing it in T-SQL?

提前致谢.

推荐答案

使用 ngramsN4k:

CREATE FUNCTION dbo.NGramsN4K
(
  @string nvarchar(4000), -- Input string
  @N      int             -- requested token size
)
/****************************************************************************************
Purpose:
 A character-level N-Grams function that outputs a contiguous stream of @N-sized tokens 
 based on an input string (@string). Accepts strings up to 4000 nvarchar characters long.
 For more information about N-Grams see: http://en.wikipedia.org/wiki/N-gram. 

Compatibility: 
 SQL Server 2008+, Azure SQL Database

Syntax:
--===== Autonomous
 SELECT position, token FROM dbo.NGramsN4K(@string,@N);

--===== Against a table using APPLY
 SELECT s.SomeID, ng.position, ng.token
 FROM dbo.SomeTable s
 CROSS APPLY dbo.NGramsN4K(s.SomeValue,@N) ng;

Parameters:
 @string  = The input string to split into tokens.
 @N       = The size of each token returned.

Returns:
 Position = bigint; the position of the token in the input string
 token    = nvarchar(4000); a @N-sized character-level N-Gram token

Developer Notes:  
 1. NGramsN4K is not case sensitive

 2. Many functions that use NGramsN4K will see a huge performance gain when the optimizer
    creates a parallel execution plan. One way to get a parallel query plan (if the 
    optimizer does not chose one) is to use make_parallel by Adam Machanic which can be 
    found here:
 sqlblog.com/blogs/adam_machanic/archive/2013/07/11/next-level-parallel-plan-porcing.aspx

 3. When @N is less than 1 or greater than the datalength of the input string then no 
    tokens (rows) are returned. If either @string or @N are NULL no rows are returned.
    This is a debatable topic but the thinking behind this decision is that: because you
    can't split 'xxx' into 4-grams, you can't split a NULL value into unigrams and you 
    can't turn anything into NULL-grams, no rows should be returned.

    For people who would prefer that a NULL input forces the function to return a single
    NULL output you could add this code to the end of the function:

    UNION ALL 
    SELECT 1, NULL
    WHERE NOT(@N > 0 AND @N <= DATALENGTH(@string)) OR (@N IS NULL OR @string IS NULL);

 4. NGramsN4K is deterministic. For more about deterministic functions see:
    https://msdn.microsoft.com/en-us/library/ms178091.aspx

Usage Examples:
--===== Turn the string, 'abcd' into unigrams, bigrams and trigrams
 SELECT position, token FROM dbo.NGramsN4K('abcd',1); -- unigrams (@N=1)
 SELECT position, token FROM dbo.NGramsN4K('abcd',2); -- bigrams  (@N=2)
 SELECT position, token FROM dbo.NGramsN4K('abcd',3); -- trigrams (@N=3)

--===== How many times the substring "AB" appears in each record
 DECLARE @table TABLE(stringID int identity primary key, string nvarchar(100));
 INSERT @table(string) VALUES ('AB123AB'),('123ABABAB'),('!AB!AB!'),('AB-AB-AB-AB-AB');

 SELECT string, occurances = COUNT(*) 
 FROM @table t
 CROSS APPLY dbo.NGramsN4K(t.string,2) ng
 WHERE ng.token = 'AB'
 GROUP BY string;

----------------------------------------------------------------------------------------
Revision History:
 Rev 00 - 20170324 - Initial Development - Alan Burstein
****************************************************************************************/
RETURNS TABLE WITH SCHEMABINDING AS RETURN
WITH
L1(N) AS
(
  SELECT 1 FROM (VALUES -- 64 dummy values to CROSS join for 4096 rows
        ($),($),($),($),($),($),($),($),($),($),($),($),($),($),($),($),
        ($),($),($),($),($),($),($),($),($),($),($),($),($),($),($),($),
        ($),($),($),($),($),($),($),($),($),($),($),($),($),($),($),($),
        ($),($),($),($),($),($),($),($),($),($),($),($),($),($),($),($)) t(N)
),
iTally(N) AS 
(
  SELECT 
  TOP (ABS(CONVERT(BIGINT,((DATALENGTH(ISNULL(@string,''))/2)-(ISNULL(@N,1)-1)),0)))
    ROW_NUMBER() OVER (ORDER BY (SELECT NULL))    -- Order by a constant to avoid a sort
  FROM L1 a CROSS JOIN L1 b                       -- cartesian product for 4096 rows (16^2)
)
SELECT
  position = N,                                   -- position of the token in the string(s)
  token    = SUBSTRING(@string,CAST(N AS int),@N) -- the @N-Sized token
FROM iTally
WHERE @N > 0 
-- Protection against bad parameter values
AND @N <= (ABS(CONVERT(BIGINT,((DATALENGTH(ISNULL(@string,''))/2)-(ISNULL(@N,1)-1)),0)));

您可以使用以下解决方案来解决它.这将仅限于 NVARCHAR(4000) 但如果您需要,我可以组合一个 NVARCHAR(max) 版本.另请注意,我的解决方案忽略了以--"开头的行,并将所有内容抓取到--",其中注释更深.我不是在处理 /* 这种注释样式 */ 但是可以修改为这样做.

You can solve it using the solution below. This will be limited to NVARCHAR(4000) but I can put together an NVARCHAR(max) version if you need one. Also note that my solution ignores lines that begin with "--" and grabs everything up to "--" where the comment is deeper in. I'm not adressing /* this comment style */ but could be modified to do so.

解决方案

-- sample stored proc
declare @storedproc varchar(8000) =
'-- Some Comments
SET NOCOUNT ON;

-- Some Comments

SELECT FirstName -- we only need the first name
FROM dbo.Users WHERE Id = @Id;';

--select @storedproc;

-- Solution
select cleanedProc = 
(
  select substring(item, 1, isnull(nullif(charindex('--', item),0)-1,nextPos))+br
  from
  (
    select 0 union all
    select position from dbo.ngramsN4k(@storedproc,1) 
    where token = char(10)
  ) d(position)
  cross apply (values (char(10), d.position+1,
           isnull(nullif(charindex(char(10), @storedproc, d.position+1),0),8000))
  ) p(br, startPos, nextPos)
  cross apply (values (substring(@storedproc, startPos, nextPos-startPos))) split(item)
  where item not like '--%'
  order by position
  for xml path(''), type
).value('(text())[1]', 'varchar(8000)');

之前

-- Some Comments
SET NOCOUNT ON;

-- Some Comments

SELECT FirstName -- we only need the first name
FROM dbo.Users WHERE Id = @Id;

之后

SET NOCOUNT ON;


SELECT FirstName 
FROM dbo.Users WHERE Id = @Id;

这篇关于从 T-SQL 中的文本中删除所有注释的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆