How to select rows that contain non-English characters in SQL Server 2005 (it should filter only non-English characters, not special characters)

Problem description

My table has a column that contains non-English characters (characters from several different languages) as well as special characters. I need to filter out only the rows with non-English characters. Special characters should not be filtered.

I tried several different methods, but each one failed to filter a few rows. Could someone please help me with this? Thanks in advance.

Example: the column LOCATION contains the following rows:

row 1: துய இம்மானுவேல் தேவாலயம், North Street, Idyanvillai, Tamil Nadu, India

row 2:Dr.Hakim M.Asgar Ali's ROY MEDICAL CENTRE™ Unani Clinic In Kerala India, Thycaud Hospital Road, Opp. Amritha Hotel,, Thycaud.P.O.,, Thiruvananthapuram, Kerala, India

row 3: ಕಾಳಿಕಾಂಬ ದೇವಿ ದೇವಸ್ಥಾನ, Shivaji Nagar, Davangere, Karnataka, India

The rows above contain characters in several languages. Can anyone help me select only row 2? Thanks.

Solution

T-SQL's string-handling capability is pretty rudimentary.

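If the column is NVARCHAR and the "non-English" rows are the ones that use characters outside the collation's 8-bit code page, you can try something like

SELECT * FROM MyTable WHERE MyField = CAST(MyField AS VARCHAR(MAX))

to pull only rows that survive conversion to that code page (for example, Windows-1252 under a Latin1 collation; SQL Server 2005 has no UTF-8 VARCHAR). Characters outside the code page typically become '?', so those rows no longer compare equal. Give the cast an explicit length: a bare VARCHAR in CAST defaults to 30 characters and would truncate longer values.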
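
A minimal sketch of this round-trip test, run against shortened versions of the three rows from the question. The table variable @t, its Id column, and the Latin1_General_CI_AS collation are assumptions made for illustration; they are not part of the original question or answer.

```sql
-- Illustrative data: shortened copies of the question's rows.
DECLARE @t TABLE (
  Id INT,
  Location NVARCHAR(400) COLLATE Latin1_General_CI_AS  -- assumed collation (code page 1252)
);

-- SQL Server 2005 has no multi-row VALUES clause, so insert one row at a time.
INSERT INTO @t (Id, Location) VALUES (1, N'துய இம்மானுவேல் தேவாலயம், North Street, Idyanvillai');
INSERT INTO @t (Id, Location) VALUES (2, N'ROY MEDICAL CENTRE™ Unani Clinic, Thycaud Hospital Road');
INSERT INTO @t (Id, Location) VALUES (3, N'ಕಾಳಿಕಾಂಬ ದೇವಿ ದೇವಸ್ಥಾನ, Shivaji Nagar, Davangere');

-- Keep only rows that are unchanged by a round trip through the code page.
-- The explicit length avoids the 30-character default of a bare VARCHAR in CAST.
SELECT Id, Location
FROM @t
WHERE Location = CAST(Location AS VARCHAR(400));
```

Under this collation the query should return only row 2: the ™ sign exists in code page 1252, while the Tamil and Kannada letters do not. A different collation has a different code page, so which characters count as "English" here depends on that choice.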

The only way I know how to test whether a field is drawn from an arbitrary set of characters is with a user-defined function, like this:

CREATE FUNCTION dbo.IsAllowed (@input VARCHAR(MAX)) RETURNS BIT
-- Returns 1 if every character of @input is in @allowables, 0 otherwise
-- (an empty string also returns 0).
-- Usage: SELECT dbo.IsAllowed('Hello');          -- returns 1
--        SELECT dbo.IsAllowed('Hello, world!');  -- returns 0
-- Note: under a case-insensitive collation CHARINDEX ignores case,
--       so @allowables doesn't need both upper and lower case.
--       VARCHAR(MAX) may behave differently under SQL Server 2005 than 2008+,
--       and a fixed VARCHAR size might be necessary.
AS
BEGIN
  -- SQL Server 2005 does not accept DECLARE with an initializer, so use SET.
  DECLARE @allowables CHAR(26);
  DECLARE @allowed BIT;
  DECLARE @index INT;
  SET @allowables = 'abcdefghijklmnopqrstuvwxyz';
  SET @allowed = 0;
  SET @index = 1;
  WHILE @index <= LEN(@input)
  BEGIN
    -- Stop at the first character that is not in the allowed set.
    IF CHARINDEX(SUBSTRING(@input, @index, 1), @allowables) = 0
    BEGIN
      SET @allowed = 0;
      BREAK;
    END
    ELSE
    BEGIN
      SET @allowed = 1;
      SET @index = @index + 1;
    END
  END
  RETURN @allowed;
END

User-defined functions can be applied to columns in a query, like this:

SELECT * FROM MyTable WHERE dbo.IsAllowed(MyField) = 1

Note that the schema name (dbo in this case) is required when calling a scalar user-defined function.
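
When the allowed set is a contiguous code-point range plus a few extra characters, a negated character class in LIKE may do the same check without a function. This is not part of the original answer. It is a sketch that assumes "English" means printable ASCII (space through tilde) plus the ™ sign, and it reuses the placeholder names MyTable and MyField from above.

```sql
-- [^ -~™] matches any single character that is NOT in the range space..tilde
-- and is not the trademark sign. A row is kept only when no such character occurs.
-- The binary collation makes the range compare by code point instead of dictionary order.
SELECT *
FROM MyTable
WHERE MyField COLLATE Latin1_General_BIN NOT LIKE N'%[^ -~™]%';
```

Tabs and line breaks fall outside the space-to-tilde range, so rows containing them are excluded as well. Any other special characters that should pass need to be added inside the brackets.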

If a T-SQL user-defined function is inadequate, you can also use a CLR Function. Then you could apply a regexp or whatever to a column. Because they break portability and pose a security risk, many sysadmins don't allow CLR functions. (This includes Microsoft's SQL Azure product.)
