测试列中是否包含特殊字符或仅包含字符/数字 [英] Test column for special characters or only characters / numbers

查看:62
本文介绍了测试列中是否包含特殊字符或仅包含字符/数字的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我尝试使用通用正则表达式属性和 NOT LIKE 子句查找特殊字符,但结果一直令人困惑.研究表明,它不能像在SQL Server或其他地方那样工作.

I tried finding special characters using generic regex attributes and NOT LIKE clause but have been getting confusing results. The research suggested that it does not work the way it works in SQL Server or elsewhere.

  1. 用于查找是否有任何字符
  2. 查找是否有数字
  3. 查找是否有特殊字符

之类的'%[^ 0-9]%''%[^ aZ]%'在查找非数字数据是否有效时效果不佳是否可用以及是否存在非字母数据

like '%[^0-9]%' or '%[^a-Z]%' does not work very well when finding if non-numeric data is available and if non-alphabetical data is present, respectively

SELECT column1 from some_table where column1 like '%[^0-9]%'; 
SELECT column1 from some_table where column1 like '%[^a-Z]%' 
SELECT column1 from some_table where column1 like '%[^a-Z0-9]%' 

还注意到人们使用-> 不喜欢'%[^ 0-9]%'

Have also noted that people use -> NOT like '%[^0-9]%'

推荐答案

Postgres 正则表达式运算符 .

Postgres LIKE does not support regular expressions.
You need the regular expression operator ~.

标准SQL还定义了 类似 作为上述内容的一种奇怪组合,但不要使用它.参见:

Standard SQL also defines SIMILAR TO as an odd mix of the above, but rather don't use that. See:

用于查找是否有任何字符

For finding if there is any character

...表示任何字符:

... WHERE col <> '';                        -- any character at all?

因此,既不为NULL也不为空.参见:

So neither NULL nor empty. See:

...表示任何字母字符(字母):

... meaning any alphabetic character (letter):

... WHERE col ~ '[[:alpha:]]';              -- any letters?

[[[:alpha:]] 是所有字母字符的字符类-不只是ASCII字母 [A-Za-z] ,还包括 [ÄéÒçòý]

[[:alpha:]] is the character class for all alphabetic characters - not just the ASCII letters [A-Za-z], includes letters like [ÄéÒçòý] etc.

查找是否有数字

...表示任何位数:

... WHERE col ~ '\d';                       -- any digits?

\ d [[:digit:]] 的类简写.

用于查找是否有特殊字符

For finding if there is any special character

...表示除数字和字母之外的任何内容:

... meaning anything except digits and letters:

... WHERE col ~ '\W';                       -- anything but digits & letters? 

\ W [^ [:alnum:] _] 的类简写(下划线已排除-该手册目前在那儿令人困惑).

\W is the class shorthand for [^[:alnum:]_] (underscore excluded - the manual is currently confusing there).

...表示除数字,字母和空格之外的任何内容:

... meaning anything except digits, letters and plain space:

... WHERE col ~ '[^[:alnum:]_ ]'            -- ... and space

这是拼写的类简写 \ W ,另外不包括空格.

That's the class shorthand \W spelled out, additionally excluding plain space.

...表示除数字,字母和任何空格外的任何内容:

... meaning anything except digits, letters and any white space:

... WHERE col ~ '[^[:alnum:]_\s]'           -- ... and any white space
... WHERE col ~ '[^[:alnum:]_[:space:]]'    -- ... the same spelled out

这次不包括Posix字符类 space 定义的所有空白.关于Unicode中的空白":

This time excluding all white space as defined by the Posix character class space. About "white space" in Unicode:

...表示任何非ASCII字符:

... meaning any non-ASCII character:

如果您的数据库集群使用UTF8编码运行,则有一个简单,非常快速的破解方法:

If your DB cluster runs with UTF8 encoding, there is a simple, very fast hack:

... WHERE octet_length(col) > length(col);  -- any non-ASCII letter?

octet_length()计算字符串中的 bytes ,而 length()(别名: character_length()>或 char_length())计数字符串中的字符.所有基本ASCII字符( [\ x00- \ x7F] )在UTF-8中均用1个字节编码,所有其他字符使用2-4个字节.字符串中的任何非ASCII字符都会使表达式 true .

octet_length()counts the bytes in the string, while length() (aliases: character_length() or char_length()) counts characters in the string. All basic ASCII characters ([\x00-\x7F]) are encoded with 1 byte in UTF-8, all other characters use 2 - 4 bytes. Any non-ASCII character in the string makes the expression true.

进一步阅读:

PostgreSQL 9.1在select中使用collat​​e陈述

这篇关于测试列中是否包含特殊字符或仅包含字符/数字的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆