MySQL REGEXP word boundaries [[:<:]] [[:>:]] and double quotes


Question

I'm trying to match whole-word expressions with the MySQL REGEXP function. A problem arises when double quotes are involved.

The MySQL documentation says: "To use a literal instance of a special character in a regular expression, precede it by two backslash (\) characters."

But these queries all return 0:

SELECT '"word"' REGEXP '[[:<:]]"word"[[:>:]]';             -> 0
SELECT '"word"' REGEXP '[[:<:]]\"word\"[[:>:]]';           -> 0
SELECT '"word"' REGEXP '[[:<:]]\\"word\\"[[:>:]]';         -> 0
SELECT '"word"' REGEXP '[[:<:]] word [[:>:]]';             -> 0
SELECT '"word"' REGEXP '[[:<:]][[.".]]word[[.".]][[:>:]]'; -> 0

What else can I try to get a 1? Or is this impossible?

Recommended answer

Let me quote the documentation first:

[[:<:]], [[:>:]]

These markers stand for word boundaries. They match the beginning and end of words, respectively. A word is a sequence of word characters that is not preceded by or followed by word characters. A word character is an alphanumeric character in the alnum class or an underscore (_).

The documentation shows the reason behind your problem, and it has nothing to do with escaping. You are trying to match the word boundary [[:<:]] right at the beginning of the string. That cannot work, because a word boundary separates a word character from a non-word character. In your case the first character is a ", which is not a word character, so no word begins at that position. The same goes for the last " and [[:>:]].
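The boundary rule can be checked one position at a time. The queries below are a small sketch using the test string from the question. They assume a MySQL version that still uses the Henry Spencer regex library (before 8.0.4); later versions use ICU, which does not support [[:<:]] and [[:>:]] and offers \\b instead.

```sql
-- [[:<:]] needs a word character right after it; a double quote is not one.
SELECT '"word"' REGEXP '[[:<:]]"';   -- 0: no word starts at the quote
SELECT '"word"' REGEXP '"[[:<:]]w';  -- 1: a word starts between " and w

-- [[:>:]] needs a word character right before it.
SELECT '"word"' REGEXP '"[[:>:]]';   -- 0: no word ends right after a quote
SELECT '"word"' REGEXP 'd[[:>:]]"';  -- 1: a word ends between d and "
```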

For this to work, you need to change your expression slightly:

"[[:<:]]word[[:>:]]"
 ^^^^^^^    ^^^^^^^

Notice how each word boundary now separates a non-word character from a word character: the " from the w at the beginning, and the d from the " at the end of the string.
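As a quick check, the pattern with the quotes moved outside the boundary markers can be run against a few strings. This is a sketch under the same assumption as the question (a MySQL version before 8.0.4, where [[:<:]] and [[:>:]] are supported); the second and third test strings are made up for illustration.

```sql
-- Quotes sit outside the markers, so each marker has a word character on one side.
SELECT '"word"'         REGEXP '"[[:<:]]word[[:>:]]"';  -- 1
SELECT '"words"'        REGEXP '"[[:<:]]word[[:>:]]"';  -- 0: d is followed by s, so no word ends there
SELECT 'say "word" now' REGEXP '"[[:<:]]word[[:>:]]"';  -- 1: the pattern is not anchored to the string ends
```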

Edit: If you always want a word boundary at the start and end of the expression, without knowing whether there will be an actual boundary, you might use the following expression:

([[:<:]]|^)"word"([[:>:]]|$)

This matches either a word boundary or the start of the string (^) at the beginning, and either a word boundary or the end of the string ($) at the end. I really advise you to study the data you are trying to match, look for common patterns, and not use regular expressions if they are not the right tool for the job.
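The following sketch shows which branch of each alternation does the work for different inputs. It again assumes a MySQL version before 8.0.4; the unquoted search term and the extra test strings are hypothetical examples, not part of the original question.

```sql
-- Quoted term: no word boundary exists outside the quotes, so ^ and $ match instead.
SELECT '"word"' REGEXP '([[:<:]]|^)"word"([[:>:]]|$)';     -- 1

-- Unquoted term: the word-boundary branches match.
SELECT 'word'        REGEXP '([[:<:]]|^)word([[:>:]]|$)';  -- 1
SELECT 'a word here' REGEXP '([[:<:]]|^)word([[:>:]]|$)';  -- 1
SELECT 'swords'      REGEXP '([[:<:]]|^)word([[:>:]]|$)';  -- 0: word sits inside a longer word
```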

