MySQL regex with UTF-8 characters

Question

I am trying to get data from a MySQL database via REGEXP, matching words with or without special UTF-8 characters.

Let me explain with an example:

If the user enters a word like sirena, the query should return rows that contain words like sirena, siréna, šíreňá and so on. It should also work the other way round: when the user enters siréná, it should return the same results.

I am trying to search with REGEXP, and my query looks like this:

SELECT * FROM `content` WHERE `text` REGEXP '[sšŠ][iíÍ][rŕŔřŘ][eéÉěĚ][nňŇ][AaáÁäÄ0]'

It works only when the database contains the word sirena, but not when it contains the word siréňa.

Is this caused by how MySQL handles UTF-8? (The collation of the column is utf8_general_ci.)

Thanks!

Recommended answer

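MySQL's regular expression library does not support UTF-8. It compares byte by byte, so a multi-byte character inside a bracket expression is treated as several separate bytes rather than as one character.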

See Bug #30241, "Regular expression problems", which has been open since 2007. They will have to change the regular expression library they use before that can be fixed, and I haven't found any announcement of when or if they will do this.
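To see why the pattern from the question finds sirena but not siréňa, here is a rough sketch of byte-wise matching. It uses Python's `re` module on encoded bytes purely as a stand-in for a regex engine that is not multi-byte aware; it is not MySQL's actual library, and the helper names are made up for illustration.

```python
import re

# Character classes copied from the question's REGEXP pattern.
PATTERN = "[sšŠ][iíÍ][rŕŔřŘ][eéÉěĚ][nňŇ][AaáÁäÄ0]"


def matches_as_characters(word: str) -> bool:
    """Match with a Unicode-aware engine: each class member is one character."""
    return re.search(PATTERN, word) is not None


def matches_as_bytes(word: str) -> bool:
    """Match byte-wise (assumed to mimic an engine without UTF-8 support).

    After encoding, every bracket expression becomes a set of single bytes,
    so "é" (bytes C3 A9) can never be consumed by one bracket expression.
    """
    return re.search(PATTERN.encode("utf-8"), word.encode("utf-8")) is not None


for word in ("sirena", "siréňa"):
    print(word, matches_as_characters(word), matches_as_bytes(word))
```

In the byte-wise case `[eéÉěĚ]` consumes only the first byte of é (C3), and the next class `[nňŇ]` then has to match the leftover byte A9, which it cannot. That is consistent with the behaviour described in the question.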

The only workaround I've seen is to search for specific HEX strings:

mysql> SELECT * FROM `content` WHERE HEX(`text`) REGEXP 'C3A9C588';
+----------+
| text     |
+----------+
| siréňa   |
+----------+
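The HEX workaround can be generalised by generating the pattern from the user's input. The following is only a sketch: the `VARIANTS` table, `fold`, `hex_of` and `hex_pattern` are hypothetical names, the table is based on the character classes in the question, and `hex_of` is assumed to produce the same uppercase string as MySQL's `HEX()` on a UTF-8 column.

```python
import re
import unicodedata

# Hypothetical variant table, based on the classes used in the question.
VARIANTS = {
    "s": "sšŠ",
    "i": "iíÍ",
    "r": "rŕŔřŘ",
    "e": "eéÉěĚ",
    "n": "nňŇ",
    "a": "aAáÁäÄ",
}


def fold(word: str) -> str:
    """Strip combining accents and lowercase, so 'siréná' becomes 'sirena'."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).lower()


def hex_of(text: str) -> str:
    """Uppercase hex of the UTF-8 bytes (assumed equivalent to HEX(`text`))."""
    return text.encode("utf-8").hex().upper()


def hex_pattern(word: str) -> str:
    """Build a regex over the hex string with one alternation per letter."""
    groups = []
    for ch in fold(word):
        alternatives = sorted({hex_of(v) for v in VARIANTS.get(ch, ch)})
        groups.append("(" + "|".join(alternatives) + ")")
    # The prefix keeps the match aligned to whole bytes (two hex digits each).
    return "^([0-9A-F]{2})*" + "".join(groups)


pattern = hex_pattern("siréná")
print(pattern)
print(bool(re.search(pattern, hex_of("siréňa"))))
```

The resulting string would be passed as a bound parameter to a query of the form ``WHERE HEX(`text`) REGEXP ?``. Without the byte-alignment prefix, a pair such as `73` could also match across the boundary between two adjacent bytes in the hex string. Note that calling `HEX()` on the column prevents the use of an index, so this is likely to be slow on large tables.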




Re your comment:

No, I don't know of any solution with MySQL.

You might have to switch to PostgreSQL, because that RDBMS supports \u codes for Unicode characters in its regular expression syntax.
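As a rough illustration of what such a pattern could look like, the sketch below turns the question's character classes into bracket expressions that use `\uXXXX` escapes. The helper names `pg_class` and `pg_pattern` are hypothetical, and the output is assumed to be used with PostgreSQL's advanced regular expression syntax (for example with the `~` operator); I have not verified it against a running server.

```python
def pg_class(chars: str) -> str:
    """Return a bracket expression, escaping non-alphanumerics as \\uXXXX."""
    parts = []
    for ch in chars:
        code = ord(ch)
        if ch.isascii() and ch.isalnum():
            parts.append(ch)                  # plain ASCII letters and digits
        elif code <= 0xFFFF:
            parts.append(f"\\u{code:04X}")    # four hex digits
        else:
            parts.append(f"\\U{code:08X}")    # eight hex digits beyond the BMP
    return "[" + "".join(parts) + "]"


def pg_pattern(classes: list[str]) -> str:
    """Concatenate one bracket expression per position in the word."""
    return "".join(pg_class(c) for c in classes)


# Classes taken from the question's pattern.
print(pg_pattern(["sšŠ", "iíÍ", "rŕŔřŘ", "eéÉěĚ", "nňŇ", "AaáÁäÄ0"]))
```

Writing the escapes out keeps the pattern pure ASCII, so it does not depend on the encoding of the client connection.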
