How to make MySQL handle strings like SQLite does, with regard to Unicode and collation?


Question

I've been researching this question for several hours now, on SO, in the MySQL docs, and elsewhere, but still can't find a satisfactory solution. The problem is:

What is the simplest way to make MySQL treat strings the way SQLite does, without any extra "smart" conversions?

For example, the following works perfectly in SQLite:

CREATE TABLE `dummy` (`key` VARCHAR(255) NOT NULL UNIQUE);

INSERT INTO `dummy` (`key`) VALUES ('one');
INSERT INTO `dummy` (`key`) VALUES ('one ');
INSERT INTO `dummy` (`key`) VALUES ('One');
INSERT INTO `dummy` (`key`) VALUES ('öne');

SELECT * FROM `dummy`;
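As a quick way to reproduce the SQLite side of the comparison, here is a minimal sketch using Python's standard-library sqlite3 module with an in-memory database. The helper name insert_keys is only illustrative and not part of the original question; the SQL is the same as above, with the values passed as parameters.

```python
import sqlite3

def insert_keys(keys):
    """Insert each key into a UNIQUE column and return what was stored."""
    conn = sqlite3.connect(":memory:")
    # Same table definition as in the question; SQLite accepts backticks.
    conn.execute("CREATE TABLE `dummy` (`key` VARCHAR(255) NOT NULL UNIQUE)")
    for k in keys:
        # Raises sqlite3.IntegrityError if SQLite considers k a duplicate.
        conn.execute("INSERT INTO `dummy` (`key`) VALUES (?)", (k,))
    rows = [r[0] for r in conn.execute("SELECT `key` FROM `dummy` ORDER BY rowid")]
    conn.close()
    return rows

stored = insert_keys(["one", "one ", "One", "öne"])
print(stored)
```

With SQLite's default BINARY collation, all four values are distinct, so none of the inserts violates the UNIQUE constraint.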

However, in MySQL, with the following settings:

[client]
default-character-set = utf8mb4

[mysql]
default-character-set = utf8mb4

[mysqld]
character-set-client-handshake = FALSE
character-set-server = utf8mb4
collation-server = utf8mb4_bin

and the following CREATE DATABASE statement:

CREATE DATABASE `dummydb` DEFAULT CHARACTER SET utf8mb4 DEFAULT COLLATE utf8mb4_bin;

it still fails on the second INSERT.

I'd rather keep string column declarations as simple as possible, SQLite's TEXT being the ideal. It looks like VARBINARY is the way to go, but I would still like to hear your opinions on any other, potentially better options.

Addendum: the output of SHOW CREATE TABLE dummy is:

mysql> SHOW CREATE TABLE dummy;
+-------+-----------------------------------------------------
| Table | Create Table                                        
+-------+-----------------------------------------------------
| dummy | CREATE TABLE `dummy` (
  `key` varchar(255) COLLATE utf8mb4_bin NOT NULL,
  UNIQUE KEY `key` (`key`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin |
+-------+-----------------------------------------------------
1 row in set (0.00 sec)


Recommended answer

MySQL wants to convert strings when doing INSERT and SELECT. The conversion is between the character set you declare the client to be using and the character set the column is declared to store.

The only way to avoid that is to use VARBINARY and BLOB instead of VARCHAR and TEXT.

Using COLLATION utf8mb4_bin does not avoid conversion to/from CHARACTER SET utf8mb4; it merely says that WHERE and ORDER BY should compare the bits instead of dealing with accents and case folding.

Keep in mind that CHARACTER SET utf8mb4 is a way to encode text; COLLATION utf8mb4_* is a set of rules for comparing text in that encoding. _bin is the simple-minded one.

UNIQUE involves comparing for equality, and hence the COLLATION. In most utf8mb4 collations, the 3 values without the trailing space ('one', 'One', 'öne') will compare equal. utf8mb4_bin will treat the 3 as different. utf8mb4_hungarian_ci treats one=One>öne.
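To make the difference between a binary comparison and a case- and accent-insensitive one concrete, here is a rough Python sketch. It is only an approximation for illustration, not MySQL's actual collation algorithm: binary_key and folded_key are hypothetical helpers, and folded_key simply strips combining marks and case-folds.

```python
import unicodedata

def binary_key(s):
    # Model of a _bin style comparison: compare the encoded bytes as-is.
    return s.encode("utf-8")

def folded_key(s):
    # Rough model of an accent- and case-insensitive comparison:
    # decompose, drop combining marks (accents), then case-fold.
    decomposed = unicodedata.normalize("NFD", s)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()

words = ["one", "One", "öne"]
distinct_binary = len({binary_key(w) for w in words})
distinct_folded = len({folded_key(w) for w in words})
print(distinct_binary, distinct_folded)
```

Under the binary model the three words stay distinct, while under the folded model they collapse to a single key, which is why a UNIQUE index would reject two of them under such a collation.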

The trailing spaces are controlled by the datatype of the column (VARCHAR or other). The latest version even has a setting for whether trailing spaces are considered: in MySQL 8.0 each collation has a pad attribute, either PAD SPACE (trailing spaces are ignored in comparisons) or NO PAD (trailing spaces are significant).
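The effect of trailing-space handling on the second INSERT can be modelled in a few lines of Python. This is a sketch under the assumption that equality with space padding amounts to ignoring trailing spaces; equal_pad_space and equal_no_pad are hypothetical names, and the code models the behaviour rather than calling MySQL.

```python
def equal_pad_space(a, b):
    # Model of a space-padded comparison: trailing spaces are not significant.
    return a.rstrip(" ").encode("utf-8") == b.rstrip(" ").encode("utf-8")

def equal_no_pad(a, b):
    # Model of a comparison where every byte, including trailing spaces, counts.
    return a.encode("utf-8") == b.encode("utf-8")

clash_with_padding = equal_pad_space("one", "one ")
clash_without_padding = equal_no_pad("one", "one ")
print(clash_with_padding, clash_without_padding)
```

In this model 'one' and 'one ' collide only when trailing spaces are ignored, which matches the failure on the second INSERT, while 'One' stays distinct from 'one' in both cases because the comparison is still byte-wise.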
