MySQL database with unique fields ignores trailing spaces

Question

My project requires storing user input exactly as typed, including any spaces to the left and right of a word such as 'apple'. If the user types ' apple' or 'apple ', with one space or several on either side of the word, I need to store it that way.

This field has the UNIQUE attribute. When I insert the word with spaces on the left, it works fine. But when I insert the word with spaces on the right, all the spacing on the right of the word is trimmed off.

So I am thinking of adding a special character to the right of the word, after the spaces. But I am hoping there is a better solution to this issue.

CREATE TABLE strings (
  id bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  string varchar(255) COLLATE utf8_bin NOT NULL,
  created_ts timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (id),
  UNIQUE KEY string (string)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8 COLLATE=utf8_bin;

Recommended answer

The problem is that MySQL ignores trailing whitespace when comparing strings. See http://dev.mysql.com/doc/refman/5.7/en/char.html:

All MySQL collations are of type PAD SPACE. This means that all CHAR, VARCHAR, and TEXT values in MySQL are compared without regard to any trailing spaces.

...

For those cases where trailing pad characters are stripped or comparisons ignore them, if a column has an index that requires unique values, inserting into the column values that differ only in number of trailing pad characters will result in a duplicate-key error. For example, if a table contains 'a', an attempt to store 'a ' causes a duplicate-key error.

(This information applies to MySQL 5.7; the behavior changed in 8.0, see below.)

The documentation section for the LIKE operator gives an example of this behavior (and shows that LIKE does respect trailing whitespace):

mysql> SELECT 'a' = 'a ', 'a' LIKE 'a ';
+------------+---------------+
| 'a' = 'a ' | 'a' LIKE 'a ' |
+------------+---------------+
|          1 |             0 |
+------------+---------------+
1 row in set (0.00 sec)

Unfortunately, the UNIQUE index seems to use the standard string comparison to check whether such a value already exists, and thus ignores trailing whitespace. This is independent of whether the column is VARCHAR or CHAR: in both cases the insert is rejected because the unique check fails. If there is a way to use LIKE semantics for the UNIQUE check, I do not know of it.
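To make the failure concrete, here is a minimal sketch against the `strings` table from the question. It assumes a PAD SPACE collation such as the `utf8_bin` used there; the sample values are only illustrative.

```sql
-- Assumes the `strings` table from the question (collation utf8_bin, PAD SPACE).
INSERT INTO strings (`string`) VALUES ('apple');

-- Leading spaces produce a distinct value, so this insert is accepted.
INSERT INTO strings (`string`) VALUES ('  apple');

-- Differs from 'apple' only in trailing spaces, so the unique check treats it
-- as the same key; per the documentation quoted above, a duplicate-key error
-- is expected here.
INSERT INTO strings (`string`) VALUES ('apple  ');

-- The same comparison rule applies to lookups: this should also match 'apple'.
SELECT id, CONCAT('(', `string`, ')') AS shown
FROM strings
WHERE `string` = 'apple  ';
```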

What you could do is store the value as VARBINARY:

mysql> create table test_ws ( `value` varbinary(255) UNIQUE );
Query OK, 0 rows affected (0.13 sec)

mysql> insert into test_ws (`value`) VALUES ('a');
Query OK, 1 row affected (0.08 sec)

mysql> insert into test_ws (`value`) VALUES ('a ');
Query OK, 1 row affected (0.06 sec)

mysql> SELECT CONCAT( '(', value, ')' ) FROM test_ws;
+---------------------------+
| CONCAT( '(', value, ')' ) |
+---------------------------+
| (a)                       |
| (a )                      |
+---------------------------+
2 rows in set (0.00 sec)

You probably do not want to sort alphabetically on this column, because sorting will happen on the byte values instead, and that will not be what most users expect.
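If an alphabetical listing is still needed, one possible approach is to convert the bytes back to text at query time and sort with an explicit collation. The sketch below assumes the `test_ws` table above and that the stored bytes are valid utf8mb4 text; the choice of `utf8mb4_unicode_ci` is only an example.

```sql
-- Assumes test_ws.value holds utf8mb4-encoded text stored as VARBINARY.
-- CONVERT(... USING ...) reinterprets the bytes as a character string so that
-- a text collation can be applied for ordering.
SELECT CONCAT('(', CONVERT(`value` USING utf8mb4), ')') AS shown
FROM test_ws
ORDER BY CONVERT(`value` USING utf8mb4) COLLATE utf8mb4_unicode_ci;
```

Because the sort key is computed per row, the unique index on `value` cannot be used for this ordering, which may matter on large tables. Values that differ only in trailing spaces compare as equal under a PAD SPACE collation, so their relative order in the result is not guaranteed.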

The alternative is to patch MySQL and write your own collation of type NO PAD. Not sure if anyone wants to do that, but if you do, let me know ;)

Meanwhile, according to https://dev.mysql.com/doc/refman/8.0/en/char.html:

Most MySQL collations have a pad attribute of PAD SPACE. The exceptions are Unicode collations based on UCA 9.0.0 and higher, which have a pad attribute of NO PAD.

https://dev.mysql.com/doc/refman/8.0/en/charset-unicode-sets.html

Unicode collations based on UCA versions later than 4.0.0 include the version in the collation name. Thus, utf8mb4_unicode_520_ci is based on UCA 5.2.0 weight keys, whereas utf8mb4_0900_ai_ci is based on UCA 9.0.0 weight keys.

So if you try:

  create table test_ws ( `value` varchar(255) UNIQUE )
    character set utf8mb4 collate utf8mb4_0900_ai_ci;

you can insert values with and without trailing whitespace.

You can find all available NO PAD collations with:

 show collation where Pad_attribute='NO PAD';
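Applied to the table from the question, this could look like the sketch below. It assumes MySQL 8.0.17 or later, where, to my knowledge, the binary NO PAD collation `utf8mb4_0900_bin` is available; it is chosen here because the original column used a binary collation (`utf8_bin`), whereas `utf8mb4_0900_ai_ci` would also treat 'apple' and 'Apple' as duplicates. Test on a copy of the data first.

```sql
-- Assumption: MySQL 8.0.17+ and the `strings` table from the question.
-- Switch the unique column to a NO PAD collation so trailing spaces count.
ALTER TABLE strings
  MODIFY `string` varchar(255)
    CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_bin NOT NULL;

-- Confirm the pad attribute of the chosen collation.
SELECT COLLATION_NAME, PAD_ATTRIBUTE
FROM INFORMATION_SCHEMA.COLLATIONS
WHERE COLLATION_NAME = 'utf8mb4_0900_bin';
```

After this change, 'apple' and 'apple ' should be stored as two distinct rows, and the column still sorts and compares as text rather than raw bytes.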
