在MySQL 5.6中将整数与显式排序规则进行比较时如何避免无效错误 [英] How to avoid invalid errors when comparing integers with explicit collation in MySQL 5.6

查看：49 发布时间：2021/4/21 20:27:21 mysql character-encoding utf8mb4

本文介绍了在MySQL 5.6中将整数与显式排序规则进行比较时如何避免无效错误的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

好吧，这不是有史以来最清晰的标题；随时改善.

Okay, not the clearest title ever; feel free to improve.

我有一张表格，代表数千种语言形式.其中许多都大量使用变音符号，因此 aha ， áha̱ 和 ā̧́ḫà̀ 都可能出现.该表(和数据库)使用UTF-8作为字符集，并使用 utf8mb4_unicode_520_ci 作为默认归类方案，因为搜索应区分大小写和变音符(因此搜索 aha 应该调出所有三个).这些表格都是由人工手动输入的，因此不可避免地会有重复.

I have a table representing thousands of linguistic forms. Many of these make heavy use of diacritics, so all of aha, áha̱ and ā̧́ḫà̀ may appear. The table (and the database) uses UTF-8 as character set and utf8mb4_unicode_520_ci as the default collation scheme, since searching should be case- and diacritic-agnostic (so searching for aha should bring up all three). These forms have all been entered in manually by human beings, though, so there are inevitably duplicates.

我目前正在尝试获取完全相同形式的列表，以消除重复(手动-每个令牌都必须在删除之前进行检查)，但是在这种情况下，我需要在变音符号中进行搜索-感知方式–也就是说，鉴于上面列出的三个标记，由于变音符号，它们是三种不同的形式，因此我希望搜索不会产生任何结果.

I’m currently trying to get a list of exactly identical forms in order to get rid of duplicates (manually – each token would have to be checked before being removed), but in this case I need to search in a diacritic-aware manner – that is, given the three tokens listed above, I would expect a search to yield no results, since they are three different forms because of the diacritics.

我认为这应该是一个相当容易的任务；只是做:

I figured this should be a fairly easy task; just do:

SELECT token FROM table GROUP BY token HAVING COUNT(token) > 1 COLLATE utf8mb4_bin

但是a，那行不通.相反，它给我一个错误消息，即"COLLATION utf8mb4_bin对字符集latin1无效".我应该注意到，我在任何地方都没有拉丁文-1，没有字符集，没有排序规则，没有服务器字符集，也没有.也没有存储过程或其他任何可能会出现Latin-1的内容.

But alas, that does not work. Instead, it gives me an error message that "COLLATION utf8mb4_bin is not valid for CHARACTER SET latin1". I should note that I have absolutely nothing Latin-1 anywhere – no character sets, no collations, no server charsets, nothing. There are also no stored procedures or anything else where Latin-1 might creep in.

不，这是因为此错误，该错误已修复从5.7开始；请参阅底部的说明:

No, this is because of this bug, which is apparently fixed from 5.7 onwards; see the description at the bottom:

对于诸如ORDER BY numeric_expr COLLATE collation_name的构造，该表达式的字符集被视为latin1，如果COLLATE之后指定的排序规则与latin1不兼容，则会导致错误.现在，当在存在COLLATE的情况下将数字表达式隐式转换为字符表达式时，使用的字符集就是与命名排序规则关联的字符集.

For constructs such as ORDER BY numeric_expr COLLATE collation_name, the character set of the expression was treated as latin1, which resulted in an error if the collation specified after COLLATE is incompatible with latin1. Now when a numeric expression is implicitly cast to a character expression in the presence of COLLATE, the character set used is the one associated with the named collation.

很遗憾，我的系统是5.6，并且无法进行升级(烦人).将数据转换为Latin-1也是一种选择，也不能更改表上的排序规则.

Unfortunately, I’m on 5.6, and I don’t have the option of upgrading (annoyingly). Converting the data to Latin-1 is also not an option, nor is changing the collation on the table.

有没有一种方法可以运行我的查询或等效的查询，得到我想要的结果集，而不会出现排序规则错误?

Is there a way to run my query or an equivalent one yielding the result set I’m after, without getting the collation error?

在MySQL 5.6中将整数与显式排序规则进行比较时如何避免无效错误 [英] How to avoid invalid errors when comparing integers with explicit collation in MySQL 5.6

问题描述

推荐答案

相关文章

数据库最新文章

热门教程

热门工具

登录关闭

在MySQL 5.6中将整数与显式排序规则进行比较时如何避免无效错误 [英] How to avoid invalid errors when comparing integers with explicit collation in MySQL 5.6

问题描述

推荐答案

相关文章

数据库最新文章

热门教程

热门工具

登录 关闭

登录关闭