Is "SET CHARACTER SET utf8" necessary?


Question

I'm rewriting our database class (PDO based) and got stuck at this. I've been taught to use both SET NAMES utf8 and SET CHARACTER SET utf8 when working with UTF-8 in PHP and MySQL.

In PDO I now want to use the PDO::MYSQL_ATTR_INIT_COMMAND parameter, but it only supports one query.

Is SET CHARACTER SET utf8 necessary?

Recommended answer

Using SET CHARACTER SET utf8 after using SET NAMES utf8 will actually reset character_set_connection and collation_connection to @@character_set_database and @@collation_database respectively.

The manual states that

  • SET NAMES x is equivalent to

    SET character_set_client = x;
    SET character_set_results = x;
    SET character_set_connection = x;

  • SET CHARACTER SET x is equivalent to

    SET character_set_client = x;
    SET character_set_results = x;
    SET collation_connection = @@collation_database;
    

  • In addition, SET collation_connection = x internally also executes SET character_set_connection = <<character_set_of_collation_x>>, and SET character_set_connection = x internally also executes SET collation_connection = <<default_collation_of_character_set_x>>.
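The equivalences listed above can be checked with a small toy model. This is a minimal sketch in Python, not MySQL code: the session dictionary and the helper functions new_session, set_names and set_character_set are hypothetical names made up for illustration, and the default collations are assumed to be those of MySQL 5.x.

```python
# Toy model of the session variables; it only mirrors the equivalences quoted
# from the manual. All names here are hypothetical, nothing talks to MySQL.
# Assumption: default collations as in MySQL 5.x.
DEFAULT_COLLATION = {"utf8": "utf8_general_ci", "latin1": "latin1_swedish_ci"}


def new_session(database_charset):
    """Return a session whose default database uses database_charset."""
    return {
        "character_set_database": database_charset,
        "collation_database": DEFAULT_COLLATION[database_charset],
    }


def set_names(session, charset):
    """Model of SET NAMES charset."""
    session["character_set_client"] = charset
    session["character_set_results"] = charset
    # Setting character_set_connection also sets its default collation.
    session["character_set_connection"] = charset
    session["collation_connection"] = DEFAULT_COLLATION[charset]


def set_character_set(session, charset):
    """Model of SET CHARACTER SET charset."""
    session["character_set_client"] = charset
    session["character_set_results"] = charset
    # Setting collation_connection also sets the matching character set,
    # which here is the character set of the default database.
    session["collation_connection"] = session["collation_database"]
    session["character_set_connection"] = session["character_set_database"]


session = new_session("latin1")
set_names(session, "utf8")
after_set_names = session["character_set_connection"]
set_character_set(session, "utf8")
after_set_character_set = session["character_set_connection"]
print(after_set_names, after_set_character_set)  # prints: utf8 latin1
```

In this model the second statement undoes the connection character set chosen by the first one whenever the default database is not utf8.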

So essentially you're resetting character_set_connection to @@character_set_database and collation_connection to @@collation_database. The manual explains the usage of these variables:

    What character set should the server translate a statement to after receiving it?

    For this, the server uses the character_set_connection and collation_connection system variables. It converts statements sent by the client from character_set_client to character_set_connection (except for string literals that have an introducer such as _latin1 or _utf8). collation_connection is important for comparisons of literal strings. For comparisons of strings with column values, collation_connection does not matter because columns have their own collation, which has a higher collation precedence.

To sum this up, the encoding/transcoding procedure MySQL uses to process the query and its results is a multi-step process:

    1. MySQL treats the incoming query as being encoded in character_set_client.
    2. MySQL transcodes the statement from character_set_client into character_set_connection.
    3. When comparing string values to column values, MySQL transcodes the string value from character_set_connection into the character set of the given database column and uses the column collation for sorting and comparison.
    4. MySQL builds up the result set encoded in character_set_results (this includes result data as well as result metadata such as column names).
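The four steps can be traced for a single string literal. This is a rough sketch under stated assumptions: Python's codecs stand in for MySQL's character sets (Python's latin-1 codec only approximates MySQL's latin1), transcode and trace_literal are hypothetical helpers, and the literal is treated as if it were converted to a column's character set and then sent back to the client.

```python
# Toy trace of one string literal through the four steps.
# Hypothetical helpers; Python codecs stand in for MySQL character sets.
def transcode(data, source, target):
    """Re-encode bytes; characters missing from target become '?'."""
    return data.decode(source).encode(target, errors="replace")


def trace_literal(raw, client, connection, column, results):
    """Return the literal's bytes after steps 1, 2, 3 and 4."""
    step1 = raw  # step 1: the bytes are taken as encoded in `client`
    step2 = transcode(step1, client, connection)  # step 2
    step3 = transcode(step2, connection, column)  # step 3
    step4 = transcode(step3, column, results)     # step 4
    return [step1, step2, step3, step4]


# UTF-8 client and connection, a latin1 column, UTF-8 results.
stages = trace_literal("Zoë".encode("utf-8"), "utf-8", "utf-8", "latin-1", "utf-8")
for number, data in enumerate(stages, start=1):
    print(number, data)
# 1 b'Zo\xc3\xab'
# 2 b'Zo\xc3\xab'
# 3 b'Zo\xeb'
# 4 b'Zo\xc3\xab'
```

Here every character exists in every character set involved, so the bytes change between steps but no information is lost.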

So it could be the case that SET CHARACTER SET utf8 alone is not sufficient to provide full UTF-8 support. Think of a default database character set of latin1 and columns defined with the utf8 character set, and go through the steps described above. Because SET CHARACTER SET utf8 sets character_set_connection to the database character set (latin1 here), and latin1 cannot cover all the characters that UTF-8 can, you may lose character information before the comparison in step 3 ever happens.

    • Steps 2 and 3: Given that your query is encoded in UTF-8 and contains characters that cannot be represented in latin1, these characters are lost when the statement is transcoded from utf8 (character_set_client) to latin1 (character_set_connection, taken from the default database character set). The comparison against the utf8 column in step 3 then works on damaged data, making your query fail.
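The loss can be reproduced outside MySQL. The following is a minimal sketch that uses Python's codecs as a stand-in for the server's conversion; the users table and the name in the query are made up for illustration, and unconvertible characters are replaced with '?' as MySQL does.

```python
# A UTF-8 encoded query, as the client would send it (hypothetical table).
query = "SELECT * FROM users WHERE name = 'Łukasz 日本'"
sent = query.encode("utf-8")

# The statement is read as UTF-8 and converted to a latin1 connection
# character set; characters latin1 cannot represent become '?'.
received = sent.decode("utf-8").encode("latin-1", errors="replace")
print(received.decode("latin-1"))
# prints: SELECT * FROM users WHERE name = '?ukasz ??'
```

No later conversion back to utf8 can recover the replaced characters, so the comparison no longer matches the stored value.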

So I think it's safe to say that SET NAMES ... is the correct way to handle character set issues. I would add, though, that setting up your MySQL server variables correctly (all the required variables can be set statically in your my.cnf) frees you from the performance overhead of the extra query required on every connect.
