utf8 not showing hyphens correctly in echoed text

Question

My MySQL database is set to utf8_unicode_ci and I have $pdo->exec('SET NAMES "utf8"') as part of the following PHP code, yet when I echo text from the query, a hyphen (-) looks like this: â€“. What am I doing wrong? Why is the hyphen not displaying correctly?

<?php    
    try {
        $pdo = new PDO('mysql:host=localhost;dbname=danville_tpf', 'danville_dan', 'password');
        $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
        $pdo->exec('SET NAMES "utf8"');
    } catch (PDOException $e) {
        $output = 'Unable to connect to the database server.';
        include 'output.html.php';
        exit();
    }

    $output = 'Theme Park Database initialized';
    //include 'output.html.php';//

    try {
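        $park_id = $_GET['park_id'];
        // Bind park_id as a parameter rather than interpolating it into the SQL string.
        $result = $pdo->prepare('SELECT * FROM tpf_parks WHERE park_id = :park_id');
        $result->execute(array('park_id' => $park_id));
    } catch (PDOException $e) {
        $output = 'Unable to fetch the park from the database.';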
        //include 'output.html.php';//
    }

    $output = 'Successfully pulled park';
    //include 'output.html.php';//

    foreach ($result as $row) {
        $parkdetails[] = array(
            'name' => $row['name'],
            'blurb' => $row['blurb'],
            'website' => $row['website'],
            'address' => $row['address'],
            'logo' => $row['logo']
        );    
    }
?>

Please help.

Recommended answer

â€“ is common mojibake for an en dash (–), which is a different character from a hyphen.

It is the result of taking the UTF-8 encoded form of the dash (0xe2 0x80 0x93) and incorrectly assuming that it is encoded using Windows-1252.

Interpreted as Windows-1252, the three bytes 0xe2, 0x80 and 0x93 represent â, € and “ respectively.
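
The misreading can be reproduced in a few lines of PHP. This is a minimal sketch, assuming the iconv extension is available; the variable names are illustrative and do not come from the original post.

```php
<?php
// UTF-8 bytes of an en dash (U+2013).
$enDash = "\xE2\x80\x93";

// Treat those bytes as Windows-1252 text and re-encode the result as UTF-8.
// This is the mistake that produces the mojibake.
$mojibake = iconv('Windows-1252', 'UTF-8', $enDash);

echo $mojibake, PHP_EOL;                      // the three characters â, € and “
echo strtoupper(bin2hex($mojibake)), PHP_EOL; // C3A2E282ACE2809C
```

The second line of output is the same byte sequence that indicates double encoding when it shows up in the database.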

Assuming the offending character is in the blurb field, querying SELECT HEX(blurb) FROM tpf_parks (with a suitable WHERE clause) will show the hex encoding of the offending bytes.

If you see E28093 in there, then the database value is correctly encoded as UTF-8 and the problem is a character encoding mismatch somewhere in your client or server configuration.
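
Two common places for such a mismatch are the charset declared for the HTTP response and the charset of the database connection. The sketch below shows both being set to UTF-8. It assumes the page is served as HTML with no charset declared elsewhere, and nothing in the original post confirms that either setting is the actual cause.

```php
<?php
// Assumption: no other code or server setting declares the response charset.
// Declaring UTF-8 keeps the browser from falling back to a legacy encoding.
header('Content-Type: text/html; charset=utf-8');

// Setting the charset in the DSN (supported since PHP 5.3.6) is an
// alternative to running SET NAMES after connecting.
$pdo = new PDO(
    'mysql:host=localhost;dbname=danville_tpf;charset=utf8',
    'danville_dan',
    'password'
);
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
```

The header() call has to run before any output is sent, so it belongs ahead of the included template.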

If, however, you see C3A2E282ACE2809C, then the character has already been stored incorrectly in the database: the original bytes were misinterpreted as Windows-1252, and the three resulting characters were then saved in their own UTF-8 representation. In that case you will need to update the data to fix the issue. You could do this using iconv:

$fixedData = iconv("utf-8", "windows-1252", $badData);

This will convert the doubly-converted bytes back to the original UTF-8 encoding.
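
Because the same iconv call would corrupt values that are already correct, it can help to apply it only when the mojibake signature is present. The following is a sketch under that assumption; repairDoubleEncoded is a hypothetical helper name, and it only looks for the en dash case discussed above.

```php
<?php
/**
 * Hypothetical helper: undo one round of Windows-1252 double encoding,
 * but only when the en dash mojibake byte sequence is present.
 */
function repairDoubleEncoded(string $text): string
{
    // UTF-8 bytes of the three mojibake characters (C3A2 E282AC E2809C).
    $signature = "\xC3\xA2\xE2\x82\xAC\xE2\x80\x9C";
    if (strpos($text, $signature) === false) {
        return $text; // nothing to repair
    }

    // Same conversion as in the answer above. iconv returns false when a
    // character has no Windows-1252 equivalent, so keep the input in that case.
    $fixed = @iconv('UTF-8', 'Windows-1252', $text);
    return $fixed === false ? $text : $fixed;
}

$bad  = "Open May \xC3\xA2\xE2\x82\xAC\xE2\x80\x9C September";
$good = repairDoubleEncoded($bad);

echo strtoupper(bin2hex($good)), PHP_EOL; // contains E28093 where the dash is
```

Running the repair on a copy of the data first, and comparing HEX() output before and after, is a reasonable precaution.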
