utf8 not showing hyphens correctly in echoed text
Question
My MySQL database is set to utf8_unicode_ci and I have $pdo->exec('SET NAMES "utf8"') as part of the following PHP code, yet when I echo text from the query a hyphen (-) looks like this: â€“. What am I doing wrong? Why is the hyphen not displaying correctly?
<?php
try {
    $pdo = new PDO('mysql:host=localhost;dbname=danville_tpf', 'danville_dan', 'password');
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $pdo->exec('SET NAMES "utf8"');
} catch (PDOException $e) {
    $output = 'Unable to connect to the database server.';
    include 'output.html.php';
    exit();
}
$output = 'Theme Park Database initialized';
//include 'output.html.php';//
try {
    $park_id = $_GET['park_id'];
    $query = "SELECT * FROM tpf_parks WHERE park_id = $park_id";
    $result = $pdo->query($query);
} catch (PDOException $e) {
    $output = 'Unable to connect to the database server.';
    //include 'output.html.php';//
}
$output = 'Sucessfully pulled park';
//include 'output.html.php';//
foreach ($result as $row) {
    $parkdetails[] = array(
        'name' => $row['name'],
        'blurb' => $row['blurb'],
        'website' => $row['website'],
        'address' => $row['address'],
        'logo' => $row['logo']
    );
}
?>
Please help.
Recommended answer
â€“ is common mojibake for an en dash (–), which is a different character from a hyphen.
It is the result of taking the UTF-8 encoded form of the dash (0xe2 0x80 0x93) and incorrectly assuming that it is actually encoded using Windows-1252.
Interpreted as Windows-1252, the three bytes 0xe2, 0x80 and 0x93 represent â, € and “ respectively.
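The misreading can be reproduced on purpose. The following is a minimal sketch, assuming PHP's iconv extension is available; the variable names are illustrative only.

```php
<?php
// UTF-8 bytes of an en dash (U+2013).
$enDash = "\xE2\x80\x93";

// Misread those bytes as Windows-1252 and re-encode the result as UTF-8.
// This deliberately repeats the mistake described above.
$mojibake = iconv('WINDOWS-1252', 'UTF-8', $enDash);

echo strtoupper(bin2hex($enDash)), "\n";   // E28093
echo strtoupper(bin2hex($mojibake)), "\n"; // C3A2E282ACE2809C
```

The second hex string is what the three characters â, € and “ look like once they are themselves stored as UTF-8.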
Assuming the offending character is in the blurb field, if you query SELECT HEX(blurb) FROM tpf_parks (with a suitable WHERE clause), you will see the hex encoding of the offending bytes.
If you see E28093 in there, then the database value is correctly encoded as UTF-8 and the problem is a character encoding mismatch in your client or server configuration.
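If the stored bytes are fine, one thing worth checking is that both the database connection and the HTTP response declare UTF-8. The sketch below is only an illustration: buildUtf8Dsn and escapeUtf8 are hypothetical helper names, and utf8mb4 is used here in place of the question's utf8 as an assumption, not something the original setup requires.

```php
<?php
// Hypothetical helper: builds a PDO MySQL DSN that names the connection charset,
// which takes the place of a separate SET NAMES query.
function buildUtf8Dsn(string $host, string $dbname, string $charset = 'utf8mb4'): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $dbname, $charset);
}

// Hypothetical helper: escapes a value for output in a page served as UTF-8.
function escapeUtf8(string $text): string
{
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// Usage (not executed here because it needs a live database):
//   header('Content-Type: text/html; charset=utf-8');
//   $pdo = new PDO(buildUtf8Dsn('localhost', 'danville_tpf'), 'danville_dan', 'password');
//   echo escapeUtf8($row['blurb']);
```

Declaring the charset in the Content-Type header tells the browser not to fall back to Windows-1252 when it renders the page.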
If, however, you see C3A2E282ACE2809C, then the character has already been encoded incorrectly in the database: it was interpreted incorrectly, then saved as the UTF-8 representation of those 3 characters. If this is the case you'll need to update the data to fix the issue. You could do this using iconv:
$fixedData = iconv("utf-8", "windows-1252", $badData);
This converts the doubly encoded bytes back to the original, correct UTF-8 encoding.
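Because the same conversion would damage values that are already correct, it may be safer to guard it. The following is a minimal sketch, assuming the iconv and PCRE extensions; repairDoubleEncoded is a hypothetical helper name, not part of any library.

```php
<?php
// Hypothetical helper: undo one round of UTF-8 -> Windows-1252 -> UTF-8 double encoding.
function repairDoubleEncoded(string $text): string
{
    // Map each character back to its single Windows-1252 byte.
    // iconv() returns false if a character has no Windows-1252 equivalent.
    $candidate = @iconv('UTF-8', 'WINDOWS-1252', $text);
    if ($candidate === false) {
        return $text;
    }
    // Accept the result only if the recovered bytes are themselves valid UTF-8;
    // preg_match() with the /u modifier fails on malformed UTF-8.
    if (preg_match('//u', $candidate) !== 1) {
        return $text;
    }
    return $candidate;
}

$bad = "\xC3\xA2\xE2\x82\xAC\xE2\x80\x9C";      // double-encoded en dash
echo bin2hex(repairDoubleEncoded($bad)), "\n";   // e28093
$good = "\xE2\x80\x93";                          // already correct en dash
echo bin2hex(repairDoubleEncoded($good)), "\n";  // e28093 (left unchanged)
```

The validity check is a heuristic: text that legitimately contains a sequence such as Ã© would also be converted, so it is worth trying this on a backup copy of the data first.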