Encoding issues with XMLWriter (PHP)
Question
Take this simple PHP code:
$xmlWriter = new XMLWriter();
$xmlWriter->openURI('php://output');
$xmlWriter->startDocument('1.0', 'utf-8');
$xmlWriter->writeElement('test', $data);
$xmlWriter->endDocument();
$xmlWriter->flush();
The XMLWriter class has a nice feature: it will convert any data you give to it to the output encoding. For example, here it will convert $data to UTF-8 because I passed 'utf-8' in the startDocument function.
The problem is that in my case the content of $data comes from a database whose output format is UTF-8, so it is already in UTF-8. The XMLWriter probably thinks the data is in ISO-8859-1 and converts it again to UTF-8, and I get weird symbols where I should get accents.
Currently I'm using utf8_decode around each string coming from the database, which means I'm converting from UTF-8 to ISO-8859-1, and then XMLWriter turns it back into UTF-8.
This works, but it isn't clean:
$xmlWriter->writeElement('test', utf8_decode($data));
Is there a cleaner solution?
Edit: showing a complete example
$xmlWriter = new XMLWriter();
$xmlWriter->openURI('php://output');
$xmlWriter->startDocument('1.0', 'utf-8');
$xmlWriter->startElement('usersList');
$database = new PDO('mysql:host=localhost;dbname=xxxxx', 'xxxxx', 'xxxxx');
$database->exec('SET CHARACTER SET UTF8');
$database->exec('SET NAMES UTF8');
foreach ($database->query('SELECT name FROM usersList') as $user) {
    // If the user's name is 'hervé' in the database, this prints 'hervÃ©' instead.
    $xmlWriter->writeElement('user', $user[0]);
}
$xmlWriter->endElement();
$xmlWriter->endDocument();
$xmlWriter->flush();
Recommended answer
I'm not sure where you got the idea that XMLWriter converts encodings. It doesn't. You must supply it with UTF-8. It can output different encodings, but input strings must be UTF-8.
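One way to check this is to hand XMLWriter a string that is already UTF-8 and read the result back. The following is only a minimal sketch: it assumes the SimpleXML extension is available, buffers the document in memory instead of writing to php://output, and uses a hard-coded name in place of the database row from the question.

```php
<?php
// Assumption: the name is hard-coded; in the question it comes from PDO.
$name = "herv\xC3\xA9"; // "hervé" as UTF-8 bytes (é is C3 A9)

$writer = new XMLWriter();
$writer->openMemory();                  // buffer instead of php://output
$writer->startDocument('1.0', 'UTF-8');
$writer->writeElement('user', $name);   // passed as-is, no utf8_decode()
$writer->endDocument();
$xml = $writer->outputMemory();

// Parse the document back and compare the text with the original bytes.
$parsed = (string) simplexml_load_string($xml);
var_dump($parsed === $name);
```

If the comparison holds, the UTF-8 input went through unchanged, and any garbling seen later comes from how the output is displayed or from how the data was stored.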
One of two things may be going on here:
- Whatever you are using to view your output document is interpreting the bytes as Windows-1252. If you are viewing the output in a browser, you may need to set the Content-Type header like so:
header('Content-Type: application/xml; charset=UTF-8');
- You stored your data in the database incorrectly, and your "é" is actually the two Unicode characters "Ã©". Fixing this is difficult.
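To tell the two cases apart, it can help to look at the raw bytes of a value coming from the database rather than at how it is rendered. The sketch below uses a hypothetical helper, classifyEAcute(), which is not part of PHP or of the original code; it only looks for the letter "é" and is meant as a diagnostic aid, not a general encoding detector.

```php
<?php
// Hypothetical helper: reports how the letter "é" is encoded in $bytes.
function classifyEAcute(string $bytes): string
{
    // "Ã©" stored as UTF-8: the result of encoding UTF-8 "é" a second time.
    if (strpos($bytes, "\xC3\x83\xC2\xA9") !== false) {
        return 'double-encoded';
    }
    // Correct UTF-8 encoding of "é".
    if (strpos($bytes, "\xC3\xA9") !== false) {
        return 'utf-8';
    }
    // A lone E9 byte: ISO-8859-1 or Windows-1252.
    if (strpos($bytes, "\xE9") !== false) {
        return 'single-byte';
    }
    return 'not found';
}

// Assumption: this sample stands in for $user[0] from the query in the question.
$sample = "herv\xC3\xA9";
echo bin2hex($sample), ' => ', classifyEAcute($sample), PHP_EOL;
```

A result of 'utf-8' points to the first case (the viewer is misreading correct output), while 'double-encoded' points to the second (the stored data itself is wrong).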