Parsing XML with Unicode characters in ColdFusion
I'm connecting to an external API using cfhttp, with the returned data in XML format. I have no control over the API or the format it's returned in.
When the data is returned, I loop through it and do cfquery inserts into my own MySQL database, which has a UTF8 charset.
However, some of the data appears to contain Unicode characters (it looks like it should be the £ (pound) sign, but when I cfdump the XMLParsed data, it shows as a diamond with a ? inside). I've attached a cropped screenshot showing the relevant part of the cfdump.
The problem is the cfquery insert. When it gets to those characters, it returns this error:
Error Executing Database Query.
Incorrect string value: '\xEF\xBF\xBD10 ...' for column 'voucherTitle' at row 1
I've tried setting the charset in the cfhttp call, but get the same result.
Is there any way I can either encode/decode these, or alternatively trim them out altogether (the data gets edited further down the line anyway, so manually adding the correct symbols isn't a huge issue).
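To make the error message easier to read: the bytes \xEF\xBF\xBD are the UTF-8 encoding of U+FFFD, the Unicode replacement character (the "diamond with a ?"). The sketch below is in Python rather than CFML, purely as an illustration. It assumes, without confirmation from the API, that the pound sign arrives as the single Latin-1 byte 0xA3 and is then decoded as UTF-8. The sample string and the helper strip_replacement are hypothetical.

```python
# The three bytes quoted in the MySQL error message.
error_bytes = b"\xEF\xBF\xBD"

# Decoded as UTF-8 they are U+FFFD, the Unicode replacement character.
replacement = error_bytes.decode("utf-8")

# Assumption: the API sends the pound sign as the Latin-1 byte 0xA3.
latin1_payload = "£10 off".encode("latin-1")

# 0xA3 is not valid UTF-8 on its own, so a lenient UTF-8 decoder
# substitutes U+FFFD for it. The original symbol is lost at this point.
misdecoded = latin1_payload.decode("utf-8", errors="replace")

# Decoding with the charset that was actually used keeps the symbol.
recovered = latin1_payload.decode("latin-1")


def strip_replacement(text: str) -> str:
    """Remove U+FFFD characters left behind by a wrong decode."""
    return text.replace("\ufffd", "")


print(repr(misdecoded))
print(repr(recovered))
print(repr(strip_replacement(misdecoded)))
```

If this assumption holds, the symbol is already lost by the time the XML is parsed, so the options are to decode the response with its real charset or to strip the U+FFFD characters before the insert.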
UPDATE: As of MySQL 5.5.3, there is also utf8mb4, which is often recommended over utf8.
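As a rough illustration of the difference: MySQL's utf8 character set stores at most three bytes per character, while utf8mb4 allows four. The Python sketch below checks how many UTF-8 bytes a character needs. The helper names utf8_len and fits_in_mysql_utf8 are made up for this example. Note that U+FFFD itself needs only three bytes, so a column that really is utf8 should accept it, which is one reason to double check the column's actual character set.

```python
def utf8_len(ch: str) -> int:
    """Number of bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def fits_in_mysql_utf8(text: str) -> bool:
    """True if every character needs at most 3 UTF-8 bytes,
    the per-character limit of MySQL's utf8 character set."""
    return all(utf8_len(ch) <= 3 for ch in text)


print(utf8_len("£"))           # pound sign
print(utf8_len("\ufffd"))      # replacement character
print(utf8_len("\U0001F600"))  # an emoji outside the Basic Multilingual Plane
```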
(From the comments)
I recall something similar on another thread. Double check the collation and character set for that column using the INFORMATION_SCHEMA.COLUMNS view:
SELECT *
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'YourTableName'
If it is not UTF-8, you can change it using the ALTER TABLE command. Modify the column size M as needed.
ALTER TABLE YourTableName
MODIFY YourColumnName VARCHAR(M)
CHARACTER SET utf8;
NB: If the data is important, always make a backup of the table before applying any modifications.
See also: 11.1.15 Character Sets and Collations Supported by MySQL