LibXML internal and output encodings


Question

I'm trying to write XML files with libxml2 in ISO-8859-1. From the documentation, it seems that for each text node I create, I'll have to convert the text to UTF-8, which is libxml's internal encoding. Then, when I call xmlSaveFormatFileEnc(), libxml converts to the target encoding and adds the encoding attribute to the document.

Is this assumption correct? For now my code goes roughly like this:


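/* Needs <stdlib.h>, <string.h>, <libxml/tree.h>, <libxml/encoding.h>. */
xmlNode *root_element = NULL, *node4 = NULL;
xmlDoc *doc = NULL;

doc = xmlNewDoc(BAD_CAST XML_DEFAULT_VERSION);
root_element = xmlNewDocNode(doc, NULL, BAD_CAST "root", NULL);
xmlDocSetRootElement(doc, root_element);

/* Latin-1 input; each Latin-1 byte becomes at most two UTF-8 bytes. */
char *input_str = getLatin1Data();
int inlen = (int) strlen(input_str);
int outlen = 2 * inlen;
unsigned char *utf8_str = malloc(outlen + 1);
isolat1ToUTF8(utf8_str, &outlen, (const unsigned char *) input_str, &inlen);
utf8_str[outlen] = '\0';

/* Pass the converted UTF-8 buffer and its length to the node. */
node4 = xmlNewCDataBlock(doc, utf8_str, outlen);
xmlAddChild(root_element, node4);

/* The third argument selects the output encoding, e.g. "ISO-8859-1". */
xmlSaveFormatFileEnc("test_file.xml", doc, "UTF-8", 1);
xmlFreeDoc(doc);
free(utf8_str);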

Recommended answer

Your assumption is right. Wherever an xmlChar string is expected, as in xmlNewCDataBlock or xmlNewText, it is always UTF-8:

From include/libxml/xmlstring.h (libxml 2.8.0):

/**
 * xmlChar:
 *
 * This is a basic byte in an UTF-8 encoded string.
 * It's unsigned allowing to pinpoint case where char * are assigned
 * to xmlChar * (possibly making serialization back impossible).
 */
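
Because xmlChar strings are always UTF-8, Latin-1 input has to be converted before it is handed to the tree API. As a rough illustration of what that conversion involves, here is a hand-written sketch. It is not libxml2's own code, and `latin1_to_utf8` is a hypothetical name used only for this example. It assumes the caller provides an output buffer of at least 2 * inlen + 1 bytes.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of libxml2): converts ISO-8859-1 bytes
 * to UTF-8. `out` must have room for 2 * inlen + 1 bytes, because every
 * Latin-1 byte becomes at most two UTF-8 bytes. Returns the number of
 * bytes written, not counting the terminating NUL.
 */
size_t latin1_to_utf8(const unsigned char *in, size_t inlen,
                      unsigned char *out)
{
    size_t o = 0;
    for (size_t i = 0; i < inlen; i++) {
        unsigned char c = in[i];
        if (c < 0x80) {
            /* ASCII bytes are identical in both encodings. */
            out[o++] = c;
        } else {
            /* U+0080..U+00FF: lead byte 110000xx, then 10xxxxxx. */
            out[o++] = (unsigned char) (0xC0 | (c >> 6));
            out[o++] = (unsigned char) (0x80 | (c & 0x3F));
        }
    }
    out[o] = '\0';
    return o;
}
```

For instance, the four Latin-1 bytes of "café" (63 61 66 E9) become the five UTF-8 bytes 63 61 66 C3 A9. In real code, isolat1ToUTF8() from <libxml/encoding.h> does this job, so a helper like this should not be needed.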
