Read German text from XML and write to a PDF


Question

I have an XML file (encoded in UTF-8). I have to read a value from it into a std::string variable using the pugixml library. After reading the value I print it on the console, but in my actual project I have to write that value to a PDF (using the LibHaru library). My MWE is the following:

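#include <iostream>
#include <string>
#include "pugiconfig.hpp"
#include "pugixml.hpp"

using namespace pugi;

// FILEPATH is a placeholder for the path to the XML file; it is not defined in this snippet.
int main()
{
    xml_document doc;
    xml_parse_result result = doc.load_file(FILEPATH);
    if (!result)
    {
        // Report the parse failure instead of silently reading from an empty document.
        std::cerr << "Failed to load XML: " << result.description() << std::endl;
        return 1;
    }

    xml_node root_node = doc.child("Report");
    xml_node SystemName_node = root_node.child("SystemName");

    // child_value() returns the node's text; in pugixml's default (char) mode this is UTF-8.
    std::string strSystemName = SystemName_node.child_value();

    std::cout << "The name of the system is: " << strSystemName << std::endl;

    return 0;
}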

I am reading the value of the std::string variable strSystemName from an XML file using the pugixml library. After reading it, I print it on screen (in my actual project, I write it to a PDF file). Problem: while debugging, I found that strange characters are read from the XML file (which is already UTF-8 encoded). They show up whether I print the variable on screen or write it to the PDF.

IMPORTANT: Printing to the console is not that important. What matters is writing the value correctly to the PDF file, which also uses UTF-8 encoding. I suspect that storing the value in a std::string is somehow causing the problem, so that the wrong value is passed to the PDF writer.

PS: I am using VS2010, without C++11.

Recommended answer

The problem here is that std::cout just passes the UTF-8 bytes in the string through to the console unchanged. On Windows the console normally does not run in UTF-8 but in, for example, code page 1252, so the two bytes of a UTF-8 'ä' (0xC3 0xA4) are displayed as two characters ('Ã¤' under code page 1252).

Your solution is either to switch the console to UTF-8 (see this answer), or to convert your UTF-8 string into a CP-1252 string. I think the latter is going to require MultiByteToWideChar (specifying UTF-8) followed by WideCharToMultiByte (specifying CP-1252).
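To make the conversion step concrete, here is a minimal portable sketch. It is not the Win32 route named above (MultiByteToWideChar + WideCharToMultiByte) but a hand-written substitute that only covers ASCII and U+00A0..U+00FF, the range where CP-1252 matches Unicode and which contains the German letters. The function names utf8_to_cp1252_subset and utf8_to_cp1252_self_check are hypothetical, and everything outside that range (including '€', which CP-1252 does have) is replaced by '?'. It sticks to C++03 so that it should build with VS2010.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of pugixml, LibHaru or the Win32 API.
// Converts UTF-8 to single-byte text for U+0000..U+007F and U+00A0..U+00FF only.
// Any other code point, and any malformed sequence, becomes '?'.
std::string utf8_to_cp1252_subset(const std::string& in)
{
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char b0 = static_cast<unsigned char>(in[i]);
        if (b0 < 0x80) {
            // Plain ASCII is identical in UTF-8 and CP-1252.
            out += static_cast<char>(b0);
            ++i;
        } else if ((b0 == 0xC2 || b0 == 0xC3) && i + 1 < in.size() &&
                   (static_cast<unsigned char>(in[i + 1]) & 0xC0) == 0x80) {
            // Two-byte sequence encoding U+0080..U+00FF.
            const unsigned char b1 = static_cast<unsigned char>(in[i + 1]);
            const unsigned int cp = ((b0 & 0x1Fu) << 6) | (b1 & 0x3Fu);
            // U+0080..U+009F are control codes in Unicode but printable
            // characters in CP-1252, so they are not mapped here.
            out += (cp >= 0xA0) ? static_cast<char>(cp) : '?';
            i += 2;
        } else {
            // Unsupported or malformed: emit one '?' and skip the whole sequence.
            out += '?';
            ++i;
            while (i < in.size() &&
                   (static_cast<unsigned char>(in[i]) & 0xC0) == 0x80) {
                ++i;
            }
        }
    }
    return out;
}

// Small self-check: "ä" in UTF-8 (C3 A4) becomes the single byte E4.
void utf8_to_cp1252_self_check()
{
    assert(utf8_to_cp1252_subset("\xC3\xA4") == "\xE4");
}
```

On Windows the API pair mentioned in the answer is the more complete choice, since it handles the whole code page. The sketch is mainly meant to show what the conversion does to the bytes.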

To debug your actual problem (passing the UTF-8 strings read by pugixml on to the PDF writer), you need to look at the actual bytes in the strings and check that they are what you think they are.
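One way to look at the actual bytes is to dump the string as hex before handing it to the PDF writer. The sketch below is only an illustration: to_hex and to_hex_self_check are hypothetical names, not part of pugixml or LibHaru, and the code stays within C++03 so that it should build with VS2010.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: renders every byte of `s` as two lowercase hex digits,
// separated by single spaces.
std::string to_hex(const std::string& s)
{
    static const char digits[] = "0123456789abcdef";
    std::string out;
    for (std::size_t i = 0; i < s.size(); ++i) {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        const unsigned char byte = static_cast<unsigned char>(s[i]);
        if (i != 0) {
            out += ' ';
        }
        out += digits[byte >> 4];
        out += digits[byte & 0x0F];
    }
    return out;
}

// Small self-check: "ä" (U+00E4) encoded as UTF-8 is the two bytes C3 A4.
void to_hex_self_check()
{
    assert(to_hex("\xC3\xA4") == "c3 a4");
}
```

Printing to_hex(strSystemName) from the MWE shows which encoding the string really holds. An 'ä' that appears as c3 a4 is valid UTF-8. A single e4 points to CP-1252 or Latin-1 data. c3 83 c2 a4 means the text was encoded to UTF-8 twice.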
