itextsharp - xmlworker unicode problem

Question

Hi all,
Could you please help me with my HTML-to-PDF conversion? I am using "itextsharp.dll" (version 5.1.3) and "itextsharp.xmlworker.dll" (version 1.1.1).

My problem is that the created PDF doesn't display the Unicode characters that are in my HTML.

Here is my code:

/************************************************************************************************/

            HttpContext.Current.Response.Clear();
            HttpContext.Current.Response.Charset = "";
            HttpContext.Current.Response.ContentType = "application/pdf";
            string strFileName = "123" + ".pdf";
            HttpContext.Current.Response.AddHeader("Content-Disposition", "inline; filename=" + strFileName);

            string outXml = @"<html><body style='font-family: MYSYLFAEN;'>
            <meta http-equiv='Content-Type' content='text/html; charset=UTF-8'/>
            eng-abc_arm-աբգ_rus-абв_eng-xyz</body></html>";

            MemoryStream memStream = new MemoryStream();

            TextReader xmlString = new StringReader(outXml);

            using (Document document = new Document())
            {
                PdfWriter writer = PdfWriter.GetInstance(document, memStream);
                document.SetPageSize(iTextSharp.text.PageSize.A4);
                document.Open();

                FontFactory.Register("C:/Windows/Fonts/sylfaen.ttf", "MYSYLFAEN");

                byte[] byteArray = System.Text.Encoding.UTF8.GetBytes(outXml);
                MemoryStream ms = new MemoryStream(byteArray);

                XMLWorkerHelper.GetInstance().ParseXHtml(writer, document, ms, System.Text.Encoding.UTF8);

                document.Close();
            }

            HttpContext.Current.Response.BinaryWrite(memStream.ToArray());
            // Flush before End: End() terminates the request, so nothing after it runs.
            HttpContext.Current.Response.Flush();
            HttpContext.Current.Response.End();

/************************************************************************************************/



Please suggest a solution. The code above works; the only problem is that the non-English characters in my HTML do not show up when the HTML is converted to a PDF.

I do not want to use HTMLWorker because it doesn't understand margins, e.g. "margin-left: 100px;" declarations in the HTML string.

Thanks in advance,
with best regards,
Michael

Recommended answer

PDF doesn't really support Unicode well. The standard fonts are provided for WinAnsiEncoding, which is a 256-character, non-Unicode set.
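To see which parts of the question's sample string fall outside such a single-byte set, one can try encoding them with Windows code page 1252, on which WinAnsiEncoding is based. This is only an illustrative sketch: the class and method names are made up, and it assumes .NET Framework, where code page 1252 is available without registering an extra encoding provider.

```csharp
using System;
using System.Text;

public static class WinAnsiCheck
{
    // Hypothetical helper: true when every character of text fits in Windows-1252.
    public static bool FitsWindows1252(string text)
    {
        // ExceptionFallback makes the encoder throw instead of silently writing '?'.
        Encoding strict = Encoding.GetEncoding(
            1252, EncoderFallback.ExceptionFallback, DecoderFallback.ExceptionFallback);
        try
        {
            strict.GetBytes(text);
            return true;
        }
        catch (EncoderFallbackException)
        {
            return false;
        }
    }

    public static void Main()
    {
        // The three segments of the test string from the question.
        Console.WriteLine(FitsWindows1252("eng-abc"));
        Console.WriteLine(FitsWindows1252("arm-աբգ"));
        Console.WriteLine(FitsWindows1252("rus-абв"));
    }
}
```

The Latin segment should pass, while the Armenian and Cyrillic segments should not, which matches the symptom described in the question.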

I don't think you can have font encodings defined in PDF with more than 256 characters, though this might have changed since I last really investigated this (PDF 1.4). You might have to create a subset font, embed it, and use it for the Unicode characters, and I suspect iTextSharp doesn't automate that for you.
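For what it is worth, iTextSharp 5 does expose an embedded, two-byte font encoding through `BaseFont.IDENTITY_H`. The sketch below shows only that font-embedding step in plain iTextSharp, without XMLWorker, so it does not by itself settle how XMLWorker 1.1.1 resolves the `font-family` in the question. The class name, method name, and `fontPath` parameter are assumptions for illustration; the font file must actually contain the glyphs being written.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

public static class UnicodePdfSketch
{
    // Hypothetical helper: builds a one-paragraph PDF using an embedded TrueType font.
    // fontPath is an assumption, e.g. the sylfaen.ttf path used in the question.
    public static byte[] Build(string text, string fontPath)
    {
        using (var output = new MemoryStream())
        {
            var document = new Document(PageSize.A4);
            PdfWriter.GetInstance(document, output);
            document.Open();

            // IDENTITY_H selects a two-byte encoding instead of WinAnsi;
            // EMBEDDED stores the font program inside the PDF.
            BaseFont baseFont = BaseFont.CreateFont(
                fontPath, BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
            var font = new Font(baseFont, 12f);

            document.Add(new Paragraph(text, font));
            document.Close();

            // ToArray() still works after the writer has closed the stream.
            return output.ToArray();
        }
    }
}
```

A call such as `UnicodePdfSketch.Build("eng-abc_arm-աբգ_rus-абв", "C:/Windows/Fonts/sylfaen.ttf")` would be one way to check whether the font itself can render the characters before looking at the XMLWorker side.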

One hack that is sometimes useful applies to Greek characters (e.g. in mathematical equations), which you can map to the built-in Symbol font.

