iTextSharp can't extract PDF Unicode content in C#

Views: 171  Published: 2018/11/16 17:43:17  Tags: c# pdf unicode itextsharp persian


Problem description

I am trying to get the content of a PDF file using iTextSharp, as you can see:

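using System;
using System.Text;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

class Program
{
    static void Main(string[] args)
    {
        StringBuilder text = new StringBuilder();
        // Open the PDF and append the extracted text of every page (pages are 1-based).
        using (PdfReader reader = new PdfReader(@"D:\a.pdf"))
        {
            for (int i = 1; i <= reader.NumberOfPages; i++)
            {
                text.Append(PdfTextExtractor.GetTextFromPage(reader, i));
            }
        }
        // Write the extracted text to a file.
        System.IO.File.WriteAllText(@"c:/a.txt", text.ToString());
        Console.ReadLine();
    }
}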

My PDF content is written in Persian, and after running the above code the result is like this:

But this is not the correct result. Should I set any option in iTextSharp?
