PDF Document Reading in C#.NET using iTextSharp


Question

Hi all,

I am having trouble extracting the text of a PDF document paragraph by paragraph. Please help me out.

My code is:

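// Requires: using iTextSharp.text.pdf; using iTextSharp.text.pdf.parser;
private string ReadPdf(string _filePath)
        {
            PdfReader rd = new PdfReader(_filePath);
            int pageNumber = 1;   // iTextSharp page numbers start at 1
            string oContent = "";
            while (pageNumber <= rd.NumberOfPages)
            {
                // Append the text of the current page (comes back line by line)
                oContent += PdfTextExtractor.GetTextFromPage(rd, pageNumber);
                ++pageNumber;
            }
            rd.Close();
            return oContent;
        }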



With the code above I am able to read the PDF text, but it is extracted line by line. I want it paragraph by paragraph.

Recommended answer

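// Requires: using System.IO; using iTextSharp.text.pdf; using iTextSharp.text.pdf.parser;
PdfReader reader = new PdfReader(path);
 StringWriter output = new StringWriter();
 for (int i = 1; i <= reader.NumberOfPages; i++)
 {
     // CreateSimpleHtmlParagraph is a helper that was not included in the answer and its
     // result was never used, so the call is left commented out:
     // Paragraph o = CreateSimpleHtmlParagraph(output.ToString());

     // Extract the text of page i (pages are 1-based) and append it to the output
     output.WriteLine(PdfTextExtractor.GetTextFromPage(reader, i, new SimpleTextExtractionStrategy()));
 }
 reader.Close();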
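
The loop above still returns the text of each page one line at a time, which is what the question is trying to get away from. One possible post-processing step is sketched below. It is only a heuristic and not an iTextSharp feature: ParagraphGrouper and GroupLines are hypothetical names, and the rule used (a paragraph ends at a blank line or after a line ending in '.', '!', '?' or ':') is an assumption that will not fit every PDF.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, not part of iTextSharp.
public static class ParagraphGrouper
{
    // Merges the lines of an extracted page into paragraphs.
    // Assumed heuristic: a paragraph ends at a blank line, or after a line
    // whose last character is '.', '!', '?' or ':'.
    public static List<string> GroupLines(string pageText)
    {
        var paragraphs = new List<string>();
        var current = new StringBuilder();
        string[] lines = pageText.Replace("\r\n", "\n").Split('\n');

        foreach (string raw in lines)
        {
            string line = raw.Trim();
            if (line.Length == 0)
            {
                // A blank line closes the paragraph collected so far
                Flush(current, paragraphs);
                continue;
            }

            // Join wrapped lines with a single space
            if (current.Length > 0) current.Append(' ');
            current.Append(line);

            char last = line[line.Length - 1];
            if (last == '.' || last == '!' || last == '?' || last == ':')
                Flush(current, paragraphs);
        }

        // Keep whatever is left after the last line
        Flush(current, paragraphs);
        return paragraphs;
    }

    private static void Flush(StringBuilder current, List<string> paragraphs)
    {
        if (current.Length == 0) return;
        paragraphs.Add(current.ToString());
        current.Clear();
    }
}
```

Inside the loop you could then call ParagraphGrouper.GroupLines(PdfTextExtractor.GetTextFromPage(reader, i)) and write out each returned string as one paragraph. A paragraph whose wrapped lines happen to end in a full stop will be split early, so the rule may need tuning for your documents.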


Hi,

This may be helpful for you:

// Exports a Crystal Reports report to a PDF file on disk.
// SqlServer and ReportCRtoPDF are the poster's own classes and are not shown here.
protected void Page_Load(object sender, EventArgs e)
        {
            SqlServer server = new SqlServer("Data Source=KSHIT6773-G13\\SQLEXPRESS;Initial Catalog=Test;Integrated Security=True");
            string[] sql = { "SELECT E.Name, D.Name FROM Employee E, Department D WHERE D.DepartmentID = E.Department" };
            string[] table = { "EMPDEPT" };
            // Load the query result into a DataSet
            DataSet ds = server.GetDataSet(sql, table, false);

            // Bind the data to the report
            ReportCRtoPDF rptObj = new ReportCRtoPDF();
            rptObj.SetDataSource(ds);

            // Write the PDF to the "files" folder of the web application
            DiskFileDestinationOptions dsk = new DiskFileDestinationOptions();
            dsk.DiskFileName = Request.PhysicalApplicationPath + "files\\CrtoPDF.pdf";
            ExportOptions ex = new ExportOptions();
            ex.ExportDestinationType = ExportDestinationType.DiskFile;
            ex.ExportFormatType = ExportFormatType.PortableDocFormat;
            ex.ExportDestinationOptions = dsk;
            rptObj.Export(ex);
        }

