How to read a PDF file with blank spaces (as it is) line by line in C#/.NET using iTextSharp

Views: 170 · Posted: 2018/11/16 16:37:06 · Tags: c# pdf itext vb.net-2010

Question

I am using iText (for .NET) to read PDF files. It reads the document, but wherever the text contains a run of whitespace, the extracted text contains only a single space.

That makes it impossible to extract data by taking substrings at fixed positions. I want to read the data line by line with the whitespace preserved, so that I know the actual position of each piece of text, because I want to write the data into a database.

The file is a bank statement. I want to dump it into a database in order to build a reconciliation system.

Here is a screenshot of the file:

The following is the code I am using:

            For page As Integer = 1 To pdfReader.NumberOfPages
            ' Dim strategy As ITextExtractionStrategy = New SimpleTextExtractionStrategy()

            Dim Strategy As ITextExtractionStrategy = New iTextSharp.text.pdf.parser.LocationTextExtractionStrategy()
            Dim currentText As String = PdfTextExtractor.GetTextFromPage(pdfReader, page, strategy)
            currentText = Encoding.UTF8.GetString(ASCIIEncoding.Convert(Encoding.[Default], Encoding.UTF8, Encoding.[Default].GetBytes(currentText)))


            Dim delimiterChars As Char() = {ControlChars.Lf}

            Dim lines As String() = currentText.Split(delimiterChars)

            Dim Bnk_Name As Boolean = True
            Dim Br_Name As Boolean = False
            Dim Name_acc As Boolean = False
            Dim statment As Boolean = False
            Dim Curr As Boolean = False
            Dim Open As Boolean = False
            Dim BankName = ""
            Dim Branch = ""
            Dim AccountNo = ""
            Dim CompName = ""
            Dim Currency = ""
            Dim Statement_from = ""
            Dim Statement_to = ""
            Dim Opening_Balance = ""
            Dim Closing_Balance = ""
            Dim Narration As String = ""
            For Each line As String In lines

                line.Trim()

                'BANK NAME
                If Bnk_Name Then
                    If line.Trim() <> "" Then
                        BankName = line.Substring(0, 21)
                        Bnk_Name = False
                    Else
                        Bnk_Name = False

                    End If
                End If

However, I want the text as it is, with the whitespace preserved, so that I can read values by position.
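To make the goal concrete, here is a minimal sketch of the idea of rebuilding a fixed-width line from positioned text, so that `Substring` works on known columns. It is not iTextSharp code and is not a confirmed solution to the question: `TextChunk`, `BuildLine`, the 6-point character width, and the sample values are all hypothetical and chosen only for illustration. With iTextSharp, the X coordinate of each chunk would presumably have to come from a custom extraction strategy that records where each piece of text starts; that part is assumed here and not shown.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text

Module LayoutSketch

    ' Hypothetical container: one piece of text and the X coordinate
    ' (in points) of its left edge on the page.
    Structure TextChunk
        Public X As Single
        Public Text As String

        Public Sub New(xPos As Single, value As String)
            X = xPos
            Text = value
        End Sub
    End Structure

    ' Rebuilds one line by placing every chunk at column = X / charWidth
    ' and padding the gaps with spaces. charWidth is an assumed average
    ' character width; a real statement may need a different value.
    Function BuildLine(chunks As List(Of TextChunk), charWidth As Single) As String
        Dim sorted As New List(Of TextChunk)(chunks)
        sorted.Sort(Function(a, b) a.X.CompareTo(b.X))

        Dim sb As New StringBuilder()
        For Each chunk As TextChunk In sorted
            Dim column As Integer = CInt(Math.Floor(chunk.X / charWidth))
            ' Pad only when the target column lies beyond the text so far.
            If column > sb.Length Then
                sb.Append(" "c, column - sb.Length)
            End If
            sb.Append(chunk.Text)
        Next
        Return sb.ToString()
    End Function

    Sub Main()
        ' Made-up sample values, not taken from the actual bank statement.
        Dim chunks As New List(Of TextChunk) From {
            New TextChunk(0.0F, "01/11/2018"),
            New TextChunk(90.0F, "CHEQUE 000123"),
            New TextChunk(300.0F, "1,500.00")
        }

        Dim line As String = BuildLine(chunks, 6.0F)
        Console.WriteLine(line)

        ' The second chunk starts at column 90 / 6 = 15, so a fixed-position
        ' Substring can now pick it out.
        Console.WriteLine(line.Substring(15, 13))
    End Sub

End Module
```

The design choice is to derive the column from the coordinate rather than from the extracted string, since the extracted string has already lost the spacing. If two chunks map to overlapping columns, this sketch simply appends the later one without padding, so the character width needs to be small enough for the document at hand.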
