Problem with PdfTextExtractor in iText!


Question


First, excuse my bad English! I want to search a PDF document for a word such as "Hello", so I have to read each page of the PDF with PdfTextExtractor. That part works: I can read all the words on a single page and save them in a string buffer. But when I put this code in a for loop (for example, to search pages 1 to 7), the words from earlier pages remain in the string buffer. I hope you understand my problem. Thanks, all. This is my code:

        PdfReader reader2 = new PdfReader(openFileDialog1.FileName);
        int pagen = reader2.NumberOfPages;
        reader2.Close();
        ITextExtractionStrategy its = new iTextSharp.text.pdf.parser.SimpleTextExtractionStrategy();
        for (int i = 1; i < pagen; i++)
        {
            textBox1.Text = "";
            PdfReader reader = new PdfReader(openFileDialog1.FileName);

            String  s = PdfTextExtractor.GetTextFromPage(reader, i, its);
            //MessageBox.Show(s.Length.ToString());
            //PdfTextArray h = new PdfTextArray(s);

            //
            // s = "";
            s = Encoding.UTF8.GetString(ASCIIEncoding.Convert(Encoding.Default, Encoding.UTF8, Encoding.Default.GetBytes(s)));
            textBox1.Text = s;
            reader.Close();
        }
    }

Recommended answer

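Unfortunately, SimpleTextExtractionStrategy cannot be reset: it keeps accumulating the text of every page passed through it. You therefore have to move your "new SimpleTextExtractionStrategy()" inside the loop, so that each page gets its own instance, instead of reusing the same object.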
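
As a rough illustration of that advice, here is a minimal sketch of a per-page search, assuming iTextSharp 5.x. The class name PdfWordSearch, the method FindPagesContaining, and the case-insensitive match are assumptions made for this example and do not come from the original post. It creates a new strategy on every iteration and opens the reader only once.

```csharp
using System;
using System.Collections.Generic;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

public static class PdfWordSearch
{
    // Hypothetical helper (not from the original post): returns the 1-based
    // numbers of the pages whose extracted text contains `word`.
    public static List<int> FindPagesContaining(string path, string word)
    {
        var hits = new List<int>();
        PdfReader reader = new PdfReader(path);   // open the file once, not once per page
        try
        {
            // Page numbers are 1-based, so the last page is NumberOfPages.
            for (int page = 1; page <= reader.NumberOfPages; page++)
            {
                // A fresh strategy per page, so no text carries over from earlier pages.
                ITextExtractionStrategy strategy = new SimpleTextExtractionStrategy();
                string text = PdfTextExtractor.GetTextFromPage(reader, page, strategy);

                // Assumption: a case-insensitive substring match is what is wanted.
                if (text.IndexOf(word, StringComparison.OrdinalIgnoreCase) >= 0)
                    hits.Add(page);
            }
        }
        finally
        {
            reader.Close();   // release the file even if extraction throws
        }
        return hits;
    }
}
```

A call such as PdfWordSearch.FindPagesContaining(openFileDialog1.FileName, "Hello") would then return the pages to report. Note that the loop here runs while page <= NumberOfPages; the loop in the question uses i < pagen and so stops before the last page. The encoding conversion from the question is left out of this sketch.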
