Problem with PdfTextExtractor in iText!
Problem description
First, excuse my bad English! I want to search a PDF document for a word such as "Hello", so I have to read each page of the PDF with PdfTextExtractor. That part works: I can read all the words on a single page and save them in a string buffer. But when I put this code in a for loop (for example, to search pages 1 to 7), the words from earlier pages remain in the string buffer. I hope you understand my problem. Thanks, all. This is my code:
PdfReader reader2 = new PdfReader(openFileDialog1.FileName);
int pagen = reader2.NumberOfPages;
reader2.Close();
ITextExtractionStrategy its = new iTextSharp.text.pdf.parser.SimpleTextExtractionStrategy();
for (int i = 1; i < pagen; i++)
{
    textBox1.Text = "";
    PdfReader reader = new PdfReader(openFileDialog1.FileName);
    String s = PdfTextExtractor.GetTextFromPage(reader, i, its);
    //MessageBox.Show(s.Length.ToString());
    //PdfTextArray h = new PdfTextArray(s);
    //
    // s = "";
    s = Encoding.UTF8.GetString(ASCIIEncoding.Convert(Encoding.Default, Encoding.UTF8, Encoding.Default.GetBytes(s)));
    textBox1.Text = s;
    reader.Close();
}
}
Recommended answer
Unfortunately, SimpleTextExtractionStrategy doesn't let you reset it, so you must move your "new SimpleTextExtractionStrategy()" inside the loop instead of reusing the same object. A single strategy instance keeps accumulating the text of every page it processes, which is why the earlier pages' words keep showing up.
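A minimal sketch of that fix, assuming iTextSharp 5.x (the `iTextSharp.text.pdf` and `iTextSharp.text.pdf.parser` namespaces used in the question). The class name `PdfSearch` and the helper `FindPagesContaining` are hypothetical names introduced only for illustration. The sketch also opens the reader once instead of once per page, and loops with `<=` so the last page is included; the question's `i < pagen` appears to skip it.

```csharp
using System;
using System.Collections.Generic;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

static class PdfSearch
{
    // Hypothetical helper: returns the 1-based numbers of the pages
    // whose extracted text contains `word` (case-insensitive).
    public static List<int> FindPagesContaining(string path, string word)
    {
        var hits = new List<int>();
        PdfReader reader = new PdfReader(path);
        try
        {
            // Pages are 1-based, so the loop runs up to and including NumberOfPages.
            for (int page = 1; page <= reader.NumberOfPages; page++)
            {
                // A fresh strategy per page: nothing carries over from earlier pages.
                ITextExtractionStrategy strategy = new SimpleTextExtractionStrategy();
                string text = PdfTextExtractor.GetTextFromPage(reader, page, strategy);

                if (text.IndexOf(word, StringComparison.OrdinalIgnoreCase) >= 0)
                {
                    hits.Add(page);
                }
            }
        }
        finally
        {
            // Close the reader even if extraction throws.
            reader.Close();
        }
        return hits;
    }
}
```

From the question's form, this could be called as `PdfSearch.FindPagesContaining(openFileDialog1.FileName, "Hello")`, with the returned page numbers shown in `textBox1`.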