PDF extraction not complete

Views: 117 Published: 2020/5/25 4:51:28 Tags: c# pdf itext

Question

I'm trying to extract text from a PDF file (http://www.filedropper.com/copy_1), but I get less than half of the text on a page. I'm using iTextSharp:

using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;
PdfReader reader = new PdfReader(file);
string currentText = PdfTextExtractor.GetTextFromPage(reader, 1);

I have also tried SimpleTextExtractionStrategy instead of the default LocationTextExtractionStrategy:

PdfTextExtractor.GetTextFromPage(reader, 1, new SimpleTextExtractionStrategy())

The file was originally generated by Microsoft Reporting Service (to which I don't have access), and I extracted a single page from it to test the text extraction.

Can anyone help?

Recommended answer

Try this:

PdfReader reader = new PdfReader(file);
StringBuilder currentText = new StringBuilder();
// Page numbers are 1-based; append the text of every page in order.
for (int i = 1; i <= reader.NumberOfPages; i++)
{
    currentText.Append(PdfTextExtractor.GetTextFromPage(reader, i));
}

Then perform whatever operation you need on "currentText".
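Because the question concerns text missing from a single page, it can help to inspect each page's extracted text separately. The following is a minimal sketch, assuming iTextSharp 5.x. The class name PdfTextDump, the method name ExtractAllPages, and the path parameter are hypothetical names introduced here for illustration. It labels each page with its character count, creates a fresh extraction strategy for every page, and closes the reader when done. It is a diagnostic aid only and does not by itself recover text that the extractor fails to return.

```csharp
using System.Text;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

// Hypothetical helper (not part of iTextSharp): dumps the text of every page
// with a header line, so short or empty pages are easy to spot.
public static class PdfTextDump
{
    // "path" is an assumed parameter: the location of the PDF file on disk.
    public static string ExtractAllPages(string path)
    {
        PdfReader reader = new PdfReader(path);
        try
        {
            StringBuilder text = new StringBuilder();
            for (int page = 1; page <= reader.NumberOfPages; page++)
            {
                // Use a new strategy per page: a strategy object keeps the
                // text it has already received, so reusing one across pages
                // would repeat earlier pages in later results.
                ITextExtractionStrategy strategy = new LocationTextExtractionStrategy();
                string pageText = PdfTextExtractor.GetTextFromPage(reader, page, strategy);

                text.AppendLine("--- page " + page + " (" + pageText.Length + " chars) ---");
                text.AppendLine(pageText);
            }
            return text.ToString();
        }
        finally
        {
            // Release the file handle even if extraction throws.
            reader.Close();
        }
    }
}
```

Swapping LocationTextExtractionStrategy for SimpleTextExtractionStrategy in the loop and comparing the per-page character counts is one way to check whether the choice of strategy affects how much text comes back.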
