C# PDFSharp: Examples of how to strip text from PDF?
Question
I have a fairly simple task: I need to read a PDF file and write out its image contents while ignoring its text contents. So essentially I need to do the complement of "save as text".
Ideally, I would prefer to avoid any sort of re-compression of the image contents, but if that's not possible, that's OK too.
Are there any examples of how to do this?
Thanks!
Recommended answer
Extracting text from a PDF file with PDFsharp is not a simple task.
It was discussed recently in this thread: http://stackoverflow.com/a/9161732/162529