How to transfer OCR text from one PDF to another PDF?

Question

I have two versions of the same scanned PDF. One of them has an OCR layer. How can I transfer that layer to the other one? I have already installed Ghostscript, but I don't know what to do next.

Recommended answer

There's no such thing as an 'OCR layer' in PDF.

Most likely what you have is a PDF file containing a scanned image, plus the text that OCR extracted from that image, drawn on top of it as 'invisible' text (text rendering mode 3).
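To make the 'invisible text' idea concrete, here is a minimal sketch that checks a page content stream for text rendering mode 3, which is set with the `Tr` operator. It assumes you already have the decompressed content stream as bytes (extracting it from the PDF is out of scope here). The function names and the sample stream are hypothetical. A plain regex scan like this can also be fooled by `Tr` appearing inside a string literal, so treat it as a rough check, not a parser.

```python
import re

# An integer operand followed by the Tr (set text rendering mode) operator.
_TR_PATTERN = re.compile(rb"(?<![\w.])(\d+)\s+Tr\b")


def text_render_modes(content_stream: bytes) -> list[int]:
    """Return every rendering mode set via Tr in a decompressed content stream."""
    return [int(m.group(1)) for m in _TR_PATTERN.finditer(content_stream)]


def has_invisible_text(content_stream: bytes) -> bool:
    """True if the stream switches to mode 3 (neither fill nor stroke) anywhere."""
    return 3 in text_render_modes(content_stream)


# Hypothetical page: a full-page scanned image, then OCR text drawn invisibly.
sample = (
    b"q 612 0 0 792 0 0 cm /Im0 Do Q\n"
    b"BT 3 Tr /F1 12 Tf 72 720 Td (Hello) Tj ET"
)

if __name__ == "__main__":
    print(has_invisible_text(sample))
```

If this returns true for the pages of one file and false for the other, that is a reasonable hint about which version carries the OCR text.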

In general you can't copy and paste text between PDF files, so it's very hard to do what you are asking. I don't know of any tools that will help you here, and I can say for certain that Ghostscript will not help you at all.

Most likely you will also need to copy the Font (or CIDFont) from the PDF file, and if it has a ToUnicode CMap you'll definitely want that as well; without it, search won't work, and there's little point in this sort of OCR otherwise.
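The ToUnicode CMap matters because the character codes in the text stream are font-specific, and the CMap is what maps them back to Unicode for search and copy. As a rough illustration, the sketch below reads the `beginbfchar` ... `endbfchar` sections of a CMap that has already been extracted and decoded to text. It assumes well-formed hex pairs and ignores `bfrange` sections entirely. The function name and the sample CMap are hypothetical.

```python
import re

_BFCHAR_BLOCK = re.compile(r"beginbfchar(.*?)endbfchar", re.DOTALL)
_HEX_PAIR = re.compile(r"<([0-9A-Fa-f]+)>\s*<([0-9A-Fa-f]+)>")


def parse_bfchar(cmap_text: str) -> dict[int, str]:
    """Map character codes to Unicode strings using the bfchar entries only."""
    mapping: dict[int, str] = {}
    for block in _BFCHAR_BLOCK.finditer(cmap_text):
        for src, dst in _HEX_PAIR.findall(block.group(1)):
            # Destination values in a ToUnicode CMap are UTF-16BE code units.
            mapping[int(src, 16)] = bytes.fromhex(dst).decode("utf-16-be")
    return mapping


# Hypothetical fragment of a ToUnicode CMap.
sample_cmap = """
2 beginbfchar
<0003> <0020>
<0024> <0041>
endbfchar
"""

if __name__ == "__main__":
    print(parse_bfchar(sample_cmap))
```

If the text were moved to another file without this mapping, a viewer would have the codes but no way to turn them into searchable characters.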

Since you already have a PDF file that includes the OCR'ed text, why not simply use that PDF? I can't see any reason why you would want to 'transfer' it to another PDF file.
