Extracting content from a PDF using PHP

Question

Could you please tell me how to extract content from a PDF document using PHP? Formatting is the main problem I'm facing. Please let me know if there is a way to extract the content with its formatting intact and display it in an online text editor.

Thanks

Recommended answer

As far as I can see, it is not possible to convert a PDF to editable HTML on the fly using PHP while preserving the formatting. There are a number of desktop apps that try to extract data from PDFs, with sometimes more, sometimes less reliable results. I would say this is not realistically possible at the moment; all you can do is extract plain text using XPDF or other command-line tools.
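
As a rough illustration of the plain-text route mentioned above, here is a minimal sketch that shells out to pdftotext (the command-line tool shipped with Xpdf, and also with Poppler). It assumes pdftotext is installed and on the PATH. The function names buildPdftotextCommand and extractPdfText are hypothetical helpers made up for this example, not part of any library. The -layout flag asks pdftotext to keep the physical layout of the page as closely as plain text allows; it does not recover fonts, colours, or other rich formatting.

```php
<?php
declare(strict_types=1);

/**
 * Build the pdftotext command line (hypothetical helper).
 * Passing "-" as the output file makes pdftotext write to stdout.
 */
function buildPdftotextCommand(string $pdfPath, bool $keepLayout = true): string
{
    $parts = ['pdftotext', '-enc', 'UTF-8'];
    if ($keepLayout) {
        // Keep the original physical layout (columns, indentation) where possible.
        $parts[] = '-layout';
    }
    // Quote the path so spaces and shell metacharacters are safe.
    $parts[] = escapeshellarg($pdfPath);
    $parts[] = '-';
    return implode(' ', $parts);
}

/**
 * Run pdftotext and return the extracted plain text (hypothetical helper).
 * Assumes the pdftotext binary is available on the PATH.
 */
function extractPdfText(string $pdfPath, bool $keepLayout = true): string
{
    if (!is_readable($pdfPath)) {
        throw new InvalidArgumentException("Cannot read file: $pdfPath");
    }

    $lines = [];
    $exitCode = 0;
    exec(buildPdftotextCommand($pdfPath, $keepLayout), $lines, $exitCode);

    if ($exitCode !== 0) {
        throw new RuntimeException("pdftotext exited with code $exitCode");
    }
    // exec() returns the output one line per array element.
    return implode("\n", $lines);
}
```

To show the result on a page, the text could be escaped and wrapped in a preformatted block, for example `echo '<pre>' . htmlspecialchars(extractPdfText('document.pdf')) . '</pre>';` where document.pdf is a placeholder path. Loading that text into an online editor would still give plain text only.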

It may be different with that new XML-based PDF format, but I don't really know anything about that yet.

Feel free to prove me wrong, of course. I'd be very interested myself if there were a solution.
