Reading PDF metadata in PHP

Question

I'm trying to read metadata attached to arbitrary PDFs: title, author, subject, and keywords.

Is there a PHP library, preferably open-source, that can read PDF metadata? If so, or if there isn't, how would one use the library (or lack thereof) to extract the metadata?

To be clear, I'm not interested in creating or modifying PDFs or their metadata, and I don't care about the PDF bodies. I've looked at a number of libraries, including FPDF (which everyone seems to recommend), but it appears only to be for PDF creation, not metadata extraction.

Recommended answer

Zend Framework includes Zend_Pdf, which makes this really easy:

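require_once 'Zend/Pdf.php'; // assumes Zend Framework 1 is on the include path

$pdf = Zend_Pdf::load($pdfPath); // $pdfPath: path to the PDF file

// Entries from the document's Info dictionary
echo $pdf->properties['Title'] . "\n";
echo $pdf->properties['Author'] . "\n";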

Limitations: this works only on unencrypted files smaller than 16MB.
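
For the "lack thereof" case in the question, a rough library-free fallback is to scan the raw bytes for Info dictionary entries written as literal strings, such as /Title (My Document). The sketch below is only an illustration, not part of Zend_Pdf: the function name readPdfInfoNaive is made up here, and it assumes the Info dictionary is stored uncompressed and unencrypted, with plain parenthesized strings. Hex strings, UTF-16 values, nested unescaped parentheses, and entries inside compressed object streams are not handled, and a /Title belonging to a bookmark could be picked up instead of the document title.

```php
<?php
// Hypothetical helper (not a Zend_Pdf API): naive scan of raw PDF bytes
// for literal-string entries such as "/Title (My Document)".
function readPdfInfoNaive(string $bytes): array
{
    $info = [];
    foreach (['Title', 'Author', 'Subject', 'Keywords'] as $key) {
        // Match "/Key (value)"; the value may contain backslash escapes
        // but no unescaped parentheses.
        $pattern = '/\/' . $key . '\s*\(((?:[^()\\\\]|\\\\.)*)\)/s';
        if (preg_match($pattern, $bytes, $m)) {
            // Undo only the simplest escapes: \( \) and \\
            $info[$key] = preg_replace('/\\\\([()\\\\])/', '$1', $m[1]);
        }
    }
    return $info; // keys are present only for entries that were found
}
```

It could be called as readPdfInfoNaive(file_get_contents($pdfPath)). Because it never parses the PDF structure, it should be treated as a best-effort guess; a real parser remains the safer choice whenever one is available.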
