Reading PDF metadata in PHP
Question
I'm trying to read metadata attached to arbitrary PDFs: title, author, subject, and keywords.
Is there a PHP library, preferably open-source, that can read PDF metadata? If so, or if there isn't, how would one use the library (or lack thereof) to extract the metadata?
To be clear, I'm not interested in creating or modifying PDFs or their metadata, and I don't care about the PDF bodies. I've looked at a number of libraries, including FPDF (which everyone seems to recommend), but it appears only to be for PDF creation, not metadata extraction.
Recommended answer
The Zend framework includes Zend_Pdf, which makes this really easy:
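// Assumes Zend Framework 1 is on the include path and $pdfPath points to the PDF.
require_once 'Zend/Pdf.php';
$pdf = Zend_Pdf::load($pdfPath);
echo $pdf->properties['Title'] . "\n";
echo $pdf->properties['Author'] . "\n";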
Limitations: works only on unencrypted files smaller than 16 MB.
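Since the question asks for all four fields (title, author, subject, and keywords), the snippet above could be extended roughly as follows. This is a minimal sketch, assuming Zend Framework 1 is on the include path; `extractPdfFields` and `readPdfMetadata` are hypothetical helper names introduced here, not part of Zend_Pdf. It guards against fields the PDF does not define and against files Zend_Pdf cannot load.

```php
<?php
// Sketch only. Assumes Zend Framework 1 (Zend/Pdf.php) is on the include path.
// extractPdfFields() and readPdfMetadata() are hypothetical helper names.

/**
 * Pick the four fields of interest from a properties array.
 * Fields that the PDF does not define come back as null.
 */
function extractPdfFields(array $properties)
{
    $fields = array();
    foreach (array('Title', 'Author', 'Subject', 'Keywords') as $key) {
        $fields[$key] = isset($properties[$key]) ? $properties[$key] : null;
    }
    return $fields;
}

/**
 * Load a PDF with Zend_Pdf and return its metadata fields,
 * or null when Zend_Pdf cannot load the file.
 */
function readPdfMetadata($pdfPath)
{
    require_once 'Zend/Pdf.php';
    try {
        $pdf = Zend_Pdf::load($pdfPath);
    } catch (Zend_Pdf_Exception $e) {
        // Thrown when Zend_Pdf cannot open or parse the file.
        return null;
    }
    return extractPdfFields($pdf->properties);
}
```

Calling `readPdfMetadata($pdfPath)` would then return an array keyed by `Title`, `Author`, `Subject`, and `Keywords`, with `null` for any field the document leaves out. Keeping the field selection in its own function means it can be checked without a PDF or the framework at hand.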