How to check fully embedded and subset embedded fonts using PDFBox

Question

I want to use PDFBox to check whether the fonts in a PDF are fully embedded or embedded as a subset. I have tried the following logic:

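import java.util.Map;
import java.util.Set;
import org.apache.pdfbox.pdmodel.font.PDFont;
import org.apache.pdfbox.pdmodel.font.PDFontDescriptor;
import org.apache.pdfbox.pdmodel.font.PDFontDescriptorDictionary;

// Returns false as soon as one font has a descriptor without any embedded font file.
// Note: fonts without a descriptor dictionary are skipped, so they are not reported as missing.
private boolean IsEmbedded(Map<String, PDFont> fontsMap, Set<String> keys) {
    for (String key : keys) {
        PDFont font = fontsMap.get(key);
        PDFontDescriptor fontDescriptor = font.getFontDescriptor();
        if (fontDescriptor instanceof PDFontDescriptorDictionary) {
            PDFontDescriptorDictionary dictionary = (PDFontDescriptorDictionary) fontDescriptor;
            // FontFile (Type 1), FontFile2 (TrueType), FontFile3 (other formats such as CFF)
            if (dictionary.getFontFile() == null && dictionary.getFontFile2() == null && dictionary.getFontFile3() == null) {
                return false;
            }
        }
    }
    return true;
}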

But I could not find out how to differentiate between full embedding and subset embedding. Can anyone please give me the answer?

Recommended answer

To quote the PDF specification ISO 32000-1 on font subsets (section 9.6.4):

PDF documents may include subsets of Type 1 and TrueType fonts. The font and font descriptor that describe a font subset are slightly different from those of ordinary fonts. These differences allow a conforming reader to recognize font subsets and to merge documents containing different subsets of the same font. (For more information on font descriptors, see 9.8, "Font Descriptors".)

For a font subset, the PostScript name of the font — the value of the font’s BaseFont entry and the font descriptor’s FontName entry — shall begin with a tag followed by a plus sign (+). The tag shall consist of exactly six uppercase letters; the choice of letters is arbitrary, but different subsets in the same PDF file shall have different tags.

EXAMPLE EOODIA+Poetica is the name of a subset of Poetica®, a Type 1 font.

In a PDF that complies with this requirement ("shall", so it really is a requirement), you can therefore recognize subset fonts by their names.
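The naming rule quoted above can be checked with a simple pattern match. The following is only a minimal sketch: the class name SubsetFontNames and the method isSubsetFontName are hypothetical helpers introduced for illustration, and the sketch assumes you have already obtained the font's PostScript name as a string (for example from the BaseFont entry, which PDFBox 1.8 should expose via font.getBaseFont()).

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of PDFBox: tests the subset naming rule of ISO 32000-1, 9.6.4.
class SubsetFontNames {
    // Exactly six uppercase letters, a plus sign, then the remaining PostScript name.
    private static final Pattern SUBSET_TAG = Pattern.compile("^[A-Z]{6}\\+.+");

    // Returns true if the name carries a subset tag such as "EOODIA+".
    static boolean isSubsetFontName(String fontName) {
        return fontName != null && SUBSET_TAG.matcher(fontName).matches();
    }

    public static void main(String[] args) {
        System.out.println(isSubsetFontName("EOODIA+Poetica")); // prints "true"
        System.out.println(isSubsetFontName("Poetica"));        // prints "false"
    }
}
```

A font that passes the embedded check from the question and whose name matches this pattern is declared as a subset; a name without the tag only means the font is not declared as one.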

Keep in mind, though, that outside of a PDF you can derive a font from another one by including only selected glyphs. This essentially creates a subset font, but PDF-creating software that uses it may not notice that fact and may name it like a fully embedded font. So, in essence, you can never know for sure.
