Extract embedded PDF fonts to an external ttf file using some utility or script

Question

Is it possible to extract the fonts embedded in a PDF file to external ttf files using some utility or script?

  1. If the fonts used by the PDF file (embedded or not) are present on the system: using the pdf2swf and swfextract tools from swftools, I can determine the names of the fonts used in the PDF file. I can then compile the corresponding system fonts at run-time and load them into my AIR application.

But if the fonts used in the PDF are absent from the system, there are two possibilities:

2.1. If they are absent from the PDF file as well (not embedded), we can only fall back to a similar system font chosen by font name.

2.2. If they are embedded in the PDF file, I want to know whether it is possible at all to extract them to external ttf files, so that I can compile each of them to a separate swf file at run-time.

Recommended answer

I know it's been a while since you asked this, but I figured I might be able to help.

I don't know of any utility that will extract the font files for you, but you can do it manually.

Basically, a PDF file is a sequence of numbered objects, written largely as plain text with binary stream data in between. You can open it in a text editor that tolerates binary data and look for the fonts.

The fonts are specified in FontDescriptor objects, e.g.:

<</Type/FontDescriptor/FontName/ABCDEE+Algerian ... /FontFile2 24 0 R>>

This says that the font file for the font named Algerian is stored in object 24 (the ABCDEE+ prefix is the tag that marks an embedded subset of a font, and the /FontFile2 key is the one used for TrueType font programs). Search the document for the line "24 0 obj". The dictionary that follows this line lists the properties of the stream holding the font file, including its /Length, and the stream data itself starts right after the "stream" keyword.
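The manual search described above can also be scripted. What follows is only a rough sketch under several assumptions: the PDF is not encrypted, the FontDescriptor is written as plain text rather than packed into a compressed object stream, and only the first /FontFile2 reference is of interest. The class and method names (`PdfFontStreamFinder`, `FindFirstFontFile2Stream`) are made up for illustration and do not come from any library.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class PdfFontStreamFinder
{
    // Latin-1 maps every byte to exactly one char, so string offsets equal byte offsets.
    private static readonly Encoding Latin1 = Encoding.GetEncoding("ISO-8859-1");

    // Returns the raw (still encoded) bytes of the stream referenced by the
    // first /FontFile2 entry, or null when nothing suitable is found.
    public static byte[] FindFirstFontFile2Stream(byte[] pdf)
    {
        string text = Latin1.GetString(pdf);

        // Step 1: find a reference such as "/FontFile2 24 0 R".
        Match reference = Regex.Match(text, @"/FontFile2\s+(\d+)\s+(\d+)\s+R\b");
        if (!reference.Success) return null;

        // Step 2: find the matching object header, e.g. "24 0 obj".
        // The lookbehind keeps "124 0 obj" from matching a search for object 24.
        string objHeader = @"(?<!\d)" + reference.Groups[1].Value + @"\s+"
                         + reference.Groups[2].Value + @"\s+obj\b";
        Match obj = Regex.Match(text, objHeader);
        if (!obj.Success) return null;

        // Step 3: the data starts right after the "stream" keyword and its line ending.
        Match streamKeyword = new Regex(@"stream\r?\n").Match(text, obj.Index);
        if (!streamKeyword.Success) return null;
        int dataStart = streamKeyword.Index + streamKeyword.Length;

        // Step 4: prefer a direct "/Length n" from the stream dictionary.
        // An indirect length ("/Length 12 0 R") is deliberately not matched here.
        string dictionary = text.Substring(obj.Index, streamKeyword.Index - obj.Index);
        Match length = Regex.Match(dictionary, @"/Length\s+(\d+)\b(?!\s+\d+\s+R\b)");
        int dataLength;
        if (length.Success)
        {
            dataLength = int.Parse(length.Groups[1].Value);
        }
        else
        {
            // Fallback: read up to "endstream". This may include a trailing line
            // ending and could stop early if the binary data contains that word.
            int end = text.IndexOf("endstream", dataStart, StringComparison.Ordinal);
            if (end < 0) return null;
            dataLength = end - dataStart;
        }

        if (dataStart + dataLength > pdf.Length) return null;
        byte[] raw = new byte[dataLength];
        Array.Copy(pdf, dataStart, raw, 0, dataLength);
        return raw;
    }
}
```

Calling `PdfFontStreamFinder.FindFirstFontFile2Stream(File.ReadAllBytes("input.pdf"))` (the file name is a placeholder) would return the bytes between "stream" and "endstream" for that object. They still need to be decoded according to the stream's /Filter entry before they are a usable font file.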

This stream contains the ttf file, normally compressed (the method below assumes the stream's /Filter is /FlateDecode, i.e. zlib). To decompress it you can use this method:

  using System.IO;
  using System.IO.Compression;

  private static byte[] DecodeFlateDecodeData(byte[] data)
  {
     using (var outputStream = new MemoryStream())
     using (var compressedDataStream = new MemoryStream(data))
     {
        // Skip the two-byte zlib header; DeflateStream only understands raw deflate data
        compressedDataStream.ReadByte();
        compressedDataStream.ReadByte();

        using (var deflateStream = new DeflateStream(compressedDataStream, CompressionMode.Decompress))
        {
           var decompressedBuffer = new byte[1024];
           int read;
           while ((read = deflateStream.Read(decompressedBuffer, 0, decompressedBuffer.Length)) != 0)
           {
              outputStream.Write(decompressedBuffer, 0, read);
           }
        }

        // Copy out everything that was decompressed
        return outputStream.ToArray();
     }
  }
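As a variation on the method above, here is a small sketch that assumes .NET 6 or later, where `ZLibStream` reads the zlib header itself so no bytes need to be skipped by hand. It also adds a quick sanity check on the result: a TrueType font program normally begins with the four bytes 00 01 00 00 (or the tag 'true'). The names `FontStreamInflater`, `Inflate` and `LooksLikeTrueType` are invented for this example.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class FontStreamInflater
{
    // Inflates /FlateDecode data; ZLibStream consumes the two-byte zlib header itself.
    public static byte[] Inflate(byte[] flateData)
    {
        using var input = new MemoryStream(flateData);
        using var zlib = new ZLibStream(input, CompressionMode.Decompress);
        using var output = new MemoryStream();
        zlib.CopyTo(output);
        return output.ToArray();
    }

    // True when the data starts with a TrueType sfnt version tag
    // (0x00010000, or the four characters 'true').
    public static bool LooksLikeTrueType(byte[] font)
    {
        if (font == null || font.Length < 4) return false;
        bool version1 = font[0] == 0x00 && font[1] == 0x01
                     && font[2] == 0x00 && font[3] == 0x00;
        bool appleTrue = font[0] == (byte)'t' && font[1] == (byte)'r'
                      && font[2] == (byte)'u' && font[3] == (byte)'e';
        return version1 || appleTrue;
    }
}
```

With the raw stream bytes in hand, `File.WriteAllBytes("Algerian.ttf", FontStreamInflater.Inflate(rawStreamBytes))` would write the font to disk (the variable and file name are placeholders). Keep in mind that a subset font extracted this way contains only the glyphs the PDF actually uses, so it may not behave like the full font.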

I hope this helps you, or helps somebody else.
