Can pdfbox extract vector images?


Question

As per my understanding:

1. .eps format images are vector images.
2. When we draw something in Word (like a flowchart), it is stored as a vector image.

I am almost sure about the first, not sure about the second. Please correct me if I am wrong.

Assuming these two things, when a LaTeX file (with .eps images inserted) or a Word file (containing vector images) is converted into PDF, do the images get converted into raster images?

Also, I think PDFBox/xpdf can only extract raster images from the PDF (as they are embedded as XObjects), not vector images. Is that understanding correct? This question on Stack Overflow is related, but has not been answered yet.

Recommended answer

Your point 1 is incorrect: EPS files are PostScript programs, and they may contain vector information, text, image data, or all of the above.

On point 2: in PDF there is no such thing as a 'vector image'. An image means a bitmap and therefore cannot be vector.

If you convert a PostScript program to a PDF file, the result depends entirely on the conversion program you use. In general, vectors will be retained as vectors, and text as text. However, it is entirely possible that an application might render the entire PostScript program and insert the result as an image in the PDF.

So the answer to your first question ("do the images get converted into raster images") is 'maybe, but probably not'.

I'm afraid I have no idea about the capabilities of PDFBox/xpdf. However, collections of vectors may not be arranged in any atomic fashion the way 'images' are (they could be held as Form XObjects, or Patterns), so there isn't any obvious way to know when to stop extracting. And what format would you store the result in anyway?
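
To make the "no atomic unit" point concrete, here is a minimal Python sketch that counts operators in a hand-written, already-decoded page content stream. It is not PDFBox or xpdf code. The names `summarize`, `SAMPLE`, and `/Im1` are hypothetical. The whitespace tokenizer assumes the stream contains no string literals, no inline images, and no compression, which real PDFs generally violate. The operator sets are the standard PDF path construction and path painting operators, plus `Do`, which invokes a named XObject.

```python
from collections import Counter

# Standard PDF content-stream operators (PDF 32000-1, "Graphics" chapter).
PATH_OPS = {"m", "l", "c", "v", "y", "h", "re"}                      # path construction
PAINT_OPS = {"S", "s", "f", "F", "f*", "B", "B*", "b", "b*", "n"}    # path painting

# Hypothetical decoded content stream: a stroked rectangle and a stroked
# line (vector drawing), followed by one XObject invocation.
SAMPLE = """
q
1 0 0 1 50 50 cm
0 0 100 60 re
S
10 10 m
90 50 l
S
Q
q
200 0 0 150 50 200 cm
/Im1 Do
Q
"""


def summarize(content):
    """Count vector operators and XObject invocations in a simplified stream.

    Assumption: tokens are whitespace-separated, with no string literals
    or inline image data in the stream.
    """
    tokens = content.split()
    counts = Counter()
    xobject_names = []
    for i, tok in enumerate(tokens):
        if tok in PATH_OPS or tok in PAINT_OPS:
            counts["vector"] += 1
        elif tok == "Do":
            counts["xobject"] += 1
            if i > 0:
                # The operand before Do is the XObject's resource name.
                xobject_names.append(tokens[i - 1])
    return counts, xobject_names


if __name__ == "__main__":
    counts, names = summarize(SAMPLE)
    print(dict(counts), names)
```

In this sample the image is a single named object (`/Im1`) that an extractor can look up and save, whereas the rectangle and the line are just a run of operators with nothing marking where one "drawing" ends and the next begins. Note also that `Do` alone does not say whether the name refers to an Image or a Form XObject; that has to be checked in the page's resource dictionary.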
