Reduce PDF file size of plots by filtering hidden objects

Question

When producing a scatter plot of many points in R (using ggplot(), for example), many of the points may lie behind others and not be visible at all. For instance, see the plot below:

This is a scatter plot of several hundred thousand points, most of which are hidden behind other points. The problem is that when the output is written to a vector file (a PDF, for example), the invisible points make the file very large and increase memory and CPU usage while viewing it.

A simple solution is to write the output to a bitmap image (TIFF or PNG, for example), but bitmaps lose the vector quality and can be even larger. I tried some online PDF compressors, but the result was the same size as my original file.

Is there a good solution? For example, is there some way to filter out the points that are not visible, either while generating the plot or afterwards by editing the PDF file?

Recommended answer

As a start, you can do something like this:

set.seed(42)
x <- runif(1e6)
DF <- data.frame(x = x, y = x + rnorm(1e6, sd = 0.1))
# Scatter plot of all one million points
plot(y ~ x, data = DF, pch = ".", cex = 4)

PDF size: 6334 KB

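# Round the coordinates to 3 decimals and keep the first point at each rounded position
DF2 <- data.frame(x = round(DF$x, 3), y = round(DF$y, 3))
DF2 <- DF[!duplicated(DF2), ]
nrow(DF2)
#[1] 373429
plot(y ~ x, data = DF2, pch = ".", cex = 4)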

PDF size: 2373 KB
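File sizes like the ones quoted can be checked by drawing each plot into a PDF device and reading the size of the resulting file. The following is only a minimal sketch, assuming the base `pdf()` device with its default settings; `pdf_size_kb` is a hypothetical helper name, and the absolute numbers will depend on the R version and device options.

```r
# Hypothetical helper: draw a plot into a temporary PDF and return its size in KB.
pdf_size_kb <- function(plot_fun) {
  path <- tempfile(fileext = ".pdf")
  pdf(path)
  plot_fun()
  dev.off()  # the file is complete only after the device is closed
  size <- file.size(path) / 1024
  unlink(path)
  size
}

# Small illustrative data sets (not the data from the answer)
set.seed(1)
small <- data.frame(x = runif(1000), y = runif(1000))
large <- data.frame(x = runif(20000), y = runif(20000))

size_small <- pdf_size_kb(function() plot(y ~ x, data = small, pch = ".", cex = 4))
size_large <- pdf_size_kb(function() plot(y ~ x, data = large, pch = ".", cex = 4))
```

Passing the plotting call as a function keeps the device handling in one place, so the same helper can be used to compare the plot before and after filtering.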

The number of digits used in round() controls how many points are removed: fewer digits merge more points into the same position. You only need to modify this to handle the different colours.
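One possible way to handle colours is to include the colour in the key passed to `duplicated()`, so that a point is dropped only when another point of the same colour already occupies the same rounded position. This is a sketch under assumptions, not the answer's own code: the column name `col`, the helper name `thin_points`, and the example data are all hypothetical. It keeps one point per colour per position rather than only the topmost one, so it is a conservative filter, not an exact hidden-point removal.

```r
# Hypothetical helper: keep one point per (rounded x, rounded y, colour) combination.
thin_points <- function(df, digits = 3, colour_col = "col") {
  key <- data.frame(
    x = round(df$x, digits),
    y = round(df$y, digits),
    col = df[[colour_col]]
  )
  df[!duplicated(key), , drop = FALSE]
}

# Illustrative data with an assumed colour column
set.seed(42)
n <- 1e5
x <- runif(n)
pts <- data.frame(
  x = x,
  y = x + rnorm(n, sd = 0.1),
  col = sample(c("red", "blue"), n, replace = TRUE),
  stringsAsFactors = FALSE
)

pts_thin <- thin_points(pts, digits = 2)
plot(pts_thin$x, pts_thin$y, pch = ".", cex = 4, col = pts_thin$col)
```

As in the answer, `digits` is the tuning knob: a smaller value removes more points at the cost of a coarser approximation of the original cloud.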
