Reduce PDF file size of plots by filtering hidden objects


Problem description

When producing scatter plots of many points in R (using ggplot(), for example), many of the points may lie behind others and not be visible at all. For instance, see the plot below:

This is a scatter plot of several hundred thousand points, but most of them are hidden behind other points. The problem is that when the output is written to a vector file (a PDF file, for example), the invisible points still make the file very large and increase memory and CPU usage while the file is being viewed.

A simple solution is to write the output to a bitmap image (TIFF or PNG, for example), but bitmaps lose the vector quality and can be even larger in size. I tried some online PDF compressors, but the result was the same size as my original file.

Is there a good solution? For example, is there some way to filter out the points that are not visible, either while generating the plot or afterwards by editing the PDF file?

Recommended answer

As a start, you can do something like this:

set.seed(42)
# One million points scattered around the diagonal y = x
x <- runif(1e6)
DF <- data.frame(x = x, y = x + rnorm(1e6, sd = 0.1))
plot(y ~ x, data = DF, pch = ".", cex = 4)

PDF size: 6334 KB

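# Round the coordinates to 3 decimal places, then keep only the first
# original point for each distinct rounded (x, y) pair
DF2 <- data.frame(x = round(DF$x, 3), y = round(DF$y, 3))
DF2 <- DF[!duplicated(DF2), ]
nrow(DF2)
#[1] 373429
plot(y ~ x, data = DF2, pch = ".", cex = 4)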

PDF size: 2373 KB
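
File sizes like the ones quoted here can be checked by writing each plot to a temporary PDF and reading the size back. The following is only a minimal sketch, not the answerer's code: `pdf_size_kb` is a hypothetical helper, and the sample size and rounding precision are scaled down (1e5 points, 2 decimal places) so that it runs quickly, so the sizes it reports will differ from the numbers above.

```r
# Hypothetical helper: draw the scatter plot into a temporary PDF
# and return the size of that file in KB.
pdf_size_kb <- function(df) {
  path <- tempfile(fileext = ".pdf")
  on.exit(unlink(path))            # remove the temporary file afterwards
  pdf(path)
  plot(y ~ x, data = df, pch = ".", cex = 4)
  dev.off()
  file.size(path) / 1024
}

set.seed(42)
n <- 1e5                           # assumption: smaller than the 1e6 used above
x <- runif(n)
DF <- data.frame(x = x, y = x + rnorm(n, sd = 0.1))

# Thin the data exactly as in the answer, but with coarser rounding
digits <- 2                        # assumption: chosen for this smaller sample
rounded <- data.frame(x = round(DF$x, digits), y = round(DF$y, digits))
DF2 <- DF[!duplicated(rounded), ]

size_full <- pdf_size_kb(DF)
size_thinned <- pdf_size_kb(DF2)
cat("rows:", nrow(DF), "->", nrow(DF2), "\n")
cat("PDF size (KB):", round(size_full), "->", round(size_thinned), "\n")
```

Comparing the two plots side by side at the intended output size is a reasonable way to confirm that the chosen precision has not visibly changed the picture.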

The rounding precision controls how many points are removed: fewer decimal places means more points are treated as duplicates and dropped. The only modification needed is to handle different colours.
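
One way to handle colours, offered here as an assumption rather than as part of the original answer, is to include the colour variable in the key passed to `duplicated()`, so that one point per colour survives in each rounded cell. In the sketch below the `group` column, its three levels, the sample size, and the rounding precision are all made up for illustration.

```r
set.seed(42)
n <- 1e5                                   # assumption: small sample for speed
x <- runif(n)
DF <- data.frame(
  x = x,
  y = x + rnorm(n, sd = 0.1),
  # hypothetical colour variable with three levels
  group = sample(c("a", "b", "c"), n, replace = TRUE)
)

# The key is the rounded position plus the colour group, so a point is
# dropped only if an earlier point has the same cell and the same group.
key <- data.frame(
  x = round(DF$x, 2),
  y = round(DF$y, 2),
  group = DF$group
)
DF2 <- DF[!duplicated(key), ]
nrow(DF2)

# Colour the remaining points by group (integer codes index the palette)
plot(DF2$x, DF2$y, pch = ".", cex = 4,
     col = as.integer(factor(DF2$group)))
```

Because the group is part of the key, no colour disappears entirely from a cell where it was present, at the cost of keeping up to one point per group per cell instead of one point per cell.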
