How can I create a 1000-page PDF faster?


Question

I need to plot more than 1000 pages to a PDF file using ggplot2 in R. Is there a faster way than the following code?

library(ggplot2)
data(diamonds)
pdf("name.pdf", width = 6, height = 6)
for(i in 1:1000) {
  p1 <- ggplot(diamonds, aes(x = carat,  y = price)) +
        geom_point()
  print(p1)
}
dev.off()

My actual case looks like this:

(1) Read a file and create a data.frame from the values on each line of it.

(2) Plot each line of that file to the PDF.

fa <- read.table(file)
pdf("name.pdf", width = 6, height = 4)
for(i in 1:nrow(fa)) {
  new.data <- function(i)
  p1 <- ggplot(new.data,...) + ...
  print(p1)
}
dev.off()

Solution

As commented above, speed is one of ggplot2's weaknesses. It takes some work but you can often replicate the appearance of a ggplot in one of the other standard plotting packages (base or lattice); e.g. this series of blog posts goes the other way (from lattice to ggplot), but the examples should be helpful. (@G.Grothendieck comments below that library(latticeExtra); xyplot(y ~ x, diamonds, par.settings = ggplot2like(), lattice.options = ggplot2like.opts()) will generate ggplot-like plots.)
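
A minimal sketch of that lattice route, writing a multi-page PDF with the ggplot2-like theme from latticeExtra. The helper name `lattice_pages` and its arguments are made up for illustration, and the theme is only applied when latticeExtra is installed; this is not a measured replacement for the ggplot2 loop.

```r
library(lattice)

# Hypothetical helper (not from the original answer): write `n_pages`
# lattice scatterplots of `formula` into a single multi-page PDF.
lattice_pages <- function(formula, data, file, n_pages = 20,
                          gg_theme = requireNamespace("latticeExtra",
                                                      quietly = TRUE)) {
  pdf(file, width = 6, height = 6)
  on.exit(dev.off())  # close the device even if a plot fails
  for (i in seq_len(n_pages)) {
    p <- if (gg_theme) {
      # ggplot2-like appearance, as suggested in the comment quoted above
      xyplot(formula, data,
             par.settings = latticeExtra::ggplot2like(),
             lattice.options = latticeExtra::ggplot2like.opts())
    } else {
      xyplot(formula, data)
    }
    print(p)  # lattice objects must be printed explicitly inside a loop
  }
  invisible(file)
}
```

With ggplot2 installed for the data set, a call such as `lattice_pages(price ~ carat, ggplot2::diamonds, "name.pdf", n_pages = 1000)` would mirror the loop in the question.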

If you were really desperate I suppose you could use parallel::parApply to generate a sensible number of separate PDFs and then use external tools such as pdftk to stitch them together ...
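
A rough sketch of that idea, under several assumptions: `pdftk` is installed and on the PATH, four workers is a sensible number for the machine, and the helper names (`split_pages`, `write_chunk`, `parallel_pdf`) and the `part_*.pdf` file names are invented for illustration. It uses `clusterMap` from the parallel package rather than `parApply`, since the work here is split by page ranges rather than by matrix rows. I have not timed it.

```r
library(parallel)

# Split page numbers 1..n_pages into at most n_chunks contiguous groups.
split_pages <- function(n_pages, n_chunks) {
  pages <- seq_len(n_pages)
  chunk_size <- ceiling(n_pages / n_chunks)
  split(pages, ceiling(pages / chunk_size))
}

# Runs on a worker: draw one ggplot2 page per element of `pages`.
write_chunk <- function(pages, file) {
  library(ggplot2)  # each worker needs its own copy attached
  pdf(file, width = 6, height = 6)
  on.exit(dev.off())
  for (i in pages) {
    print(ggplot(diamonds, aes(x = carat, y = price)) + geom_point())
  }
  file
}

# Write the chunks in parallel, then concatenate them with pdftk.
parallel_pdf <- function(n_pages, out = "name.pdf", n_chunks = 4) {
  page_sets <- split_pages(n_pages, n_chunks)
  part_files <- sprintf("part_%03d.pdf", seq_along(page_sets))
  cl <- makeCluster(length(page_sets))
  on.exit(stopCluster(cl))
  clusterMap(cl, write_chunk, page_sets, part_files)
  # Equivalent shell command: pdftk part_001.pdf ... cat output name.pdf
  status <- system2("pdftk", c(part_files, "cat", "output", shQuote(out)))
  unlink(part_files)  # remove the intermediate files
  invisible(status)
}
```

Calling `parallel_pdf(1000)` would then produce `name.pdf` from four 250-page parts. Starting the workers and merging the parts has its own cost, so whether this wins overall depends on how expensive each page is.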

Set up machinery to generate (approximately) the same plots in all three systems:

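 library("ggplot2")
 library("lattice")
 data(diamonds)
 # Each function draws one page and prints a progress character.
 gg_plot <- function() {
    cat(".")
    print(ggplot(diamonds, aes(x = carat,  y = price)) +
    geom_point())
 }
 # Note: x and y below are the diamonds columns of those names
 # (length and width in mm), not carat and price; the number of
 # points plotted is the same.
 base_plot <- function() {
    cat("+")
    plot(y~x,data=diamonds)
 }
 lattice_plot <- function() {
    cat("/")
    print(xyplot(y~x,data=diamonds))
 }
 # Write npages pages to a PDF with plotting function f,
 # then delete the file so repeated runs start clean.
 wrap <- function(f,npages=20,fn="name.pdf") {
    pdf(fn, width = 6, height = 6)
    for(i in 1:npages) {
       f()
    }
    dev.off()
    unlink(fn)
 }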

 library("rbenchmark")
 benchmark(wrap(gg_plot),wrap(base_plot),wrap(lattice_plot),
           replications=10)

OK, this was much slower than I expected (I cut it back to 20 pages per PDF and 10 replications). (I initially thought lattice won by a lot, but that's because I forgot to print() the results ...)

lattice is about twice as fast as ggplot, and base about 1.6 times as fast ...

                test replications elapsed relative user.self sys.self
2    wrap(base_plot)           10  75.693    1.249    74.053    1.596
1      wrap(gg_plot)           10 120.397    1.987   117.507    2.832
3 wrap(lattice_plot)           10  60.590    1.000    58.580    1.976
