How can I create a 1000-page pdf faster?
I need to plot more than 1000 pages to a PDF file using ggplot2
in R. Is there any faster way than the following code:
library(ggplot2)
data(diamonds)
pdf("name.pdf", width = 6, height = 6)
for (i in 1:1000) {
  p1 <- ggplot(diamonds, aes(x = carat, y = price)) +
    geom_point()
  print(p1)
}
dev.off()
My actual case is like this:
(1) read a file, and build a data.frame from the values in each of its lines;
(2) make one plot per line of that file, each on its own PDF page.
fa <- read.table(file)
pdf("name.pdf", width = 6, height = 4)
for (i in 1:nrow(fa)) {
  new.data <- f(i)  # pseudocode: f() builds a data.frame from row i
  p1 <- ggplot(new.data, ...) + ...
  print(p1)
}
dev.off()
Solution: As commented above, speed is one of ggplot2
's weaknesses. It takes some work, but you can often replicate the appearance of a ggplot in one of the other standard plotting packages (base or lattice); e.g. this series of blog posts goes the other way (from lattice to ggplot), but the examples should be helpful. (@G.Grothendieck comments below that library(latticeExtra); xyplot(price ~ carat, diamonds, par.settings = ggplot2like(), lattice.options = ggplot2like.opts())
will generate ggplot-like plots.)
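As a minimal sketch of that suggestion (the output file name is illustrative; assumes the latticeExtra package is installed):

```r
# ggplot2-styled lattice plot via latticeExtra's ggplot2like theme
library(ggplot2)       # only needed here for the diamonds data set
library(latticeExtra)  # loads lattice and provides ggplot2like()

pdf("name_lattice.pdf", width = 6, height = 6)
print(xyplot(price ~ carat, diamonds,
             par.settings = ggplot2like(),
             lattice.options = ggplot2like.opts()))
dev.off()
```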
If you were really desperate I suppose you could use parallel::parApply
to generate a sensible number of separate PDFs and then use external tools such as pdftk
to stitch them together ...
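A rough sketch of that idea (everything here is illustrative: the chunk counts and file names are made up, and it uses parallel::mclapply rather than parApply for brevity; mclapply forks, so on Windows keep mc.cores = 1 or switch to a PSOCK cluster, and pdftk must be installed separately):

```r
# Render chunks of pages to separate PDFs in parallel, then stitch
# them together afterwards with the external pdftk tool.
library(ggplot2)
library(parallel)

make_chunk <- function(k, pages_per_chunk = 5) {
  fn <- sprintf("chunk_%03d.pdf", k)  # illustrative file name
  pdf(fn, width = 6, height = 6)
  for (i in seq_len(pages_per_chunk)) {
    print(ggplot(diamonds, aes(x = carat, y = price)) + geom_point())
  }
  dev.off()
  fn
}

# 4 chunks x 5 pages = 20 pages here; scale up as needed
files <- unlist(mclapply(1:4, make_chunk,
                         mc.cores = if (.Platform$OS.type == "unix") 2 else 1))

# Stitch (requires pdftk on the PATH):
#   pdftk chunk_*.pdf cat output all_pages.pdf
system(paste("pdftk", paste(files, collapse = " "),
             "cat output all_pages.pdf"))
```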
Set up machinery to generate (approximately) the same plots in all three systems
library("ggplot2")
library("lattice")
data(diamonds)
gg_plot <- function() {
  cat(".")
  print(ggplot(diamonds, aes(x = carat, y = price)) +
          geom_point())
}
base_plot <- function() {
  cat("+")
  plot(price ~ carat, data = diamonds)
}
lattice_plot <- function() {
  cat("/")
  print(xyplot(price ~ carat, data = diamonds))
}
wrap <- function(f, npages = 20, fn = "name.pdf") {
  pdf(fn, width = 6, height = 6)
  for (i in 1:npages) {
    f()
  }
  dev.off()
  unlink(fn)
}
library("rbenchmark")
benchmark(wrap(gg_plot), wrap(base_plot), wrap(lattice_plot),
          replications = 10)
OK, this was much slower than I expected (I cut it back to 20 pages per PDF and 10 replications). (I initially thought lattice
won by a lot, but that's because I forgot to print()
the results ...)
lattice and base are both about twice as fast as ggplot ...
test replications elapsed relative user.self sys.self
2 wrap(base_plot) 10 75.693 1.249 74.053 1.596
1 wrap(gg_plot) 10 120.397 1.987 117.507 2.832
3 wrap(lattice_plot) 10 60.590 1.000 58.580 1.976
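One more ggplot2-specific tweak worth trying (not benchmarked above, and since most of the time goes into rendering each page, the gains may be modest): build the plot object once and substitute each page's data with ggplot2's %+% operator, instead of reconstructing the whole plot every iteration. A minimal sketch, using splits of diamonds as stand-in per-page data (newer ggplot2 releases may prefer other idioms for replacing a plot's data):

```r
# Build the ggplot template once; swap in new data for each page.
# %+% replaces a plot's data (the new data must contain the mapped columns).
library(ggplot2)

template <- ggplot(diamonds, aes(x = carat, y = price)) + geom_point()

pdf("pages.pdf", width = 6, height = 6)
for (d in split(diamonds, diamonds$cut)) {
  print(template %+% d)  # one page per subset of rows
}
dev.off()
```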