Automating great-circle map production in R
Problem description
I've taken some of the things I learned in a Flowing Data great circle mapping tutorial and combined them with code linked in the comments to prevent weird things from happening when R plots trans-equatorial great circles. That gives me this:
    airports <- read.csv("/home/geoff/Desktop/DissertationData/airports.csv", header=TRUE)
    flights <- read.csv("/home/geoff/Desktop/DissertationData/ATL.csv", header=TRUE, as.is=TRUE)

    library(maps)
    library(geosphere)

    checkDateLine <- function(l){
      n<-0
      k<-length(l)
      k<-k-1
      for (j in 1:k){
        n[j] <- l[j+1] - l[j]
      }
      n <- abs(n)
      m<-max(n, rm.na=TRUE)
      ifelse(m > 30, TRUE, FALSE)
    }

    clean.Inter <- function(p1, p2, n, addStartEnd){
      inter <- gcIntermediate(p1, p2, n=n, addStartEnd=addStartEnd)
      if (checkDateLine(inter[,1])){
        m1 <- midPoint(p1, p2)
        m1[,1] <- (m1[,1]+180)%%360 - 180
        a1 <- antipode(m1)
        l1 <- gcIntermediate(p1, a1, n=n, addStartEnd=addStartEnd)
        l2 <- gcIntermediate(a1, p2, n=n, addStartEnd=addStartEnd)
        l3 <- rbind(l1, l2)
        l3
      }
      else{
        inter
      }
    }

    # Unique months
    monthyear <- unique(flights$month)

    # Color
    pal <- colorRampPalette(c("#FFEA00", "#FF0043"))
    colors <- pal(100)

    for (i in 1:length(monthyear)) {
      png(paste("monthyear", monthyear[i], ".png", sep=""), width=750, height=500)
      map("world", col="#191919", fill=TRUE, bg="black", lwd=0.05)

      fsub <- flights[flights$month == monthyear[i],]
      fsub <- fsub[order(fsub$cnt),]
      maxcnt <- max(fsub$cnt)

      for (j in 1:length(fsub$month)) {
        air1 <- airports[airports$iata == fsub[j,]$airport1,]
        air2 <- airports[airports$iata == fsub[j,]$airport2,]
        p1 <- c(air1[1,]$long, air1[1,]$lat)
        p2 <- c(air2[1,]$long, air2[1,]$lat)
        inter <- clean.Inter(p1,p2,n=100, addStartEnd=TRUE)
        colindex <- round( (fsub[j,]$cnt / maxcnt) * length(colors) )
        lines(inter, col=colors[colindex], lwd=1.0)
      }

      dev.off()
    }
I'd like to automate the production of maps for a large dataset (dummy sample) containing all scheduled commercial routes shared between ATL and other airports in the global network (airports.csv is linked to in the Flowing Data post). Preferably, I'd produce one map per month that I would use as a frame in a short video depicting changes in the Atlanta airport network space.
The problem: I can't get the loop to produce any more than one PNG—from only the first unique month in each CSV—each time I run it. I'm fairly certain Aaron Hardin's code 'breaks' the automation as it is used in the Flowing Data tutorial. After three days of messing with it and chasing down any relevant R how-to's, I realize I simply lack the chops to reconcile one with the other. Can anybody help me automate the process?
There's a dissertation acknowledgement in it for you!
Solution
Too much information for a comment, so I am posting an answer instead. Here is what I think (read to the end to see what the problem could be):
I tried running your code on the original data from the Flowing Data tutorial. Obviously you have to add a column for monthly data, so I simply added a line that assigns a random month to each row:
    airports <- read.csv("http://datasets.flowingdata.com/tuts/maparcs/airports.csv", header=TRUE)
    flights <- read.csv("http://datasets.flowingdata.com/tuts/maparcs/flights.csv", header=TRUE, as.is=TRUE)

    # Add column with random data for month
    flights$month <- sample(month.abb[1:4], nrow(flights), replace=TRUE)
Whenever I have a loop that takes a long time to run, I generally stick a bit of code in there that gives me a progress check. Use what takes your fancy: print, cat, tcltk::tkProgressBar. I use message:

    for (i in 1:length(monthyear)) {
      message(i)
      #
      # your code here
      #
    }
Anyway, I then ran your code. Everything works exactly as it should. Since I sampled four months' worth of data, I get:
- The message with the current value of i, printed four times.
- Four png plots, each with a dark world map and bright yellow lines. Here is one of the four:
[Image: one of the four generated plots]
So, why does it work on my machine and not yours?
I can only guess, but my guess is that you haven't set the working directory. There is no setwd() in your code, and the call to png() gives only the file name. I suspect your files are being written to whatever the working directory happens to be on your system.
By default, on my installation, the working directory is:
getwd()
[1] "C:/Program Files/eclipse 3.7"
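To check where the PNG files actually ended up, something like the following may help. It is only a minimal sketch: the file-name pattern assumes the "monthyear" prefix used in the png() call from the question, and the variable names wd and pngs are made up for illustration.

```r
# Where is R currently writing relative file names?
wd <- getwd()
message("Working directory: ", wd)

# Look for files named like monthyear<month>.png in that directory.
# The pattern is an assumption based on the paste() call in the question.
pngs <- list.files(wd, pattern = "^monthyear.*\\.png$", full.names = TRUE)
print(pngs)
```

If the vector printed at the end lists one file per month, the loop worked and the files were simply saved somewhere unexpected.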
To solve this, do one of the following:
- Use setwd() to set your working directory at the top of your script.
- Or use the full path and file name in your call to png().
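As a rough sketch of the second option, the loop can build an absolute path for every output file before calling png(). The directory out_dir and the stand-in monthyear vector are assumptions made for illustration, and the plot() call is only a placeholder for the map() and lines() calls from the question.

```r
# Hypothetical output directory; substitute any folder you can write to.
out_dir <- file.path(tempdir(), "atl_maps")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

# Stand-in for unique(flights$month) from the question.
monthyear <- c("Jan", "Feb", "Mar", "Apr")

# One absolute file path per month, so the working directory no longer matters.
out_files <- file.path(out_dir, paste0("monthyear", monthyear, ".png"))

for (i in seq_along(monthyear)) {
  message("Writing ", out_files[i])
  png(out_files[i], width = 750, height = 500)
  # Placeholder plot; the map() and lines() calls from the question go here.
  plot(1:10, main = monthyear[i])
  dev.off()
}

# Check that every expected file was created.
file.exists(out_files)
```

Printing the full path with message() on each iteration doubles as the progress check and shows exactly where each frame is saved.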