Map with geom_bin2d overlay with additional stat info


Problem description


I am trying to reproduce something similar to this map using ggplot2. This is what I've done so far:


    load("mapdata.Rdata")
    > ls() # 2 datasets: "depth" for the basemap (geom_contour); "data" is used to construct geom_bin2d
    [1] "data"  "depth"
    > head(data)
        latitude longitude GRcounts
    740 67.20000 -57.83333        0
    741 67.11667 -57.80000        0
    742 67.10000 -57.93333        1
    743 67.06667 -57.80000        0
    751 67.15000 -58.15000        0
    762 67.18333 -58.15000        0
    # Each "+" must end a line; a line that starts with "+" is not joined to the plot
    ggplot(data = data, aes(x = longitude, y = latitude)) +
      theme_bw() +
      stat_bin2d(binwidth = c(0.5, 0.5)) +
      geom_contour(data = depth, aes(lon, lat, z = dn), colour = "black", bins = 5) +
      xlim(c(-67, -56)) + ylim(c(65, 71))

Which gives me this map:

The last step is to overlay circles on my geom_bin2d layer, with size proportional to the sum of the counts (GRcounts) within each bin.

Any tips on how to do so in ggplot (preferably) would be much appreciated.


Follow-up question: alignment mismatch between stat_bin2d and stat_summary2d when using facet_wrap

When I run the following code on the diamonds data set, there is no apparent problem. However, if I run the same code on my data, the points and the bins are misaligned. Any thoughts on what may cause this problem?

p<-ggplot(diamonds,aes(x =carat, y =price,colour=cut))+
   stat_summary2d(fun=sum,aes(z=depth,group=cut),bins=10)
p+facet_wrap(~cut)
df <- ggplot_build(p)$data[[1]]
summary(df)##now 5 groups, 1 panel
df$x<-with(df,(xmin+xmax)/2)
df$y<-with(df,(ymin+ymax)/2)
plot1<-ggplot(diamonds,aes(carat, price))+ stat_bin2d(bins=10)
plot1+geom_point(data=df,aes(x,y,size=value,group=group),color="red",shape=1)+facet_wrap(~group)

This is my R code and plot:

p<-ggplot(dat,aes(x =longitude, y =latitude,colour=SizeClass))+
   stat_summary2d(fun=sum,aes(z=GRcounts,group=SizeClass),bins=10)
p+facet_wrap(~SizeClass)
df <- ggplot_build(p)$data[[1]]
summary(df)##now 4 groups, 1 panel
df$x<-with(df,(xmin+xmax)/2)
df$y<-with(df,(ymin+ymax)/2)
plot1<-ggplot(dat,aes(longitude, latitude))+ stat_bin2d(bins=10)
plot1+geom_point(data=df,aes(x,y,size=value,group=group),color="red",shape=1)+facet_wrap(~group)

> head(dat[c(7,8,14,21)])###mydata
     latitude longitude GRcounts SizeClass
742  67.10000 -57.93333        1 (100,150)
784  67.21667 -57.95000        1 (100,150)
756  67.11667 -57.80000        1    (<100)
1233 68.80000 -59.55000        2 (100,150)
1266 68.68333 -59.60000        2 (100,150)
1288 68.66667 -59.65000        1 (100,150)

My data set can be downloaded here: data

Solution

As your dataset doesn't work on my computer, I will use the diamonds dataset as an example.

Make a new plot of your data with stat_summary2d(), set z= to the variable you want to sum (in your case GRcounts), and provide fun=sum to sum those values. Store the plot as an object.

p<-ggplot(diamonds,aes(carat,price))+stat_summary2d(fun=sum,aes(z=depth))

Use the function ggplot_build() to get the data used for the plot. The coordinates of the rectangles are in columns xmin, xmax, ymin and ymax, and the sums are in column value.

df <- ggplot_build(p)$data[[1]]
head(df)
     fill         xbin                ybin    value      ymax      ymin yint      xmax      xmin xint PANEL group
1 #55B1F7   [0.2,0.36]           [326,943] 641318.2  942.5667  326.0000    1 0.3603333 0.2000000    1     1     1
2 #1A3955   [0.2,0.36]      (943,1.56e+03]  75585.5 1559.1333  942.5667    2 0.3603333 0.2000000    1     1     1
3 #132B43   [0.2,0.36] (1.56e+03,2.18e+03]    415.8 2175.7000 1559.1333    3 0.3603333 0.2000000    1     1     1
4 #132B43   [0.2,0.36] (2.18e+03,2.79e+03]    304.4 2792.2667 2175.7000    4 0.3603333 0.2000000    1     1     1
5 #244D71 (0.36,0.521]           [326,943] 179486.8  942.5667  326.0000    1 0.5206667 0.3603333    2     1     1
6 #2D5F8A (0.36,0.521]      (943,1.56e+03] 271688.9 1559.1333  942.5667    2 0.5206667 0.3603333    2     1     1

For the points, calculate the x and y positions as the means of xmin, xmax and of ymin, ymax.

df$x<-with(df,(xmin+xmax)/2)
df$y<-with(df,(ymin+ymax)/2)
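
If it helps to see what those two steps amount to, here is a minimal base R sketch of the same idea: sum a variable within each 2D bin, then take the midpoint of each bin's edges. The data frame pts and its values are synthetic stand-ins (not your data), and the 0.5-degree breaks are my assumption taken from the binwidth in the question. ggplot2 picks its own bin edges, so these centres will not necessarily coincide with the ones it draws.

```r
# Synthetic stand-in for the real data (hypothetical values)
set.seed(1)
pts <- data.frame(
  longitude = runif(200, -67, -56),
  latitude  = runif(200, 65, 71),
  GRcounts  = rpois(200, 1)
)

# Bin edges every 0.5 degrees (assumption: mirrors binwidth = c(0.5, 0.5))
x_breaks <- seq(-67, -56, by = 0.5)
y_breaks <- seq(65, 71, by = 0.5)

# Index of the bin each point falls into along each axis
pts$xbin <- findInterval(pts$longitude, x_breaks, rightmost.closed = TRUE)
pts$ybin <- findInterval(pts$latitude,  y_breaks, rightmost.closed = TRUE)

# Sum GRcounts within each occupied (xbin, ybin) cell
bin_sums <- aggregate(GRcounts ~ xbin + ybin, data = pts, FUN = sum)

# Bin centre = mean of the lower and upper edge of the cell
bin_sums$x <- (x_breaks[bin_sums$xbin] + x_breaks[bin_sums$xbin + 1]) / 2
bin_sums$y <- (y_breaks[bin_sums$ybin] + y_breaks[bin_sums$ybin + 1]) / 2
```

Going through ggplot_build() instead has the advantage that the sums use exactly the bins ggplot2 chose for the plot.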

Use this new data frame to add points to your original plot made with stat_bin2d().

ggplot(diamonds,aes(carat,price))+stat_bin2d()+
  geom_point(data=df,aes(x=x,y=y,size=value),color="red",shape=1)
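
Since the question asks for circles proportional to the sum, one optional variation is to add scale_size_area(), which maps a value of 0 to size 0 and scales the point area with the value. This is a sketch on the diamonds data rather than something I tested on your map; max_size = 10 is an arbitrary choice of mine, and stat_summary_2d() / stat_bin_2d() are the current spellings of stat_summary2d() / stat_bin2d().

```r
library(ggplot2)

# Sum of depth per 2D bin, extracted from the built plot
p  <- ggplot(diamonds, aes(carat, price)) +
  stat_summary_2d(aes(z = depth), fun = sum)
df <- ggplot_build(p)$data[[1]]

# Bin centres as midpoints of the rectangle edges
df$x <- with(df, (xmin + xmax) / 2)
df$y <- with(df, (ymin + ymax) / 2)

# Same overlay as above, with circle area scaled to the bin sum
p_area <- ggplot(diamonds, aes(carat, price)) +
  stat_bin_2d() +
  geom_point(data = df, aes(x = x, y = y, size = value),
             colour = "red", shape = 1) +
  scale_size_area(max_size = 10)  # max_size is an arbitrary illustrative value
```

Printing p_area draws the plot.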

UPDATE - solution with faceting

To use facet_wrap() and combine stat_bin2d() with points, you need a workaround, as there seems to be a problem when the two are combined directly.

First, create two plots: one for the sums with stat_summary2d() and one for the counts with stat_bin2d(). Both plots should be faceted.

plot1 <- ggplot(dat,aes(x =longitude, y =latitude))+
  stat_summary2d(fun=sum,aes(z=GRcounts),bins=10)+facet_wrap(~SizeClass)

plot2 <- ggplot(dat,aes(longitude, latitude))+ stat_bin2d(bins=10)+
    facet_wrap(~SizeClass)

Now extract the data from both plots using ggplot_build() and store them as objects. For the sums data frame (df1), calculate the x and y coordinates as in the example above.

df1 <- ggplot_build(plot1)$data[[1]]
# Midpoints must be computed from df1 itself, not from the earlier df
df1$x<-with(df1,(xmin+xmax)/2)
df1$y<-with(df1,(ymin+ymax)/2)

df2<-ggplot_build(plot2)$data[[1]]

Now plot your data using those new data frames: df1 for the points and df2 for the rectangles. With geom_rect() you get rectangles whose fill= depends on count. For faceting, use the column PANEL.

ggplot()+geom_rect(data=df2,aes(xmin=xmin,xmax=xmax,
                          ymin=ymin,ymax=ymax,fill=count))+
  geom_point(data=df1,aes(x=x,y=y,size=value),shape=1,color="red")+
  facet_wrap(~PANEL)
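
Because I can't run this on your data, here is a self-contained sketch of the same workaround on diamonds, faceted by cut. It also adds one step that is not in the code above: PANEL only holds panel numbers, so the facet strips would read 1 to 5. The relabelling assumes that facet_wrap() numbers panels in the order of the factor levels, and the column name cut_label is hypothetical.

```r
library(ggplot2)

# Sums of depth per bin, one panel per cut
plot1 <- ggplot(diamonds, aes(carat, price)) +
  stat_summary_2d(aes(z = depth), fun = sum, bins = 10) +
  facet_wrap(~cut)

# Counts per bin, faceted the same way
plot2 <- ggplot(diamonds, aes(carat, price)) +
  stat_bin_2d(bins = 10) +
  facet_wrap(~cut)

df1 <- ggplot_build(plot1)$data[[1]]
df1$x <- with(df1, (xmin + xmax) / 2)
df1$y <- with(df1, (ymin + ymax) / 2)
df2 <- ggplot_build(plot2)$data[[1]]

# Assumption: panel i corresponds to the i-th level of cut
panel_to_cut <- function(panel) {
  factor(levels(diamonds$cut)[as.integer(as.character(panel))],
         levels = levels(diamonds$cut))
}
df1$cut_label <- panel_to_cut(df1$PANEL)
df2$cut_label <- panel_to_cut(df2$PANEL)

combined <- ggplot() +
  geom_rect(data = df2, aes(xmin = xmin, xmax = xmax,
                            ymin = ymin, ymax = ymax, fill = count)) +
  geom_point(data = df1, aes(x = x, y = y, size = value),
             shape = 1, colour = "red") +
  facet_wrap(~cut_label)
```

If the panel order assumption does not hold for your data, facet on PANEL as shown above instead.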
