Density count in heatmaps


Question


I have a problem with my heatmap, which displays the density LEVEL, but doesn't say anything about the density count. (how many points are in the same area for example).

My data is divided in more columns, but the most important ones are: lat,lon.

I would like to have something like this, but with a "count": https://stackoverflow.com/a/24615674/5316566. However, when I apply the code from that answer, my maximum "level" does not reflect my density count. (Instead of 7500 I get, for example, 6, even though I have thousands and thousands of points concentrated in one area.) This is my code:

us_map_g_str <- get_map(location = c(-90.0,41.5,-81.0,42.7), zoom = 7)
ggmap(us_map_g_str, extent = "device") + 
geom_tile(data = data1, aes(x = as.numeric(lon), y = as.numeric(lat)), size = 0.3) + 
stat_density2d(data = data1, aes(x = as.numeric(lon), y = as.numeric(lat), fill = ..level.., alpha = ..level..), size = 0.3, bins = 10, geom = "polygon") + 
scale_fill_gradient(name= "Ios",low = "green", high = "red", trans= "exp") + 
scale_alpha(range = c(0, 0.3), guide = FALSE)

This is what I get:

This is part of the data:

  lat       lon       tag  device
1 43.33622 -83.67445   0 iPhone5
2 43.33582 -83.69964   0 iPhone5
3 43.33623 -83.68744   0 iPhone5
4 43.33584 -83.72186   0 iPhone5
5 43.33616 -83.67526   0 iPhone5
6 43.25040 -83.78234   0 iPhone5

(The "tag" column is not important)

Solution

REVISED

I realised that my previous answer needs to be revised. So, here it is. If you want to find out how many data points exist in each level of a contour, you actually have a lot of things to do. If you are happy to use the leaflet option below, your life would be much easier.

First, let's get a map of Detroit, and create a sample data frame.

library(dplyr)
library(ggplot2)
library(ggmap)
library(sp)      # Polygon(), SpatialPolygons(), SpatialPointsDataFrame(), CRS()

mymap <- get_map(location = "Detroit", zoom = 8)

### Create sample data
set.seed(123)
mydata <- data.frame(long = runif(min = -84, max = -82.5, n = 100),
                     lat = runif(min = 42, max = 42.7, n = 100))

Now, we draw a map and save it as g.

g <- ggmap(mymap) +
     stat_density2d(data = mydata,
                    aes(x = long, y = lat, fill = ..level..),
                    size = 0.5, bins = 10, geom = "polygon")

The real job begins here. To find the number of data points in each level, use the data frame that ggplot generates when it builds the plot. This data frame holds the polygon vertices that are used to draw the level lines. As a check, the code below draws three of the levels on the map.

### Create a data frame so that we can find how many data points exist
### in each level.

mydf <- ggplot_build(g)$data[[4]]

### Check where the polygon lines are. This is just for a check.

check <- ggmap(mymap) +
         geom_point(data = mydata, aes(x = long, y = lat)) +
         geom_path(data = subset(mydf, group == "1-008"), aes(x = x, y = y)) +
         geom_path(data = subset(mydf, group == "1-009"), aes(x = x, y = y)) +
         geom_path(data = subset(mydf, group == "1-010"), aes(x = x, y = y)) 
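The structure of that layer data can be inspected without downloading any map tiles. The following is a minimal sketch, assuming a recent ggplot2 (3.3.0 or later, for after_stat()); it uses an empty ggplot() instead of ggmap(), so the density layer is layer 1 rather than layer 4, and the names pts, p and built are only illustrative.

```r
library(ggplot2)

# Sample points in the same range as mydata above.
set.seed(123)
pts <- data.frame(long = runif(100, min = -84, max = -82.5),
                  lat  = runif(100, min = 42,  max = 42.7))

# Same density layer, drawn on an empty plot so it is the only layer.
p <- ggplot() +
  stat_density_2d(data = pts,
                  aes(x = long, y = lat, fill = after_stat(level)),
                  bins = 10, geom = "polygon")

# layer_data() returns the data frame ggplot computed for one layer.
built <- layer_data(p, 1)

# One row per polygon vertex: `group` identifies the contour polygon,
# `level` is the estimated density at that contour.
head(built[, c("level", "x", "y", "group")])
length(unique(built$group))
```

Each value of group is one closed contour, which is why the rest of the answer splits the data frame by group.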

The next step is to create a level vector for the legend. We group the data by group (e.g., 1-010) and take the first row of each group using slice(). Then we ungroup the data, select the 2nd column, which holds the level, and turn it into a vector with unlist(). We come back to lev at the end.

mydf %>%
group_by(group) %>%
slice(1) %>%
ungroup %>%
select(2) %>%
unlist -> lev
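Selecting by position depends on the column order of the built data, which may differ between ggplot2 versions. As a hedged alternative, the same vector can be pulled out by column name. The sketch below runs on a small hypothetical stand-in (toy) rather than on the real mydf, and assumes the columns are called group and level.

```r
library(dplyr)

# Toy stand-in for the built layer data: two polygons, three vertices each.
toy <- data.frame(
  level = c(0.2, 0.2, 0.2, 0.5, 0.5, 0.5),
  x     = c(0, 1, 0, 0.2, 0.6, 0.2),
  y     = c(0, 0, 1, 0.2, 0.2, 0.6),
  group = c("1-001", "1-001", "1-001", "1-002", "1-002", "1-002")
)

# One level per polygon, selected by name instead of by position.
lev_by_name <- toy %>%
  distinct(group, level) %>%
  pull(level)

lev_by_name
```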

Now we split the polygon data (i.e., mydf) by group and create a polygon for each level. Since we have 11 levels (11 polygons), we use lapply(). Inside the lapply() call we need to: 1) extract the longitude and latitude columns, 2) create a polygon, 3) convert the polygon to spatial polygons, 4) assign a CRS, 5) create a dummy data frame, and 6) create a SpatialPolygonsDataFrame.

mylist <- split(mydf, f = mydf$group)

test <- lapply(mylist, function(x){

              xy <- x[, c(3,4)]

              circle <- Polygon(xy, hole = as.logical(NA))

              SP <- SpatialPolygons(list(Polygons(list(circle), ID = "1")))

              proj4string(SP) <- CRS("+proj=longlat +ellps=WGS84")

              df <- data.frame(value = 1, row.names = "1")

              circleDF <- SpatialPolygonsDataFrame(SP, data = df)

            })

Now we go back to the original data. We need to convert the data frame to a SpatialPointsDataFrame, because we have to subset the data and find how many data points fall in each polygon (i.e., in each level). First, get long and lat from your data frame. Make sure that the column order is lon/lat.

xy <- mydata[,c(1,2)]

Then, we create the SPDF (SpatialPointsDataFrame). The spatial polygons and the spatial points must have an identical proj4string.

spdf <- SpatialPointsDataFrame(coords = xy, data = mydata,
                               proj4string = CRS("+proj=longlat +ellps=WGS84"))

Then, we subset data (mydata) using each polygon.

ana <- lapply(test, function(y){

              mydf <- as.data.frame(spdf[y, ])

            })
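The idea of counting the points that fall inside one polygon can be tried in isolation. This is only a rough stand-in for the spatial subsetting above: it uses sp::point.in.polygon(), which does a planar test on plain coordinate vectors and ignores the CRS. The square and the four points are made up for illustration.

```r
library(sp)

# Hypothetical square polygon (closed ring) and four test points.
pol_x <- c(0, 4, 4, 0, 0)
pol_y <- c(0, 0, 4, 4, 0)
px <- c(0.5, 2, 2.5, 5)
py <- c(0.5, 2, 2.5, 5)

# Result codes: 0 = outside, 1 = strictly inside, 2 = on an edge, 3 = on a vertex.
status <- point.in.polygon(px, py, pol_x, pol_y)

# Count everything that is not outside.
n_inside <- sum(status > 0)
n_inside
```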

Data points overlap across levels, so we have duplicates: a point inside an inner polygon also lies inside the polygons that enclose it. First we find the data points unique to each level. We bind the data frames of the inner levels in ana into one data frame, foo1. We also create foo2, the data frame for the level whose unique points we want to count. We make sure that the column names are identical between foo1 and foo2. Using setdiff() and nrow(), we get the number of data points unique to each level.

total <- lapply(11:2, function(x){

                foo1 <- bind_rows(ana[c(11:x)])
                foo2 <- as.data.frame(ana[x-1])
                names(foo2) <- names(foo1)
                nrow(setdiff(foo2, foo1))               
              })

Finally, we need the number of data points in the innermost level, which is level 11. We take the data frame for level 11 in ana and count its rows.

 bob <- nrow(as.data.frame(ana[11]))
 out <- c(bob,unlist(total))

 ### check if total is 100
 ### sum(out)
 ### [1] 100
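The arithmetic behind these per-level counts can be shown with plain numbers. The sketch below assumes the contour polygons are strictly nested (a single peak), in which case the count unique to a band is just the difference between two cumulative counts; the setdiff() approach above does not need that assumption. The counts in inside are hypothetical.

```r
# Hypothetical cumulative counts: points inside each contour polygon,
# ordered from the outermost polygon to the innermost one.
inside <- c(outer = 10, middle = 6, inner = 3)

# Points unique to a band = points inside this polygon minus points inside
# the next polygon inward; the innermost polygon keeps its own count.
band <- inside - c(inside[-1], 0)

band
sum(band)   # equals the count for the outermost polygon
```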

We assign reversed out as names for lev. This is because we want to show how many data points exist for each level in a legend.

 names(lev) <- rev(out)

Now we are ready to add a legend.

 final <- g +
          scale_fill_continuous(name = "Total",
                                guide = guide_legend(),
                                breaks = lev)

 final

LEAFLET OPTION

If you use the leaflet package, you can cluster data points at different zoom levels. Leaflet counts the data points in an area and shows the number in a circle. The more you zoom in, the more leaflet breaks the data points up into smaller groups. In terms of workload, this is much lighter. In addition, your map is interactive. This may be a better option.

library(leaflet)
leaflet(mydata) %>%
addTiles() %>%
addMarkers(clusterOptions = markerClusterOptions())
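Leaflet tries to guess the coordinate columns from their names. A slightly more explicit variant, sketched here under the assumption that the coordinates are stored in columns named long and lat, passes them as formulas so that nothing is left to guessing. The names pts and m are illustrative.

```r
library(leaflet)

# Sample points in the same range as mydata above.
set.seed(123)
pts <- data.frame(long = runif(100, min = -84, max = -82.5),
                  lat  = runif(100, min = 42,  max = 42.7))

# Name the coordinate columns explicitly instead of relying on guessing.
m <- leaflet(pts) %>%
  addTiles() %>%
  addMarkers(lng = ~long, lat = ~lat,
             clusterOptions = markerClusterOptions())

m   # printing the widget opens the interactive map
```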
