Adding a polygon to a scatter plotly while retaining the hover info


Question


I have 5 clusters of x,y data I'm plotting using R's plotly.

Here are the data:

set.seed(1)
df <- do.call(rbind,lapply(seq(1,20,4),function(i) data.frame(x=rnorm(50,mean=i,sd=1),y=rnorm(50,mean=i,sd=1),cluster=i)))

Here's their plotly scatter plot:

library(plotly)
clusters.plot <- plot_ly(marker=list(size=10),type='scatter',mode="markers",x=~df$x,y=~df$y,color=~df$cluster,data=df) %>% hide_colorbar() %>% layout(xaxis=list(title="X",zeroline=F),yaxis=list(title="Y",zeroline=F))

Which gives:

Then, following @Marco Sandri's answer, I add polygons circumscribing these clusters using this code:

Polygons code:

library(data.table)
library(grDevices)

splinesPolygon <- function(xy,vertices,k=3, ...)
{
  # Assert: xy is an n by 2 matrix with n >= k.
  # Wrap k vertices around each end.
  n <- dim(xy)[1]
  if (k >= 1) {
    data <- rbind(xy[(n-k+1):n,], xy, xy[1:k, ])
  } else {
    data <- xy
  }
  # Spline the x and y coordinates.
  data.spline <- spline(1:(n+2*k), data[,1], n=vertices, ...)
  x <- data.spline$x
  x1 <- data.spline$y
  x2 <- spline(1:(n+2*k), data[,2], n=vertices, ...)$y
  # Retain only the middle part.
  cbind(x1, x2)[k < x & x <= n+k, ]
}

clustersPolygon <- function(df)
{
  dt <- data.table::data.table(df)
  hull <- dt[,.SD[chull(x,y)]]
  spline.hull <- splinesPolygon(cbind(hull$x,hull$y),100)
  return(data.frame(x=spline.hull[,1],y=spline.hull[,2],stringsAsFactors=F))
}

library(dplyr)
polygons.df <- do.call(rbind,lapply(unique(df$cluster),function(l)
  clustersPolygon(df=dplyr::filter(df,cluster == l)) %>%
    dplyr::rename(polygon.x=x,polygon.y=y) %>%
    dplyr::mutate(cluster=l)))

And now adding the polygons:

clusters <- unique(df$cluster)

for(l in clusters) clusters.plot <- clusters.plot %>% 
 add_polygons(x=dplyr::filter(polygons.df,cluster == l)$polygon.x,
              y=dplyr::filter(polygons.df,cluster == l)$polygon.y,
              line=list(width=2,color="black"),
              fillcolor='transparent', inherit = FALSE)

Which gives:

Although this works great, it unfortunately eliminates the hoverinfo that existed before the polygons were added; hovering now shows only the trace of each polygon.

Changing inherit from FALSE to TRUE results in the error I describe in that post. So my question is: how can I add the polygons without changing the hoverinfo of the original plot?

Answer

I think part of the issue here is that the colorbar in plotly behaves oddly and has side effects when you start to mix and match trace types.

The simplest way to work around this (and it seems appropriate since you are coloring by clusters, not a continuous variable) is to change the class of your clustered column to be an ordered factor with the expression df$cluster <- ordered(as.factor(df$cluster)). (I believe this could be in a dplyr mutate statement as well.)
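
As a minimal sketch of the dplyr variant mentioned above (assuming the same simulated df as in the question), the conversion could be written inside a mutate call. The checks at the end are only there to illustrate what the column looks like afterwards:

library(dplyr)

# Same simulated data as in the question
set.seed(1)
df <- do.call(rbind, lapply(seq(1, 20, 4), function(i)
  data.frame(x = rnorm(50, mean = i, sd = 1),
             y = rnorm(50, mean = i, sd = 1),
             cluster = i)))

# Convert the numeric cluster id to an ordered factor inside mutate()
df <- df %>% dplyr::mutate(cluster = ordered(as.factor(cluster)))

# The column is now discrete, so plotly should not map it to a colorbar
is.ordered(df$cluster)
levels(df$cluster)
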

Packages and data generation functions


library(data.table)
library(grDevices)
library(dplyr)
library(plotly)

## Function Definitions 
splinesPolygon <- function(xy,vertices,k=3, ...) {
  # Assert: xy is an n by 2 matrix with n >= k.
  # Wrap k vertices around each end.
  n <- dim(xy)[1]
  if (k >= 1) {
    data <- rbind(xy[(n-k+1):n,], xy, xy[1:k, ])
  } else {
    data <- xy
  }
  # Spline the x and y coordinates.
  data.spline <- spline(1:(n+2*k), data[,1], n=vertices, ...)
  x <- data.spline$x
  x1 <- data.spline$y
  x2 <- spline(1:(n+2*k), data[,2], n=vertices, ...)$y
  # Retain only the middle part.
  cbind(x1, x2)[k < x & x <= n+k, ]
}

clustersPolygon <- function(df) {
  dt <- data.table::data.table(df)
  hull <- dt[,.SD[chull(x,y)]]
  spline.hull <- splinesPolygon(cbind(hull$x,hull$y),100)
  return(data.frame(x=spline.hull[,1],y=spline.hull[,2],stringsAsFactors=F))
}

Generate Data


The one critical difference here is that cluster is defined as an ordered factor, which keeps plotly from treating it as a continuous variable and bringing the colorbar into play.

set.seed(1)
df <- do.call(rbind,lapply(seq(1,20,4),function(i) data.frame(x=rnorm(50,mean=i,sd=1),y=rnorm(50,mean=i,sd=1),cluster=i)))

## Critical Step here: Make cluster an ordered factor so it will
## be plotted with the sequential viridis scale, but will not 
## be treated as a continuous spectrum that gets the colorbar involved
df$cluster <- ordered(as.factor(df$cluster))

## Make hull polygons
polygons.df <- do.call(rbind,lapply(unique(df$cluster),function(l)
  clustersPolygon(df=dplyr::filter(df,cluster == l)) %>%
    dplyr::rename(polygon.x=x,polygon.y=y) %>%
    dplyr::mutate(cluster=l)))
clusters <- unique(df$cluster)

Build a plotly object


This is mostly the same as before, except that it starts by initializing an empty plotly object and adds the hull polygons before the raw data points.

## Initialize an empty plotly object so that the hulls can be added first
clusters.plot <- plot_ly()

## Add hull polygons sequentially
for(l in clusters) clusters.plot <- clusters.plot %>% 
  add_polygons(x=dplyr::filter(polygons.df,cluster == l)$polygon.x,
               y=dplyr::filter(polygons.df,cluster == l)$polygon.y,
               name = paste0("Cluster ",l),
               line=list(width=2,color="black"),
               fillcolor='transparent', 
               hoverinfo = "none",
               showlegend = FALSE,
               inherit = FALSE)  

## Add the raw data trace
clusters.plot <- clusters.plot %>% 
  add_trace(data=df, x= ~x,y= ~y,color= ~cluster,
            type='scatter',mode="markers",
            marker=list(size=10)) %>% 
  layout(xaxis=list(title="X",
                    zeroline=F),
         yaxis=list(title="Y",
                    zeroline=F))
## Print the output
clusters.plot

Gives the following output
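
To check that the polygons really stay out of the hover, one option is to build the plot and inspect each trace. The sketch below is self-contained and uses a hypothetical toy data set (two clusters, plain convex hulls instead of the splined hulls) plus an explicit text label on the markers. The toy data and the label are illustrative assumptions, not part of the original answer:

library(plotly)

# Hypothetical toy data: two small clusters labelled "A" and "B"
set.seed(1)
toy <- data.frame(
  x = c(rnorm(20, mean = 1), rnorm(20, mean = 5)),
  y = c(rnorm(20, mean = 1), rnorm(20, mean = 5)),
  cluster = ordered(rep(c("A", "B"), each = 20))
)

# Hull outlines first, with hover disabled on each polygon trace
p <- plot_ly()
for (l in levels(toy$cluster)) {
  pts <- toy[toy$cluster == l, ]
  hull <- pts[chull(pts$x, pts$y), ]
  p <- p %>%
    add_polygons(x = hull$x, y = hull$y,
                 line = list(width = 2, color = "black"),
                 fillcolor = "transparent",
                 hoverinfo = "none", showlegend = FALSE, inherit = FALSE)
}

# Markers last, with an explicit hover label per point
p <- p %>%
  add_trace(data = toy, x = ~x, y = ~y, color = ~cluster,
            type = "scatter", mode = "markers",
            marker = list(size = 10),
            text = ~paste0("Cluster ", cluster),
            hoverinfo = "text")

# Inspect the hoverinfo setting of every built trace
built <- plotly_build(p)
hover <- vapply(built$x$data, function(tr) {
  if (is.null(tr$hoverinfo)) NA_character_ else as.character(tr$hoverinfo[1])
}, character(1))
print(hover)

The polygon traces should report "none", so only the marker traces respond to the cursor. Dropping the text and hoverinfo arguments from add_trace should fall back to plotly's default hover for the markers.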

