How to make a googleVis multiple Sankey from a data.frame?


Question

I am aiming to make a multiple (multi-level) Sankey diagram in R using the googleVis package, with flows that pass through a middle set of nodes on the way from source to destination (the example image of the desired output is not reproduced here).

I've created some dummy data in R:

set.seed(1)

# N.B. Each level needs its own labels to avoid a cycle: mid labels carry one
# trailing space and destination labels carry two, so no node name is reused.
source <- sample(c("North","South","East","West"),100,replace=TRUE)
mid <- sample(c("North ","South ","East ","West "),100,replace=TRUE)
destination <- sample(c("North  ","South  ","East  ","West  "),100,replace=TRUE)
dummy <- rep(1,100) # For aggregation

dat <- data.frame(source,mid,destination,dummy)
aggdat <- aggregate(dummy~source+mid+destination,dat,sum)

What I've tried so far

I can build a Sankey with two variables without trouble when I have just a source and a destination, but not when there is a middle point:

aggdat <- aggregate(dummy~source+destination,dat,sum)

library(googleVis)

p <- gvisSankey(aggdat,from="source",to="destination",weight="dummy")
plot(p)

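This code produces a two-level Sankey from source to destination (image not reproduced here).

How can I modify

p <- gvisSankey(aggdat,from="source",to="destination",weight="dummy")

so that it also accepts the mid variable?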

Recommended answer

The function gvisSankey does accept mid-levels, but they have to be coded in the underlying data: every link, whether from source to mid or from mid to destination, is one row with a "from" node, a "to" node and a weight.
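As a minimal sketch of what "coded in the underlying data" means, assuming invented node labels (A, B, M, X, Y are illustrative and not taken from the question), a middle node is simply a label that appears in the `to` column of one row and in the `from` column of another:

```r
# Hand-written link table: A and B flow into the middle node M,
# and M flows on to X and Y. Labels and weights are invented.
links <- data.frame(
  from   = c("A", "B", "M", "M"),
  to     = c("M", "M", "X", "Y"),
  weight = c(3, 2, 4, 1),
  stringsAsFactors = FALSE
)

# Build the chart only if googleVis is installed; the table is plain base R.
if (requireNamespace("googleVis", quietly = TRUE)) {
  p <- googleVis::gvisSankey(links, from = "from", to = "to", weight = "weight")
  # plot(p)  # opens the chart in a browser
}
```

The rest of this answer builds the same kind of table from the dummy data instead of typing it by hand.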

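First, recreate the dummy data with distinct labels for each level:

 set.seed(1)

 source <- sample(c("NorthSrc", "SouthSrc", "EastSrc", "WestSrc"), 100, replace=TRUE)
 mid <- sample(c("NorthMid", "SouthMid", "EastMid", "WestMid"), 100, replace=TRUE)
 destination <- sample(c("NorthDes", "SouthDes", "EastDes", "WestDes"), 100, replace=TRUE)
 dummy <- rep(1,100) # For aggregation

 dat <- data.frame(source, mid, destination, dummy) # used by the reshaping steps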

Now, we'll reshape the original data:

 library(dplyr)

 datSM <- dat %>%
  group_by(source, mid) %>%
  summarise(toMid = sum(dummy) ) %>%
  ungroup()

Data frame datSM summarises the number of units flowing from source to mid.

  datMD <- dat %>%
   group_by(mid, destination) %>%
   summarise(toDes = sum(dummy) ) %>%
   ungroup()

Data frame datMD summarises the number of units flowing from mid to destination. It will be appended to datSM to form the final data frame, so both data frames need to be ungrouped and have the same column names.

  colnames(datSM) <- colnames(datMD) <- c("From", "To", "Dummy")

Once datMD is appended, each mid label appears as the "To" node in the first block of rows and as the "From" node in the second, so gvisSankey will recognise the middle step automatically.
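One way to see which labels will act as the middle step is to look for nodes that occur in both columns. The sketch below uses a small invented link table (not the data above) and base R set operations; `node_roles` is a hypothetical helper name, not part of googleVis.

```r
# Hypothetical helper: classify nodes by where they occur in a link table.
node_roles <- function(links, from = "From", to = "To") {
  from_nodes <- unique(as.character(links[[from]]))
  to_nodes   <- unique(as.character(links[[to]]))
  list(
    start  = setdiff(from_nodes, to_nodes),    # only ever the origin of a link
    middle = intersect(from_nodes, to_nodes),  # receives links and sends links
    end    = setdiff(to_nodes, from_nodes)     # only ever the target of a link
  )
}

# Invented example: S1 and S2 feed M1 and M2, which both feed D1.
toy <- data.frame(
  From  = c("S1", "S2", "M1", "M2"),
  To    = c("M1", "M2", "D1", "D1"),
  Dummy = c(2, 3, 2, 3)
)

roles <- node_roles(toy)
print(roles$middle)
```

If a label shows up under `middle` unexpectedly, for example because the source and destination levels share names, the links can loop back on themselves, which is the cycle the question warns about.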

  library(googleVis)

  datVis <- rbind(datSM, datMD)

  # weight must match the column name set above ("Dummy", not "dummy")
  p <- gvisSankey(datVis, from="From", to="To", weight="Dummy")
  plot(p)

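This produces a Sankey plot with the mid nodes placed between the source and destination nodes (image not reproduced here).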
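The same reshaping can be done without dplyr and for any number of levels. The following is a sketch in base R, not the answer's own method: `make_links` and its arguments are hypothetical names, and the toy data is invented. It aggregates each consecutive pair of level columns and stacks the results into one From/To/Weight table.

```r
# Hypothetical helper: build a link table from consecutive pairs of
# level columns, e.g. source -> mid and mid -> destination.
make_links <- function(dat, level_cols, weight) {
  pieces <- lapply(seq_len(length(level_cols) - 1), function(i) {
    agg <- aggregate(
      dat[[weight]],
      by  = list(From = dat[[level_cols[i]]], To = dat[[level_cols[i + 1]]]),
      FUN = sum
    )
    names(agg)[3] <- "Weight"  # third column holds the summed weights
    agg
  })
  do.call(rbind, pieces)
}

# Invented three-level data with distinct labels per level.
toy <- data.frame(
  source      = c("S1", "S1", "S2"),
  mid         = c("M1", "M2", "M1"),
  destination = c("D1", "D1", "D2"),
  dummy       = c(1, 1, 1)
)

links <- make_links(toy, c("source", "mid", "destination"), "dummy")
print(links)
```

The resulting table can then be passed to gvisSankey with from="From", to="To" and weight="Weight", in the same way as datVis above.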
