Create a chart from different data

Views: 104. Published: 2018/4/25 21:32:25. Tags: r, plot, ggplot2, igraph

Problem description

I need help creating a chart. Let me explain.

I created 10 random graphs, each with N nodes, for N = 10^3, 10^4 and 10^5. So 30 graphs in total.

For each of them I computed the percentage of multilinks and self-loops it has.

Now I would like to create a single chart that shows these percentages as a function of the number of nodes. Something like this:

[Image: sketch of the desired chart]

So I have 3 lists:
- listNets, containing the 30 graphs
- listSelf, containing the percentage of self-loops
- listMul, containing the percentage of multilinks

This is what I did:

    listN <- c((10^3), (10^4), (10^5))

    # list of networks
    listNets <- vector(mode = "list", length = 0)
    # list of percentages of self-loops
    listSelf <- vector(mode = "list", length = 0)
    # list of percentages of multilinks
    listMul <- vector(mode = "list", length = 0)

    ...

    for (N in listN) {

      ...

      net <- graph_from_adjacency_matrix(adjmatrix = adjacency_matrix, mode = "undirected")
      # this works: if I plot it, I see a correct network
      listNets <- c(listNets, net)  # add net to the list of networks

      x11()
      plot(net, layout = layout.circle(net))

      ...

      # find self-loops and multilinks
      netmatr <- as_adjacency_matrix(net, sparse = FALSE)
      num_selfloops <- sum(diag(netmatr))
      num_multilinks <- sum(netmatr > 1)

      # compute the percentages
      per_self <- ((num_selfloops / num_vertices) * 100)
      per_mul <- ((num_multilinks / num_edges) * 100)

      listSelf <- c(listSelf, per_self)
      listMul <- c(listMul, per_mul)
    }

Now, if I print listNets, I get something strange:

    > print(listNets)
    [[1]]
    [1] 9

    [[2]]
    [1] FALSE

    [[3]]
    [1] 7 6 3 8 8 8

    [[4]]
    [1] 0 1 2 4 5 7

    [[5]]
    [1] 2 1 0 3 4 5

    [[6]]
    [1] 0 1 2 3 4 5

    [[7]]
    [1] 0 0 0 0 1 1 1 2 3 6

    [[8]]
    [1] 0 1 2 3 3 4 5 5 6 6

    [[9]]
    [[9]][[1]]
    [1] 1 0 1

    [[9]][[2]]
    named list()

    [[9]][[3]]
    list()

    [[9]][[4]]
    list()


    [[10]]
    <environment: 0x000000001a6284a8>

    [[11]]
    [1] 9

    [[12]]
    [1] FALSE

    [[13]]
    [1] 2 5 8 8 7 8

    [[14]]
    [1] 0 1 3 4 6 7

    [[15]]
    [1] 0 1 4 2 3 5

    [[16]]
    [1] 0 1 2 3 4 5

    [[17]]
    [1] 0 0 0 1 1 1 2 2 3 6

    [[18]]
    [1] 0 1 2 2 3 4 4 5 6 6

    [[19]]
    [[19]][[1]]
    [1] 1 0 1

    [[19]][[2]]
    named list()

    [[19]][[3]]
    list()

    [[19]][[4]]
    list()


    [[20]]
    <environment: 0x000000001a859e28>

    ...

In contrast, the other two lists (listSelf and listMul) print fine.
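
One possible explanation for the strange listNets output, offered as a sketch rather than a confirmed diagnosis: the printout repeats in groups of ten elements, which is what one would expect if c() spliced the internal components of each graph object into the list instead of appending the graph as a single element. The snippet below reproduces the effect in base R with a stand-in object (fake_net is hypothetical and only mimics a list-based object with a class attribute) and shows that wrapping the object in list() keeps it whole.

```r
# Stand-in for a list-based object such as a graph; purely illustrative.
fake_net <- structure(list(9, FALSE, c(7, 6, 3)), class = "fake_graph")

# c() splices the components of the object into the result.
spliced <- c(list(), fake_net)
length(spliced)

# Wrapping in list() appends the object as one element.
wrapped <- c(list(), list(fake_net))
length(wrapped)
class(wrapped[[1]])
```

Under that assumption, the line in the loop would become listNets <- c(listNets, list(net)).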

Now, how can I plot this data?

I read about data frames, but I don't understand how to use them in my case. Can someone help me, please?
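
As a rough sketch of how the two percentage lists could be turned into a data frame: the code below assumes that the lists were filled in the order of listN with the same number of graphs per value of N (graphs_per_N is a hypothetical name), and it uses random placeholder percentages in place of the real results.

```r
listN <- c(10^3, 10^4, 10^5)
graphs_per_N <- 10  # assumption: 10 graphs per value of N, appended in order

# Placeholder percentages standing in for the real results.
set.seed(42)
listSelf <- as.list(round(runif(length(listN) * graphs_per_N, 0, 100), 1))
listMul  <- as.list(round(runif(length(listN) * graphs_per_N, 0, 100), 1))

# One row per graph: its N, its self-loop percentage, its multilink percentage.
df <- data.frame(
  N       = rep(listN, each = graphs_per_N),
  perSelf = unlist(listSelf),
  perMul  = unlist(listMul)
)

str(df)
head(df, 3)
```

Each row then describes one graph, which is the shape that plotting functions generally expect.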

As a sanity check, I wrote a possible result table into a CSV file by hand and tried to plot it, to see whether I was going in the right direction.

This is the code, and this is the result. Note: I created the table by hand and invented the percentages.

    > df <- read.csv("./table.csv", sep = ",")  # read csv file
    > df
          N perSelf perMul
    1  10^3       2      1
    2  10^3       5      1
    3  10^3      98     15
    4  10^3      50     51
    5  10^3      41     52
    6  10^3      21    100
    7  10^3      36     80
    8  10^3      70     20
    9  10^3      80     55
    10 10^3     100     44
    11 10^4       2      1
    12 10^4       5     18
    13 10^4     100     20
    14 10^4      50     51
    15 10^4      51     52
    16 10^4      21    100
    17 10^4      36     80
    18 10^4      70     20
    19 10^4      73     85
    20 10^4     100     98
    21 10^5     100     10
    22 10^5       5      1
    23 10^5      98     15
    24 10^5      50     51
    25 10^5      41     52
    26 10^5      21     85
    27 10^5      36     80
    28 10^5      65     20
    29 10^5      80     55
    30 10^5     100     44

There is something wrong.

Thanks a lot

The code is:

    # create a matrix from a list (list_all)
    mat <- matrix(unlist(list_all),
                  unique(lengths(list_all)),
                  dimnames = list(NULL, c("N", "% selfloops", "% multilinks")))

    # convert matrix to data frame
    df <- as.data.frame(x = mat, row.names = NULL)
    df

    # plot
    dflong <- melt(df, id.vars = 'N')

    x11()
    ggplot(dflong, aes(x = N, y = value, color = variable)) +
      geom_point(size = 5, alpha = 0.7, position = position_dodge(width = 0.3)) +
      scale_x_discrete(labels = parse(text = as.character(unique(dflong$N)))) +
      scale_y_continuous('', breaks = seq(0, 100, 25), labels = paste(seq(0, 100, 25), '%')) +
      scale_color_manual('', values = c('red', 'blue'),
                         labels = c('Percentage of selfloop', 'Percentage of multilinks')) +
      theme_minimal(base_size = 14)

df is:

       N % selfloops % multilinks
    1 10   11.111111      0.00000
    2 10   11.111111      0.00000
    3 10    0.000000      0.00000
    4 20    0.000000      0.00000
    5 20    0.000000     15.38462
    6 20    0.000000      0.00000
    7 30    3.448276      0.00000
    8 30    3.448276      0.00000
    9 30    0.000000      0.00000

Solution

Taking your df dataframe as a starting point, you can get the desired result in two steps:

1) Reshape your data into long format with reshape2:

    library(reshape2)
    dflong <- melt(df, id.vars = 'N')

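
To make the idea of long format concrete, here is a small base-R sketch that builds by hand the same layout that this melt() call produces for a two-row excerpt of the question's table. It is an illustration only, not a replacement for reshape2; the names df_small, value_cols and dflong_small are made up for this example.

```r
# Two rows taken from the hand-made table in the question.
df_small <- data.frame(
  N       = c("10^3", "10^4"),
  perSelf = c(2, 5),
  perMul  = c(1, 18)
)

# Stack the two percentage columns: one row per (N, variable) pair.
value_cols <- c("perSelf", "perMul")
dflong_small <- data.frame(
  N        = rep(df_small$N, times = length(value_cols)),
  variable = factor(rep(value_cols, each = nrow(df_small)), levels = value_cols),
  value    = unlist(df_small[value_cols], use.names = FALSE)
)

dflong_small
```

The variable column is what the color aesthetic is mapped to in the plot, so each percentage type gets its own color.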
2) Plot the data with ggplot2:

    library(ggplot2)
    ggplot(dflong, aes(x = N, y = value, color = variable)) +
      geom_point(size = 5, alpha = 0.7, position = position_dodge(width = 0.3)) +
      scale_x_discrete(labels = parse(text = as.character(unique(dflong$N)))) +
      scale_y_continuous('', breaks = seq(0,100,25), labels = paste(seq(0,100,25),'%')) +
      scale_color_manual('', values = c('red','blue'),
                         labels = c('Percentage of selfloop','Percentage of multilinks')) +
      theme_minimal(base_size = 14)

which gives:

[Image: resulting plot]

I used a transparency (alpha = 0.7) in order to be able to see where points overlap.

In response to your comment and the second example in the question:

You have to alter the ggplot2 code a bit:

- Change the x variable in the aes to a factor.
- There is no need to parse the text for the labels anymore, so that part can be removed.
- Adjust the values and breaks in the y-scale.

The following code:

    ggplot(dflong, aes(x = factor(N), y = value, color = variable)) +
      geom_point(size = 5, alpha = 0.5, position = position_dodge(width = 0.3)) +
      xlab('N') +
      scale_y_continuous('', breaks = seq(0, 20, 5), labels = paste(seq(0, 20, 5), '%'), limits = c(0,20)) +
      scale_color_manual('', values = c('red', 'blue'),
                         labels = c('Percentage of selfloop','Percentage of multilinks')) +
      theme_minimal(base_size = 14)

will give you:

[Image: resulting plot]
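
As a side note on the first change in the list above (turning N into a factor), the following base-R sketch shows the difference between the two representations, using the N values from the second example in the question. ggplot2 chooses a continuous x scale for numeric input and a discrete one for factors; the variable name N_factor is made up for this illustration.

```r
# N as it appears in the second example: plain numbers.
N <- c(10, 10, 10, 20, 20, 20, 30, 30, 30)

is.numeric(N)          # numeric input maps to a continuous x axis

# As a factor, N has three categories, one per group of graphs.
N_factor <- factor(N)
is.factor(N_factor)    # factor input maps to a discrete x axis
levels(N_factor)
```

With a discrete axis the three groups are evenly spaced, and the factor levels serve directly as axis labels.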

Used data:

    df <- structure(list(
      N = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
                      2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
                      3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L),
                    .Label = c("10^3", "10^4", "10^5"), class = "factor"),
      perSelf = c(2L, 5L, 98L, 50L, 41L, 21L, 36L, 70L, 80L, 100L,
                  2L, 5L, 100L, 50L, 51L, 21L, 36L, 70L, 73L, 100L,
                  100L, 5L, 98L, 50L, 41L, 21L, 36L, 65L, 80L, 100L),
      perMul = c(1L, 1L, 15L, 51L, 52L, 100L, 80L, 20L, 55L, 44L,
                 1L, 18L, 20L, 51L, 52L, 100L, 80L, 20L, 85L, 98L,
                 10L, 1L, 15L, 51L, 52L, 85L, 80L, 20L, 55L, 44L)),
      .Names = c("N", "perSelf", "perMul"),
      class = "data.frame", row.names = c(NA, -30L))
