Clustered and stacked bar plot with multiple csv files


Problem description

I have multiple csv files with the same data structure:

url,A,B,C,D
a.com,1,2,3,4
b.com,3,4,5,6

I can create a stacked bar plot with the urls on the x-axis and A, B, C, D stacked on top of each other.

Now I want to create a clustered stacked bar plot from multiple such csv files, all indexed by the same urls on the x-axis.

library(reshape2)  # provides melt()
library(ggplot2)

data1 = read.csv("data.csv")
data2 = read.csv("data2.csv")  # read, but not yet part of the plot
data.m = melt(data1, id.vars = "url")

ggplot(data.m, aes(x = url, y = value, fill = variable)) +
  geom_bar(position = "fill", stat = "identity")

Basically, I want to add data2 to the plot. I am not sure whether I am supposed to use gather, use facets, or manually create new columns after the melt.

It should look something like this:

Recommended answer

Is this what you are after?

# Two sample datasets
df1 <- cbind.data.frame(
    url = c("a.com", "b.com"),
    A = c(1, 3), B = c(2, 4), C = c(3, 5), D = c(4, 6));

df2 <- cbind.data.frame(
    url = c("a.com", "b.com"),
    A = c(5, 7), B = c(6, 8), C = c(7, 9), D = c(8, 10));

Using gather

# Using gather
require(tidyr);
df <- rbind.data.frame(
    gather(cbind.data.frame(df1, src = "df1"), variable, value, -url, -src),
    gather(cbind.data.frame(df2, src = "df2"), variable, value, -url, -src));

Using melt

# Using melt
require(reshape2);
df <- rbind.data.frame(
    melt(cbind.data.frame(df1, src = "df1"), id.vars = c("url", "src")),
    melt(cbind.data.frame(df2, src = "df2"), id.vars = c("url", "src")));

Sample plot

require(ggplot2);
ggplot(df, aes(x = url, y = value, fill = variable)) + geom_bar(stat = "identity") + facet_wrap(~ src);

Note: If you have multiple csv files, it is best to read them with df.list <- lapply(..., read.csv) and then melt df.list to get the columns variable, value and L1 (which corresponds to src).
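The note above can be sketched roughly as follows. This is a minimal illustration, not necessarily how the answerer would write it: the temporary directory, the file names data1.csv and data2.csv, and the use of the file names as list names are all assumptions made so the example is self-contained.

```r
library(reshape2)

# Assumption: write two small csv files to a temporary directory so the
# sketch runs on its own; in practice the files would already exist.
tmp <- tempfile("csvdemo")
dir.create(tmp)
write.csv(data.frame(url = c("a.com", "b.com"),
                     A = c(1, 3), B = c(2, 4), C = c(3, 5), D = c(4, 6)),
          file.path(tmp, "data1.csv"), row.names = FALSE)
write.csv(data.frame(url = c("a.com", "b.com"),
                     A = c(5, 7), B = c(6, 8), C = c(7, 9), D = c(8, 10)),
          file.path(tmp, "data2.csv"), row.names = FALSE)

# Read every csv file in the directory into a list of data frames
files <- list.files(tmp, pattern = "\\.csv$", full.names = TRUE)
df.list <- lapply(files, read.csv)

# Name the list elements after the files; melt() on a list stores these
# names in the L1 column, which then plays the role of src
names(df.list) <- tools::file_path_sans_ext(basename(files))

# Melt the whole list at once, keeping url as the id column
df <- melt(df.list, id.vars = "url")
head(df)
```

The resulting data frame has the columns url, variable, value and L1, so L1 can be used wherever src appears in the plots.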

I'm not entirely clear on what you are after, so this is a bit of a stab in the dark. You can also cluster by url (instead of src):

ggplot(df, aes(x = src, y = value, fill = variable)) + geom_bar(stat = "identity") + facet_wrap(~ url);

and/or show the bars side-by-side (instead of stacked):

ggplot(df, aes(x = src, y = value, fill = variable)) + geom_bar(stat = "identity", position = "dodge") + facet_wrap(~ url);
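If the goal is a single panel that reads like clustered stacks, one possible variation on the cluster-by-url plot is to put the facet strips under the axis and remove the spacing between panels. This is only a sketch of a common ggplot2 styling trick, not something taken from the original answer; the sample data is rebuilt here so the block runs on its own, and the choice of nrow = 1 and zero panel spacing is an assumption about the desired look.

```r
library(reshape2)
library(ggplot2)

# Same sample data as in the answer
df1 <- data.frame(url = c("a.com", "b.com"),
                  A = c(1, 3), B = c(2, 4), C = c(3, 5), D = c(4, 6))
df2 <- data.frame(url = c("a.com", "b.com"),
                  A = c(5, 7), B = c(6, 8), C = c(7, 9), D = c(8, 10))

# Tag each data set with its source, then melt to long format
df <- rbind(
  melt(cbind(df1, src = "df1"), id.vars = c("url", "src")),
  melt(cbind(df2, src = "df2"), id.vars = c("url", "src"))
)

# One stacked bar per source, grouped (faceted) by url; moving the strips
# below the axis makes each facet look like a cluster label
p <- ggplot(df, aes(x = src, y = value, fill = variable)) +
  geom_bar(stat = "identity") +
  facet_wrap(~ url, nrow = 1, strip.position = "bottom") +
  theme(strip.placement = "outside",
        panel.spacing = unit(0, "lines"))
print(p)
```

Swapping geom_bar(stat = "identity") for geom_bar(stat = "identity", position = "fill") would give proportions per bar, as in the plot from the question.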
