Gnuplot Histogram Cluster (Bar Chart) with One Line per Category

Question

I'm trying to generate a histogram cluster with gnuplot from the data file below, where each category is represented by a separate line per year:

# datafile
year   category        num_of_events
2011   "Category 1"    213
2011   "Category 2"    240
2011   "Category 3"    220
2012   "Category 1"    222
2012   "Category 2"    238
...

But I don't know how to do this with one line per category. I would be glad if anybody has an idea how to do this with gnuplot.

Even better would be a stacked histogram cluster, where the stacked subcategories are represented by separate columns in the data file:

# datafile
year   category        num_of_events_for_A    num_of_events_for_B
2011   "Category 1"    213                    30
2011   "Category 2"    240                    28
2011   "Category 3"    220                    25
2012   "Category 1"    222                    13
2012   "Category 2"    238                    42
...

Thanks a lot!

Recommended answer

After some research, I came up with two different solutions.

Both solutions require splitting the data file into several files, grouped by the value of one column. For this purpose, I've written a short Ruby script, which can be found in this gist:

https://gist.github.com/fiedl/6294424

The script is used as follows. To split the data file data.csv into data.Category1.csv and data.Category2.csv, call:

# bash
ruby categorize_csv.rb --column 2 data.csv

# data.csv
# year   category   num_of_events_for_A   num_of_events_for_B
"2011";"Category1";"213";"30"
"2011";"Category2";"240";"28"
"2012";"Category1";"222";"13"
"2012";"Category2";"238";"42"
...

# data.Category1.csv
# year   category   num_of_events_for_A   num_of_events_for_B
"2011";"Category1";"213";"30"
"2012";"Category1";"222";"13"
...

# data.Category2.csv
# year   category   num_of_events_for_A   num_of_events_for_B
"2011";"Category2";"240";"28"
"2012";"Category2";"238";"42"
...
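
The gist's script is not reproduced here. As a rough idea of what such a split involves, the following is a minimal Ruby sketch and not the author's categorize_csv.rb. The function names `split_by_column` and `write_split_files` are hypothetical, and the sketch assumes semicolon-separated fields, double-quoted values, and comment lines starting with `#` that are copied into every output file.

```ruby
# Hypothetical sketch; NOT the categorize_csv.rb script from the gist.
# Groups the data lines of a semicolon-separated file by one column.

# Returns a hash such as { "Category1" => [comment and data lines], ... }.
# `column` is 1-based, mirroring the `--column 2` call above.
def split_by_column(lines, column, separator = ";")
  header = lines.select { |line| line.start_with?("#") }
  groups = Hash.new { |hash, key| hash[key] = header.dup }
  lines.each do |line|
    next if line.start_with?("#") || line.strip.empty?
    fields = line.chomp.split(separator)
    next if fields.size < column          # skip lines without that column
    key = fields[column - 1].delete('"')  # "Category1" -> Category1
    groups[key] << line
  end
  groups
end

# Writes each group of data.csv to data.<key>.csv next to the input file.
def write_split_files(path, column)
  base = path.sub(/\.csv\z/, "")
  split_by_column(File.readlines(path), column).each do |key, lines|
    File.write("#{base}.#{key}.csv", lines.join)
  end
end
```

Under these assumptions, `write_split_files("data.csv", 2)` would produce the per-category files, and `write_split_files("data.csv", 1)` would produce per-year files such as data.2011.csv.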

Solution 1: Stacked Box Plot

Strategy: One data file per category. One column per stack. The bars of the histogram are plotted "manually" using gnuplot's "with boxes" plot style.

Upside: Full flexibility concerning bar sizes, caps, colors, etc.

Downside: Bars have to be placed manually.

# solution1.gnuplot
reset
set terminal postscript eps enhanced 14

set datafile separator ";"

set output 'stacked_boxes.eps'

set auto x
set yrange [0:300]
set xtics 1

set style fill solid border -1

num_of_categories=2
set boxwidth 0.3/num_of_categories
dx=0.5/num_of_categories
offset=-0.1

plot 'data.Category1.csv' using ($1+offset):($3+$4) title "Category 1 A" linecolor rgb "#cc0000" with boxes, \
     ''                   using ($1+offset):3 title "Category 1 B" linecolor rgb "#ff0000" with boxes, \
     'data.Category2.csv' using ($1+offset+dx):($3+$4) title "Category 2 A" linecolor rgb "#00cc00" with boxes, \
     ''                   using ($1+offset+dx):3 title "Category 2 B" linecolor rgb "#00ff00" with boxes
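
To make the manual placement easier to follow, the arithmetic behind `boxwidth`, `dx` and `offset` can be sketched in Ruby. This is only an illustration of the numbers used in the script above. The helper names `box_positions` and `box_width` are hypothetical, and the defaults simply repeat the constants 0.5, 0.3 and -0.1 from solution1.gnuplot.

```ruby
# Hypothetical helpers mirroring the arithmetic in solution1.gnuplot.

# x position of each category's box within one year's cluster:
# year + offset + i * dx, with dx = spread / num_of_categories.
def box_positions(year, num_of_categories, offset: -0.1, spread: 0.5)
  dx = spread / num_of_categories
  (0...num_of_categories).map { |i| year + offset + i * dx }
end

# Width of a single box: total / num_of_categories.
def box_width(num_of_categories, total: 0.3)
  total / num_of_categories
end

positions = box_positions(2011, 2)
puts positions.map { |x| format("%.2f", x) }.join(", ")
puts format("%.2f", box_width(2))
```

With two categories, each box is 0.15 wide, and the boxes for a given year are centred at year - 0.1 and year + 0.15, which leaves a gap of 0.10 between them.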

The resulting plot is written to stacked_boxes.eps.

Solution 2: Histogram Plot

Strategy: One data file per year. One column per stack. The histogram is produced using gnuplot's regular histogram mechanism.

Upside: Easier to use, since positioning does not have to be done manually.

Downside: Since all categories are in one file, each category has the same color.

# solution2.gnuplot
reset
set terminal postscript eps enhanced 14

set datafile separator ";"

set output 'histo.eps'
set yrange [0:300]

set style data histogram
set style histogram rowstack gap 1
set style fill solid border -1
set boxwidth 0.5 relative

plot newhistogram "2011", \
       'data.2011.csv' using 3:xticlabels(2) title "A" linecolor rgb "red", \
       ''              using 4:xticlabels(2) title "B" linecolor rgb "green", \
     newhistogram "2012", \
       'data.2012.csv' using 3:xticlabels(2) title "" linecolor rgb "red", \
       ''              using 4:xticlabels(2) title "" linecolor rgb "green", \
     newhistogram "2013", \
       'data.2013.csv' using 3:xticlabels(2) title "" linecolor rgb "red", \
       ''              using 4:xticlabels(2) title "" linecolor rgb "green"
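
The plot command above repeats the same two clauses for every year. If there are many years, the command could be generated rather than typed. The following Ruby sketch is one way to do that. The function name `histogram_plot_command` is hypothetical, and it assumes one file per year named data.<year>.csv and the same columns, titles and colors as in solution2.gnuplot.

```ruby
# Hypothetical generator for the plot command in solution2.gnuplot.
# Assumes one file per year named data.<year>.csv.
def histogram_plot_command(years, stacks = { 3 => ["A", "red"], 4 => ["B", "green"] })
  clauses = years.each_with_index.map do |year, index|
    parts = stacks.each_with_index.map do |(column, (label, color)), stack_index|
      # Only the first clause names the file; '' reuses it, as in gnuplot.
      file = stack_index.zero? ? "'data.#{year}.csv'" : "''"
      # Key entries are emitted for the first year only.
      title = index.zero? ? label : ""
      format('  %s using %d:xticlabels(2) title "%s" linecolor rgb "%s"',
             file, column, title, color)
    end
    (["newhistogram \"#{year}\""] + parts).join(", \\\n")
  end
  "plot " + clauses.join(", \\\n")
end

puts histogram_plot_command([2011, 2012, 2013])
```

The printed text can be pasted in place of the hand-written plot command. The settings before it in the script stay unchanged.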

The resulting plot is written to histo.eps.

References:

  • http://psy.swansea.ac.uk/staff/carter/gnuplot/gnuplot_time_histograms.htm
  • Gnuplot: Multiple Stacked Histograms, each group using the same key
  • http://gnuplot.sourceforge.net/docs_4.2/node249.html
