Gnuplot Histogram Cluster (Bar Chart) with One Line per Category
Question
I'm trying to generate the following histogram cluster out of this data file with gnuplot, where each category is represented in a separate line per year in the data file:
# datafile
year category num_of_events
2011 "Category 1" 213
2011 "Category 2" 240
2011 "Category 3" 220
2012 "Category 1" 222
2012 "Category 2" 238
...
But I don't know how to do it with one line per category. I would be glad if anybody has an idea how to do this with gnuplot.
Even better would be a stacked histogram cluster like the following, where the stacked sub categories are represented by separate columns in the datafile:
# datafile
year category num_of_events_for_A num_of_events_for_B
2011 "Category 1" 213 30
2011 "Category 2" 240 28
2011 "Category 3" 220 25
2012 "Category 1" 222 13
2012 "Category 2" 238 42
...
Thanks a lot!
Recommended answer
After some research, I came up with two different solutions.
Both solutions require splitting the data file into several files, grouped by the value of one column. For that, I've written a short Ruby script, which can be found in this gist:
https://gist.github.com/fiedl/6294424
The script is used like this: to split the data file data.csv into data.Category1.csv and data.Category2.csv, call:
# bash
ruby categorize_csv.rb --column 2 data.csv
# data.csv
# year category num_of_events_for_A num_of_events_for_B
"2011";"Category1";"213";"30"
"2011";"Category2";"240";"28"
"2012";"Category1";"222";"13"
"2012";"Category2";"238";"42"
...
# data.Category1.csv
# year category num_of_events_for_A num_of_events_for_B
"2011";"Category1";"213";"30"
"2012";"Category1";"222";"13"
...
# data.Category2.csv
# year category num_of_events_for_A num_of_events_for_B
"2011";"Category2";"240";"28"
"2012";"Category2";"238";"42"
...
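The gist itself is not reproduced here. As a rough idea of what such a split involves, here is a minimal Ruby sketch, assuming semicolon-separated fields and header lines starting with `#`, as in the listings above. The names `categorize_lines` and `categorize_file` are hypothetical and are not taken from the gist.

```ruby
# Hypothetical helper (not the gist's code): group semicolon-separated lines
# by the value in one column. The column is 1-based, like "--column 2".
def categorize_lines(lines, column)
  header = lines.select { |line| line.start_with?("#") }
  # Every new group starts with its own copy of the "#" header lines.
  groups = Hash.new { |hash, key| hash[key] = header.dup }
  lines.each do |line|
    next if line.start_with?("#")
    fields = line.strip.split(";")
    next if fields.length < column # skip blank or too-short lines
    groups[fields[column - 1].delete('"')] << line
  end
  groups
end

# Hypothetical wrapper: write data.<value>.csv next to the input file,
# e.g. data.Category1.csv for the value "Category1".
def categorize_file(path, column)
  base = path.sub(/\.csv\z/, "")
  categorize_lines(File.readlines(path), column).each do |key, group_lines|
    File.write("#{base}.#{key}.csv", group_lines.join)
  end
end
```

Calling `categorize_file("data.csv", 2)` would then group by the category column. Grouping by column 1 instead would give one file per year.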
Solution 1: Stacked Box Plot
Strategy: One data file per category. One column per stack. The bars of the histogram are plotted "manually" by using the "with boxes" argument of gnuplot.
Upside: Full flexibility concerning bar sizes, caps, colors, etc.
Downside: Bars have to be placed manually.
# solution1.gnuplot
reset
set terminal postscript eps enhanced 14
set datafile separator ";"
set output 'stacked_boxes.eps'
set auto x
set yrange [0:300]
set xtics 1
set style fill solid border -1
num_of_categories=2
set boxwidth 0.3/num_of_categories
dx=0.5/num_of_categories
offset=-0.1
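# Each stack is drawn as the full-height box ($3+$4) first; the column-3 box is
# then drawn over its lower part. The segment left visible on top therefore has
# height $4 (B), and the overlaid bottom box has height $3 (A).
plot 'data.Category1.csv' using ($1+offset):($3+$4) title "Category 1 B" linecolor rgb "#cc0000" with boxes, \
'' using ($1+offset):3 title "Category 1 A" linecolor rgb "#ff0000" with boxes, \
'data.Category2.csv' using ($1+offset+dx):($3+$4) title "Category 2 B" linecolor rgb "#00cc00" with boxes, \
'' using ($1+offset+dx):3 title "Category 2 A" linecolor rgb "#00ff00" with boxes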
Result: [figure: stacked box plot, stacked_boxes.eps]
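Placing the bars by hand gets tedious with more than two categories. The following Ruby sketch generates the plot command instead, using the same arithmetic as solution1.gnuplot (category i is shifted by offset + i*dx). The function name `stacked_boxes_plot`, the `group_width` parameter, and the neutral "(col 3+4)" and "(col 3)" titles are assumptions made for illustration and do not appear in the original script.

```ruby
# Hypothetical generator: builds the "plot" command of solution1.gnuplot for
# any number of categories. Each category is a hash with :file, :name and
# :colors (two colors: full-height box, then the overlaid column-3 box).
def stacked_boxes_plot(categories, offset: -0.1, group_width: 0.5)
  dx = group_width / categories.length
  clauses = categories.each_with_index.flat_map do |cat, i|
    shift = (offset + i * dx).round(4)
    x = "($1#{format('%+g', shift)})" # e.g. ($1-0.1) or ($1+0.15)
    [
      "'#{cat[:file]}' using #{x}:($3+$4) title \"#{cat[:name]} (col 3+4)\" " \
        "linecolor rgb \"#{cat[:colors][0]}\" with boxes",
      "'' using #{x}:3 title \"#{cat[:name]} (col 3)\" " \
        "linecolor rgb \"#{cat[:colors][1]}\" with boxes"
    ]
  end
  "plot " + clauses.join(", \\\n     ")
end

puts stacked_boxes_plot([
  { file: "data.Category1.csv", name: "Category 1", colors: ["#cc0000", "#ff0000"] },
  { file: "data.Category2.csv", name: "Category 2", colors: ["#00cc00", "#00ff00"] }
])
```

The printed command can be pasted in place of the hand-written plot command. The settings above it, including boxwidth, still have to be adjusted to the number of categories.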
Solution 2: Regular Histogram
Strategy: One data file per year. One column per stack. The histogram is produced using the regular histogram mechanism of gnuplot.
Upside: Easier to use, since the bars do not have to be positioned manually.
Downside: Since all categories are in one file, each category has the same color.
# solution2.gnuplot
reset
set terminal postscript eps enhanced 14
set datafile separator ";"
set output 'histo.eps'
set yrange [0:300]
set style data histogram
set style histogram rowstack gap 1
set style fill solid border -1
set boxwidth 0.5 relative
plot newhistogram "2011", \
'data.2011.csv' using 3:xticlabels(2) title "A" linecolor rgb "red", \
'' using 4:xticlabels(2) title "B" linecolor rgb "green", \
newhistogram "2012", \
'data.2012.csv' using 3:xticlabels(2) title "" linecolor rgb "red", \
'' using 4:xticlabels(2) title "" linecolor rgb "green", \
newhistogram "2013", \
'data.2013.csv' using 3:xticlabels(2) title "" linecolor rgb "red", \
'' using 4:xticlabels(2) title "" linecolor rgb "green"
Result: [figure: row-stacked histogram, histo.eps]
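The per-year clauses in solution2.gnuplot repeat the same pattern, so they can also be generated. This Ruby sketch assumes the per-year files are named data.<year>.csv, as in the script above. The function name `rowstack_plot` and the shape of the `stacks` argument are hypothetical.

```ruby
# Hypothetical generator: builds the "plot" command of solution2.gnuplot.
# years:  e.g. [2011, 2012, 2013]
# stacks: one hash per stacked column with :column, :title and :color.
def rowstack_plot(years, stacks)
  clauses = years.each_with_index.flat_map do |year, year_index|
    series = stacks.each_with_index.map do |stack, stack_index|
      # Only the first series of a year names the file; '' reuses it.
      file = stack_index.zero? ? "'data.#{year}.csv'" : "''"
      # Only the first year gets legend titles, so the key is not repeated.
      title = year_index.zero? ? stack[:title] : ""
      "#{file} using #{stack[:column]}:xticlabels(2) title \"#{title}\" " \
        "linecolor rgb \"#{stack[:color]}\""
    end
    ["newhistogram \"#{year}\""] + series
  end
  "plot " + clauses.join(", \\\n     ")
end

puts rowstack_plot(
  [2011, 2012, 2013],
  [{ column: 3, title: "A", color: "red" }, { column: 4, title: "B", color: "green" }]
)
```

Adding a year then only means extending the `years` array, provided the matching data file exists.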
- http://psy.swansea.ac.uk/staff/carter/gnuplot/gnuplot_time_histograms.htm
- Gnuplot: Multiple Stacked Histograms, each group using the same key
- http://gnuplot.sourceforge.net/docs_4.2/node249.html