Gnuplot Histogram with overlapping bar pairs - gnuplot

I am trying to plot a bar chart/histogram that shows, for every point of the x axis, two bars, each of which is divided in three parts. Take the following dataset:
Min # Max # Avg # Min % Max % Avg %
6 12 6.67 13 100 35.25
0 6 3 0 90 43.25
235 1243 553 66.67 100 83.43
The idea is that for each row, there will be a pair of vertical bars, with the left one representing the three # values and the right one representing the three % values. The values are made-up, but the scale is more or less the real one.
So far, I have managed to get the following script, which is a frankenstein of several online scripts I found:
set ytics 10 nomirror tc lt 1
set y2tics 100 nomirror tc lt 2
set yrange [0:120]
set y2range [0:1500]
set style fill solid border -1
plot "table2.dat" using 5:xticlabels(1) with boxes lt rgb "#40FF00" t "Max \%",\
"" using 6 lt rgb "#406090" t "Avg \%",\
"" using 4 with boxes lt rgb "#403090" t "Min \%"
This will plot out the following chart:
What I cannot seem to figure out is how to put the second bar for the first three columns. Ideally, that "X" would also be replaced by a dotted line cutting the bar. The reason for the two Y axes is that each bar follows a different scale, so the second bar would have to be proportional to the right-side y axis. Finally, I had to add that little "hack" of making yrange higher than 100 so that the bars would not "hit the top". If there is another way to do that, that'd be great.
Thanks in advance for any help that can be given, I am a complete newbie at gnuplot but since trying to make this chart using spreadsheet tools was an even bigger pain, I am hopeful that someone'll be able to help with at least some of those problems.
Edit: I will take suggestions for a better title for this question.

You can get a little further by setting the box width to 0.4 of the default width with set boxwidth, and by defining a function (f here) that converts the values in columns 1 to 3 into percentages of the Max # value in column 2. The extra bars need an explicit x coordinate, given with the x:y syntax in using. Inside the parentheses, $1 refers to column 1 and so on, and $0 is the row number, starting from zero.
set boxwidth 0.4 relative
f(y,max) = (y*100./max)
plot "table2.dat" \
using 5:xticlabels(1) with boxes lt rgb "#40FF00" t "Max \%",\
"" using 6 lt rgb "#406090" t "Avg \%",\
"" using 4 with boxes lt rgb "#403090" t "Min \%",\
"" using ($0-.5):(f($2,$2)) with boxes lt rgb "red" t "Max",\
"" using ($0-.5):(f($3,$2)) lt rgb "blue" t "Avg",\
"" using ($0-.5):(f($1,$2)) with boxes lt rgb "orange" t "Min"
I used some garish colours to show the new boxes:
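If the # bars should follow the right-hand axis instead of being rescaled to percentages, one option is to assign them to the y2 axis with axes x1y2. The following is only a sketch under assumptions: the data is inlined as a datablock named $Data (not the original table2.dat), the ±0.2 shifts and the xrange are chosen by hand, and the colours are taken from the scripts above. Note that the small # values in the first two rows will be barely visible on a 0 to 1500 scale.

```gnuplot
### sketch: "#" bars on the y2 axis, "%" bars on the y1 axis
reset session
# assumption: data inlined here instead of reading table2.dat
$Data << EOD
6 12 6.67 13 100 35.25
0 6 3 0 90 43.25
235 1243 553 66.67 100 83.43
EOD
set boxwidth 0.4 absolute
set style fill solid border -1
set ytics 10 nomirror
set y2tics 100 nomirror
set yrange [0:120]
set y2range [0:1500]
set xrange [-0.6:2.6]
# left bar of each pair: columns 2, 3, 1 (Max, Avg, Min) against y2
# right bar of each pair: columns 5, 6, 4 (Max, Avg, Min) against y1
plot $Data using ($0-0.2):2 axes x1y2 with boxes lc rgb "red" title "Max #", \
     '' using ($0-0.2):3 axes x1y2 with boxes lc rgb "blue" title "Avg #", \
     '' using ($0-0.2):1 axes x1y2 with boxes lc rgb "orange" title "Min #", \
     '' using ($0+0.2):5 with boxes lc rgb "#40FF00" title "Max %", \
     '' using ($0+0.2):6 with boxes lc rgb "#406090" title "Avg %", \
     '' using ($0+0.2):4 with boxes lc rgb "#403090" title "Min %"
### end of sketch
```

Each bar is drawn tallest first, so the shorter Avg and Min boxes are painted on top of the Max box.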

Related

Horizontal bar chart in gnuplot

When Googling "horizontal gnuplot bar chart", the first result I could find http://www.phyast.pitt.edu/~zov1/gnuplot/html/histogram.html suggests rotating (!) the final bar chart which seems rather baroque. Nonetheless I tried the approach but the labels are cut off.
reset
$heights << EOD
dad 181
mom 170
son 100
daughter 60
EOD
set yrange [0:*] # start at zero, find max from the data
set boxwidth 0.5 # use a fixed width for boxes
unset key # turn off all titles
set style fill solid # solid color boxes
set colors podo
set xtic rotate by 90 scale 0
unset ytics
set y2tics rotate by 90
plot '$heights' using 0:2:($0+1):xtic(1) with boxes lc variable
Is there a better approach?
The link you are referring to is from approx. 2009. gnuplot has developed since then. As @Christoph suggested, check help boxxyerror.
Script: (edit: shortened by using 4-columns syntax for boxxyerror, i.e. x:y:+/-dx:+/-dy)
### horizontal bar graph
reset session
$Data << EOD
dad 181
mom 170
son 100
daughter 60
EOD
set yrange [0:*] # start at zero, find max from the data
set style fill solid # solid color boxes
unset key # turn off all titles
myBoxWidth = 0.8
set offsets 0,0,0.5-myBoxWidth/2.,0.5
plot $Data using (0.5*$2):0:(0.5*$2):(myBoxWidth/2.):($0+1):ytic(1) with boxxy lc var
### end of script
Result:
Addition:
What does the longer 6-column form (the version of the script before it was shortened)
2:0:(0):2:($0-myBoxWidth/2.):($0+myBoxWidth/2.):($0+1):ytic(1) mean?
Well, it looks more complicated than it is. Check help boxxyerror. From the manual:
6 columns: x y xlow xhigh ylow yhigh
So, altogether:
x take value from column 2, but not so relevant here since we will use the xyerror box
y take pseudocolumn 0 which is line number starting from zero, check help pseudocolumns, but not so relevant here as well
xlow (0) means fixed value of zero
xhigh value from column 2
ylow ($0-myBoxWidth/2.), line number minus half of the boxwidth
yhigh ($0+myBoxWidth/2.), line number plus half of the boxwidth
($0+1) together with ... lc var: color depending on line number starting from 1
ytic(1): column 1 as ytic label
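For reference, the 6-column form explained above can be put into a complete script roughly as follows. This is a sketch: the using specification is the one quoted in the question above, while the fixed xrange and yrange are assumptions chosen for these four rows rather than part of the original answer.

```gnuplot
### horizontal bar graph, 6-column boxxyerror form
reset session
$Data << EOD
dad 181
mom 170
son 100
daughter 60
EOD
set style fill solid
unset key
myBoxWidth = 0.8
# assumption: bars start at x = 0, upper end autoscaled
set xrange [0:*]
# assumption: fixed y range that fits rows 0 to 3
set yrange [-0.5:3.5]
# columns: x, y, xlow, xhigh, ylow, yhigh, colour index, plus ytic labels
plot $Data using 2:0:(0):2:($0-myBoxWidth/2.):($0+myBoxWidth/2.):($0+1):ytic(1) \
     with boxxyerror lc variable
### end of script
```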
For some reason (which I don't know) gnuplot still doesn't seem to have a convenient horizontal histogram plotting style, but at least it offers this boxxyerror workaround.

gnuplot histogram : negative values go down instead go up

I am trying to generate a histogram plot with gnuplot. I have positive and negative values. Bars for positive values go up from zero, and bars for negative values go down from zero.
I would like to change the baseline from which the bars go up and down,
from 0 to -100 for example.
Maybe this is not the right type of plot for that?
I have tried this :
gnuplot -e "set terminal png size 20000, 1500; set yrange [-100:*]; set title 'VU meter 0'; set style data histogram; set style histogram clustered gap 1; set style fill solid 1 noborder; plot 'testVUmeter0.tsv' using 2:xticlabels(1)" > out.png
Thanks
As far as I know the plotting styles histogram and with boxes always start at y=0.
Assuming I understood your question correctly, you want to shift this zero level e.g. to -100.
As long as you do not need an advanced histogram style but just simple boxes, one possible solution could be to use the plotting style with boxxyerror. Compared to @meuh's solution, gnuplot here takes care of the ytics automatically.
Code:
### shift zero for boxes
reset session
$Data <<EOD
A -20
B -140
C 100
D -340
E +250
F 0
EOD
myOffset = -100
myWidth = 0.8
set style fill solid 1.0
set arrow 1 from graph 0, first myOffset to graph 1, first myOffset nohead ls -1
set style textbox opaque
plot $Data u 0:2:($0-myWidth/2.):($0+myWidth/2.):(myOffset):2:xtic(1) w boxxyerror notitle, \
'' u 0:2:2 w labels boxed notitle
### end of code
Result:
You can calculate a new y value at each point, taking the wanted offset into account. For example, with bot=-20 as the bottom y value, plotting ($2-bot) converts, say, -5 to -5-(-20)=15 above 0. The ytics are then relabelled by hand so that they show the original values.
set terminal png size 400,300
set output "out.png"
set style data histogram
set style histogram clustered gap 1
set style fill solid 1 noborder
bot=-20
set yrange [0:*]
set ytics ("-10" -10-bot, "0" 0-bot, "10" 10-bot, "20" 20-bot, "30" 30-bot)
plot "data" using (($2)-bot):xticlabels(1) notitle, \
"" using 0:($2+3-bot):(sprintf("%d",$2)) with labels notitle
with data of
1 33
2 44
3 22
4 -12
gives the plot:
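The shift can be checked in isolation with print before plotting. This is only a small sketch; the helper name shift is made up here and is not part of the answer's script.

```gnuplot
bot = -20
# hypothetical helper: distance of y above the chosen bottom value
shift(y) = y - bot
print shift(-5)    # -5 - (-20) = 15
print shift(-12)   # -12 - (-20) = 8, the shifted height of the last data row
print shift(bot)   # 0: the bottom value itself maps to the axis origin
```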

Plotting Condition Lines

Suppose I have the following data:
"1,5"
"2,10"
""
"3,4"
"4,2"
""
"5,6"
"6,10"
I want to graph this using gnuplot with a line between each condition, similar to this display:
How might this be accomplished? I have looked into gridlines, but that does not seem to suit my need. I am also looking for a solution that will automatically draw condition / phase lines between each break in the data set.
As mentioned in the comments and explained in the linked question and its answers, you can draw arbitrary lines manually via set arrow ... (check help arrow).
However, if possible I don't want to adjust the lines manually every time I change the data or if I have many different plots.
But, hey, you are using gnuplot, so, make it automated!
To be honest, within the time figuring out how it can be done I could have changed a "few" lines and labels manually ;-). But now, this might be helpful for others.
The script below is written in such a way that it doesn't matter whether you have zero, one or two or more empty lines between the different blocks.
Comments:
the function valid(1) returns 1 if column(1) contains a valid number and 0 otherwise (check help valid).
the vertical lines are plotted with vectors (check help vectors). The x-position is taken as average of the x-value before the label line and the x-value after the label line. The y-value LevelY is determined beforehand via stats (check help stats).
the labels are plotted with labels (check help labels) and positioned at the first x-value after each label line and at a y-value of LevelY with an offset.
Script:
### automatic vertical lines and labels
reset session
$Data <<EOD
Baseline
1 10.0
2 12.0
3 10.5
4 11.0 # zero empty lines follow
Treatment
5 45.0
6 35.0
7 32.5
8 31.0 # one empty line follows

Baseline
9 14.0
10 12.8
11 12.0
12 11.3 # two empty lines follow


Treatment
13 35.0
14 45.0
15 45.0
16 37.0
EOD
set offset 1,1,1,1
set border 3
set title "Student Performance" font ",14"
set xlabel "Sessions"
set xtics 1 out nomirror
set ylabel "Number of Responses"
set yrange [0:]
set ytics out nomirror
set key noautotitle
set grid x,y
stats $Data u 2 nooutput
LevelY = STATS_max # get the max y-level
getLinePosX(col) = (v0=v1,(v1=valid(col))?(x0=x1,x1=column(1)):0, v0==0?(x0+x1)/2:NaN)
getLabel(col) = (v0=v1,(v1=valid(col))?0:(h1=strcol(1),h0=h1),column(1))
plot x1=NaN $Data u (y0=(valid(1)?$2:NaN),$1):(y0) w lp pt 13 ps 2 lw 2 lc "red", \
x1=v1=NaN '' u (getLinePosX(1)):(0):(0):(LevelY) w vec nohead lc "black" lw 1.5 dt 2, \
v1=NaN '' u (getLabel(1)):(LevelY):(sprintf("%s",v0==0?h0:'')) w labels left offset 0,1.5 font ",12"
### end of script
Result:

gnuplot histogram with boxes (smooth frequency)

I have crawled many similar questions without finding the proper problem/question/answer...
I want to use gnuplot to make a histogram plot out of a distributed data file with bars/boxes of equal width and intervals. So I need to count/integrate over the width(=1) of my bars. That's why I wanted to use the 'smooth frequency' command:
#gnuplot
bin(x)=floor(x+0.5)
set boxwidth 0.8 relative
set style fill pattern
set grid
set xrange [0:11]
set yrange [0:3]
set xtics in 0,2,10
set mxtics 2
set ytics 0,1,3
set mytics 1
p 'data.dat' u (bin($1)):(1) smooth freq w boxes
#data.dat
2.489
7.5
9.128
9.567
I tried it and the result was the same, as with my handmade file plotted with boxes:
#gnuplot2
[...]
p 'data2.dat' w boxes
#data2.dat
2 1
8 1
9 1
10 1
Smooth frequency seems to do its job properly, but the result is not what I intended to do...: Image
Then I figured out, what the problem is. It is solved using my handmade data3.dat:
#gnuplot
p 'data3.dat' w boxes
#data3.dat
1 0
2 1
3 0
4 0
5 0
6 0
7 0
8 1
9 1
10 1
Image
So the problems are the holes in my data range, that aren't counted as '0'. With these holes, gnuplot seems to adjust the box width by itself to fit in the whole space left. How can I prevent this to get my desired result?
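I think the problem is the relative keyword. With relative, the value is a fraction of the default width, and by default each box is extended until it touches its neighbours, so gaps in the data make the boxes wider. Try: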
set boxwidth 0.8 absolute
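Put together with the question's script, this could look like the sketch below. The data is inlined as a datablock named $Data instead of reading data.dat, which is an assumption made only to keep the example self-contained.

```gnuplot
### histogram with fixed box width
reset session
# assumption: data inlined here instead of reading data.dat
$Data << EOD
2.489
7.5
9.128
9.567
EOD
# round each value to the nearest integer bin
bin(x) = floor(x + 0.5)
# absolute: every box is 0.8 x-units wide, regardless of gaps between bins
set boxwidth 0.8 absolute
set style fill pattern
set grid
set xrange [0:11]
set yrange [0:3]
set xtics in 0,2,10
set ytics 0,1,3
plot $Data using (bin($1)):(1) smooth frequency with boxes notitle
### end of script
```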

Add a single point at an existing plot

I am using the following script to fit a function to data and plot it. In the output plot I would like to add a single labelled point on the fitted curve, let's say the point f(3.25). I have read that it is very tricky in gnuplot to add a single point to a plot, particularly when the plot is of a fitted function.
Does anyone have an idea how to add this single point to the existing plot?
set xlabel "1000/T (K^-^1)" font "Helvetica,20"
#set ylabel "-log(tau_c)" font "Helvetica,20"
set ylabel "-log{/Symbol t}_c (ns)" font "Helvetica,20"
set title "$system $type $method" font "Helvetica,24"
set xtics font "Helvetica Bold, 18"
set ytics font "Helvetica Bold, 18"
#set xrange[0:4]
set border linewidth 3
set xtic auto # set xtics automatically
set ytic auto # set ytics automatically
#set key on bottom box lw 3 width 8 height .5 spacing 4 font "Helvetica, 24"
set key box lw 3 width 4 height .5 spacing 4 font "Helvetica, 24"
set yrange[-5:]
set xrange[1.5:8]
f(x)=A+B*x/(1000-C*x)
A=1 ;B=-227 ; C=245
fit f(x) "$plot1" u (1000/\$1):(-log10(\$2)) via A,B,C
plot [1.5:8] f(x) ti "VFT" lw 4, "$plot1" u (1000/\$1):(-log10(\$2)) ti "$system $type" lw 10
#set key on bottom box lw 3 width 8 height .5 spacing 4 font "Helvetica, 24"
set terminal postscript eps color dl 2 lw 1 enhanced # font "Helvetica,20"
set output "KWW.eps"
replot
There are several possibilities to set a point/dot:
1. set object
If you have simple points, like a circle, circle wedge or a square, you can use set object, which must be defined before the respective plot command:
set object circle at first -5,5 radius char 0.5 \
fillstyle empty border lc rgb '#aa1100' lw 2
set object circle at graph 0.5,0.9 radius char 1 arc [0:-90] \
fillcolor rgb 'red' fillstyle solid noborder
set object rectangle at screen 0.6, 0.2 size char 1, char 0.6 \
fillcolor rgb 'blue' fillstyle solid border lt 2 lw 2
plot x
To add a label, you need to use set label.
This may be cumbersome, but has the advantage that you can use different line and fill colors, and you can use different coordinate systems (first, graph, screen etc).
The result with 4.6.4 is:
2. Set an empty label with point option
The set label command has a point option, which can be used to set a point using the existing point types at a certain coordinate:
set label at xPos, yPos, zPos "" point pointtype 7 pointsize 2
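Applied to the question, the label text and position can both be computed from the function, so that the point stays on the curve. This is a sketch under assumptions: f(x) = x**2 stands in for the fitted function, and x0, the xrange, and the offset are chosen only for illustration.

```gnuplot
# assumption: stand-in for the fitted function from the question
f(x) = x**2
# x position of the point to mark
x0 = 3.25
set xrange [0:5]
# a non-empty label text puts the annotation next to the point
set label 1 sprintf("f(%.2f) = %.2f", x0, f(x0)) at first x0, f(x0) \
    point pointtype 7 pointsize 2 offset character 1,-0.5
plot f(x) title "f(x)"
```

Because the label is set before plot, it also survives a later replot to another terminal.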
3. plot with '+'
The last possibility is to use the special filename +, which generates a set of coordinates that are then filtered and plotted using the labels plotting style (or points if no label is requested):
f(x) = x**2
x1 = 2
set xrange[-5:5]
set style line 1 pointtype 7 linecolor rgb '#22aa22' pointsize 2
plot f(x), \
'+' using ($0 == 0 ? x1 : NaN):(f(x1)):(sprintf('f(%.1f)', x1)) \
with labels offset char 1,-0.2 left textcolor rgb 'blue' \
point linestyle 1 notitle
$0, or equivalently column(0), is the coordinate index. In the using statement only the first one is taken as valid, all other ones are skipped (using NaN).
Note, that using + requires setting a fixed xrange.
This has the advantages (or disadvantages?):
You can use the usual pointtype.
You can only use the axis values as coordinates (like first or second for the objects above).
It may become more difficult to place different point types.
It is more involved using different border and fill colors.
The result is:
Adding to Christoph's excellent answers:
4. use stdin to pipe in the one point
replot "-" using 1:(f($1))
2.0
e
and use the method from option 3 to label it.
5. bake a named datablock
(gnuplot 5.0 or later) that contains the one point; then you can replot without resupplying it every time:
$point << EOD
2.0
EOD
replot $point using 1:(f($1)):(sprintf("%.2f",f($1))) with labels
6. A solution using a dummy array of length one:
array point[1]
pl [-5:5] x**2, point us (2):(3) pt 7 lc 3
7. Or through a shell command (see help piped-data):
pl [-5:5] x**2, "<echo e" us (2):(3) pt 7 lc 3
pl [-5:5] x**2, "<echo 2 3" pt 7 lc 3
8. Special filename '+'
pl [-5:5] x**2, "+" us (2):(3) pt 7 lc 3
It seems to be the shortest solution. But note that while it looks like a single point, one point is drawn per sample (see show samples; the default is 100), all at the same position.
To get only one point, the sampling needs to be adjusted temporarily (see help plot sampling):
pl [-5:5] x**2, [0:0:1] "+" us (2):(3) pt 7 lc 3
9. Function with zero sampling range length
Shortest to type, but it plots as many points on top of each other as specified by samples:
pl [-5:5] x**2, [2:2] 3 w p pt 7 lc 3

Resources