GNUPLOT: Multiple histograms each with normalized bars - gnuplot

I'd like to start by saying I am very new to gnuplot. I am attempting to plot multiple stacked histograms that have been normalized so that the height of each bar is 1. I'd also prefer to no have to amend my data files to include the total as the last entry as I have a lot of data files to plot and this would take a lot of time. I've looked around and I know this can be done, but I have been unsuccessful in adapting examples I've found to work with the code I am using.
The data file I am using (shortened considerably) is named "Test.dat" and formatted as follows:
#a = 2
#b 1 2 3 X
b=1 1 3 1
b=2 0 1 1
#a = 4
b 1 2 3 X
b=1 1 1.5 1.5
b=2 1 2.1 1.9
Here each row beginning with b=x is meant to be a single bar, and there are two groups of two bars corresponding to an a=x. My .gp file currently looks like this:
set style data histogram
set style histogram rowstacked gap .5 title offset 0, -1
set style fill solid border -1
set boxwidth .75 relative
set yrange [0:]
unset xtics
plot \\
newhistogram "b=2" lt 1, for[col=2:4] 'Test.dat' index 0 u col:xtic(1) notitle \
,newhistogram "b=4" lt 1, for[col=2:4] 'Test.dat' index 1 u col:xtic(1) notitle \
This give the image, but this is what I would like to get. I'd appreciate any assistance you could provide.

You have missed a comment sign "#" in the second data bolck.
You have to separate each data block with 2 blank lines.
You are using "b=1" and "b=2" in the data file, but "b=2" and b=4 in the script.
Last: gnuplot is able to make stacked histograms, but there is no way to normalize them automatic, but manually :-/
set style data histogram
set style histogram rowstacked gap .5 title offset 0, -1
set style fill solid border -1
set boxwidth .75 relative
set yrange [0:]
unset xtics
plot \\\
newhistogram "b=1" lt 1, for[col=2:4] 'Test.dat' index 0 u (column(col)/$5):xtic(1) notitle, \
newhistogram "b=2" lt 1, for[col=2:4] 'Test.dat' index 1 u (column(col)/$5):xtic(1) notitle

Related

gnuplot histogram chart with overlap

I would like to plot a bar chart or histogram like this in gnuplot.
I tried set style histogram rowstacked which is a start but it adds the columns on top of each other while I need them overlapped. Next is the issue of transparent color shading.
Thanks for your feedback.
UPDATE: user8153 asked for additional data.
The set style histogram clustered gap 0.0 is doing the cluster mode of the histogram bars. If you blur the eye it sort-of shows what I want but with overlap and transparent shading.
The only other histogram modes given in the docs are rowstacked and columnstacked. I never got a plot out of columnstacked so I discarded it. Now rowstacked stacks the histogram bars.
The overlay appearance is there but it is wrong. I don't want the stacked appearance. The histograms have to overlay.
Code :
set boxwidth 1.0 absolute
set style fill solid 0.5 noborder
set style data histogram
set style histogram clustered gap 0.0
#set style histogram rowstacked gap 0.0
set xtics in rotate by 90 offset first +0.5,0 right
set yrange [0:8000]
set xrange [90:180]
plot 'dat1.raw' using 3 lc rgb 'orange', \
'dat2.raw' using 3 lc rgb 'blue', \
'dat3.raw' using 3 lc rgb 'magenta'
Thanks for your feedback.
Given a sample datafile test.dat
-10 4.5399929762484854e-05
-9 0.0003035391380788668
-8 0.001661557273173934
-7 0.007446583070924338
-6 0.02732372244729256
-5 0.0820849986238988
-4 0.20189651799465538
-3 0.4065696597405991
-2 0.6703200460356393
-1 0.9048374180359595
0 1.0
1 0.9048374180359595
2 0.6703200460356393
3 0.4065696597405991
4 0.20189651799465538
5 0.0820849986238988
6 0.02732372244729256
7 0.007446583070924338
8 0.001661557273173934
9 0.0003035391380788668
10 4.5399929762484854e-05
you can use the following commands
set style fill transparent solid 0.7
plot "test.dat" with boxes, \
"test.dat" u ($1+4):2 with boxes
to get the following result (using the pngcairo terminal):
Using transparency as in user8153's solution is certainly the easiest way to visualize an overlap of two histograms.
This works even if the two histogram do not have identical bins or x-data-ranges.
However, the color of the overlap is pretty much bound to the colors of the two histogram and the level of transparency. Furthermore, if you want to show the overlap in the key you have to do it "manually".
Here is a solution where you can choose an independent color for the overlap area.
The overlap is basically the minimum y-value from both histograms for each x-value.
For this you need to compare the y-values for each x-value. This can be done in gnuplot with some "trick" by merging the two files line by line. This requires the data in a datablock (how to get it there from a file). Since this merging procedure is using indexing of datablock lines, it requires gnuplot>=5.2.0.
This assumes that you have the same x-range and bins for each histogram. If this is not the case, you have to implement some further steps.
Script: (works with gnuplot>=5.2.0, Sept. 2017)
### plot overlap of two histograms
reset session
# create some random test data
set samples 21
f(x,a,b) = 1./(a*(x-b)**4+1)
set table $Data1
plot '+' u 1:(f(x,0.01,-2)) w table
set table $Data2
plot '+' u 1:(f(x,0.02,4)) w table
unset table
set boxwidth 1.0
set grid y
set ytics 0.2
set multiplot layout 2,1
set style fill transparent solid 0.3
plot $Data1 u 1:2 w boxes lc 1 ti "Data1", \
$Data2 u 1:2 w boxes lc 2 ti "Data2"
set print $Overlap
do for [i=1:|$Data1|] { print $Data1[i].$Data2[i] }
set print
set style fill solid 0.3
plot $Data1 u 1:2 w boxes lc 1 ti "Data1", \
$Data2 u 1:2 w boxes lc 2 ti "Data2", \
$Overlap u 1:($2>$4?$4:$2) w boxes lc "red" ti "Overlap"
unset multiplot
### end of script
Result:

GNUPLOT - Two columns histogram with values on top of bars

yesteraday I made a similar question (this one). I could not display the value on top of bar in a gnuplot histogram. I lost many time because I couldn't find really good documentation about it, and I only can find similar issues on differents websites.
I lost many time with that but fortunately someone give me the solution. Now I am having a similar issue with an histogram with two bars, in which I have to put on top of both bars its value. I am quite near, or that is what I think, but I can't make it work properly. I am changing the script and regenerating the graph many times but I am not sure of what I am doing.
script.sh
#!/usr/bin/gnuplot
set term postscript
set terminal pngcairo nocrop enhanced size 600,400 font "Siemens Sans,8"
set termoption dash
set output salida
set boxwidth 0.8 absolute
set border 1
set style fill solid 1.00 border lt -1
set key off
set style histogram clustered gap 1 title textcolor lt -1
set datafile missing '-'
set style data histograms
set xtics border in scale 0,0 nomirror autojustify
set xtics norangelimit
set xtics ()
unset ytics
set title titulo font 'Siemens Sans-Bold,20'
set yrange [0.0000 : limite1] noreverse nowriteback
set y2range [0.0000 : limite2] noreverse nowriteback
show style line
set style line 1 lt 1 lc rgb color1 lw 1
set style line 2 lt 1 lc rgb color2 lw 1
## Last datafile plotted: "immigration.dat"
plot fuente using 2:xtic(1) ls 1 ti col axis x1y1, '' u 3 ls 2 ti col axis x1y2, '' u 0:2:2 with labels offset -3,1 , '' u 0:2:3 with labels offset 3,1
I am modifying the last code line, because is here where I set the labels. I have been able to show both labels, but in bad positions, I have also been able to show one of the labels in the right position but no the other. I have been able to show almost everything but the thing that I want. This is the graph that generates the script.
output.png
This is the source file that I use for generating the graph
source.dat
"Momento" "Torre 1" "Torre 2"
"May-16" 1500.8 787.8
"Jun-16" 1462.3 764.1
"Jul-16" 1311.2 615.4
"Ago-16" 1199.0 562.0
"Sep-16" 1480.0 713.8
"Oct-16" 1435.1 707.8
And that's the command that I execute with the parameters set
gnuplot -e "titulo='Energía consumida por torre (MWh)'; salida='output.png'; fuente='source.dat'; color1='#FF420E'; color2='#3465A4'; limite1='1800.96'; limite2='945.36'" script.sh
I think that is quite obvious what I am pretending, can someone help me?
Lots of thanks in advance.
Your script has several problems, the missing ti col is only one of them. (You can also use set key auto columnheader, then you must not give that option every time).
Don't use both y1 and y2 axis if you want to compare the values! Otherwise the correct bar heights are only a matter of luck...
Understand, how gnuplot positions the histogram bars, then you can exactly locate the top center of each bar. If you only use offset with char values (which is the case when you give only numbers), then your script will break as soon as you add or remove a data row.
The histogram clusters start at x-position 0, and are positioned centered at integer x values. Since you have two bars in each cluster and a gap of 1, the center of the first bar is at ($0 - 1/6.0) (= 1/(2 * (numberOfTorres + gapCount))), the second one at ($0 + 1/6.0):
set terminal pngcairo nocrop enhanced size 600,400 font ",8"
set output 'output.png'
set title 'Energía consumida por torre (MWh)' font ",20"
set boxwidth 0.8 absolute
set border 1
set style fill solid 1.00 border lt -1
set style histogram clustered gap 1 title textcolor lt -1
set style data histograms
set xtics border scale 1,0 nomirror autojustify norangelimit
unset ytics
set key off auto columnheader
set yrange [0:*]
set offset 0,0,graph 0.05,0
set linetype 1 lc rgb '#FF420E'
set linetype 2 lc rgb '#3465A4'
# dx = 1/(2 * (numberOfTorres + gap))
dx = 1/6.0
plot 'source.dat' using 2:xtic(1),\
'' u 3,\
'' u ($0 - dx):2:2 with labels,\
'' u ($0 + dx):3:3 with labels
Now, starting at the bars center you can safely use offset to specify only the offset relative to the bars top center:
plot 'source.dat' using 2:xtic(1),\
'' u 3,\
'' u ($0 - dx):2:2 with labels offset -1,1 ,\
'' u ($0 + dx):3:3 with labels offset 1,1
A second option would be to use the label's alignment: The labels of the red bars are right aligned at the bars right border, the labels of the blue bars are left aligned at the bars left border:
absoluteBoxwidth = 0.8
dx = 1/6.0 * (1 - absoluteBoxwidth)/2.0
plot 'source.dat' using 2:xtic(1),\
'' u 3,\
'' u ($0 - dx):2:2 with labels right offset 0,1 ,\
'' u ($0 + dx):3:3 with labels left offset 0,1
In any case, both options make your script more robust against changes of the input data.
This looks better :
plot fuente using 3:xtic(1) ls 1 ti col axis x1y1, '' u 3 ls 2 ti col axis x1y2, '' u ($0-1):3:3 with labels offset -3,1 , '' u ($0-1):2:2 with labels offset 3,1
You had 2 plots commands: only the first one was displayed.
Also, script.sh should be a bash script. This is a gnuplot script, so it should have another extension.
The problem is the ti col tab. You need to put it in every option, including labels and not only in bars. The right code is:
plot fuente using 2:xtic(1) ls 1 ti col, '' u 3 ls 2 ti col, '' u 0:2:2 ti col with labels offset -3,1 , '' u 0:3:3 ti col with labels offset 3,1
And that's how the picture is displayed now:
You can also avoid ti col and that is how it would look:

Gnuplot columnstacked histogram with errorbars

Suppose I have the following data file, so-qn.dat:
Type on on-err off off-err
good 75 5 55 4
bad 15 2 30 3
#other 10 1 15 2
which contains values on columns 2 and 4 and corresponding error deltas on columns 3 and 5.
I can produce a columnstacked histogram:
#!/usr/bin/gnuplot
set terminal png
set output 'so-qn.png'
set linetype 1 lc rgb "blue" lw 2 pt 0
set linetype 2 lc rgb "dark-red" lw 2 pt 0
set style data histograms
set style histogram columnstacked
set style fill solid
set ylabel "% of all items"
set yrange [0:100]
set boxwidth 0.75
set xtics scale 0
set xlabel "Option"
plot 'so-qn.dat' using 2 ti col, \
'' using 4:key(1) ti col
But I can’t figure out how to add errorbars to this. The closest I got so far is with
plot 'so-qn.dat' using 2 ti col, '' using 2:3 with yerrorbars lc rgb 'black' ti col, \
'' using 4:key(1) ti col, '' using 4:5:key(1) with yerrorbars lc rgb 'black' ti col
which produces
but only one of the error bars is in the right spot (I actually have no idea where the bottom left one gets its y from), one is completely invisible (hidden behind the right stack?), and I’d like the error bars to not show up in the key.
Is it possible to combine column-stacked histograms and error bars?
You can add errorbars to column-stacked histograms by manually adding plot-commands for the errorbars. To do so, you need, however, to keep track of the y-positions.
Therefore, let's introduce two variables which store the y-position for each of the two columns' errorbars.
y1 = -2
y2 = -4
You need to initialize these variables with -(number of column)
Next, let us define two functions that update the variables y1, y2.
f1(x) = (y1 = y1+x)
f2(x) = (y2 = y2+x)
Now, generate the desired plot via
plot 'so-qn.dat' using 2 ti col, \
'' using 4:key(1) ti col, \
'' using (0):(f1($2)):3 w yerr t "", \
'' using (1):(f2($4)):5 w yerr t ""
As you can see, you can supress the errorbars in the key by assigning an empty title (t ""). This approach even gives you more flexibility in customizing the appearance of the errorbars (e.g., assign different linestyles etc.).
This being said, I personally think this visualization is rather confusing. You might want to consider another visualization:
set bars fullwidth 0
set style data histograms
set style fill solid 1 border lt -1
set style histogram errorbars gap 2 lw 2
plot 'so-qn.dat' using 2:3:xtic(1) ti columnhead(2), \
'' using 4:5:xtic(1) ti columnhead(4)

GNUplot - plot data file (simple X and Y columns) - setting suitable color and scale on a figure

I have a simple file with two columns:
1 0.005467
2 0.005333
3 0.005467
4 0.005467
5 0.005600
6 0.005600
7 0.005467
8 0.005467
In the first column I have the x-axis values, while on the second column I have y-axis values. I would like to plot a figure of this data. I wrote a gnuplot script for this:
#!/usr/bin/gnuplot
set xlabel "test"
set ylabel "value"
set grid ytics lt 0 lw 1 lc rgb "#bbbbbb"
set grid xtics lt 0 lw 1 lc rgb "#bbbbbb"
set autoscale
set terminal postscript portrait enhanced mono dashed lw 1 'Helvetica' 14
set style line 1 lt 1 lw 3 pt 3 linecolor rgb "red"
set output 'out.eps'
plot 'data.txt' using 2:1 w points title "tests"
And, the output:
But of course, as a newbie in gnuplot, I have some troubles:
How to change the crosses on the fingure into dots?
How to change the color of the dots, to let's say, red? ( my command in my gnuplotscript seems not to work at all ...)
For the first test the adequate, accurate, exact value is 0.005467 but on my figure it doesnt look like so... I would like to place the dot on my figure for the first, second, third, (so on) test on the exact place, where is appropriate value.
How to add a grid to my figure? - SOLVED
How to get rid of the ugly text: 'data.txt' using 1:2 and replace it with a legend? - SOLVED
EDIT (SOLVED ISSUE NO 5)
plot 'data.txt' using 1:2 w points title "tests"
EDIT (SOLVED ISSUE NO 4)
set grid ytics lt 0 lw 1 lc rgb "#bbbbbb"
set grid xtics lt 0 lw 1 lc rgb "#bbbbbb"
You should read a bit in the documentation about all your commands!
Several remarks:
If you want colored points, you shouldn't use the mono (i.e. the monochrome) option, but rather color.
Your definition of the line style is correct, but in order to use it you must use linestyle 1 when plotting. Otherwise the linetype 1 is used. Compare:
set style line 1 lt 1 lw 3 pt 3 linecolor rgb "red"
plot x, 2*x linestyle 1
In order to see all the dots of a terminal, use the test command:
set terminal postscript eps enhanced color dashed lw 1 'Helvetica' 14
set output 'test.eps'
test
set output
You see, that for filled dots you must use pt 7.
I'm sure, that the points are shown at the correct values. Use
set ytics add (0.005467)
to see this.

setting multiple labels at the top of the x-axis

After the answer got in my earlier post drawing vertical lines in between bezier curves, I have been trying to label the segments separated by the dotted lines. I used x2label but found out that if I use it multiple times then the data gets replaced though they are positioned in different places. Below is the script:
set term x11 persist
set title "Animation curves"
set xlabel "Time (secs.)"
set ylabel "Parameter"
set x2label "Phoneme1" offset -35
set pointsize 2
set key off
set style line 2 lt 0 lc 1 lw 2
plot [0.04:0.15] "curve.dat" u 1:2 smooth csplines ls 1, "" u 1:($2-0.2):(0):(0.3) w vectors nohead ls 2, \
"curve.dat" u 1:2 with points
The output is the following.
I want to label Phoneme1, Phoneme2...and so on.. on top of each segment. How would I do it? Also as I was suggested in my earlier post to play with the line "" u 1:($2-0.2):(0):(0.3) w vectors nohead ls 2 to get a top to bottom vertical lines. But that also did not work. How do I get the lines from top margin to bottom? Thank you.
The horizontal lines
The horizontal lines can be accomplished with setting the yrange to an explicit value. Otherwise gnuplot would try to get some space between the lines and the axis. You could choose the values
set yrange [0.3:1.2]
Then you simply modify the vector using directions like so:
"" u 1:(0.3):(0):(1.2) w vectors nohead ls 2
(see below for the complete script)
The labeling of the sections
A quick way of doing this with your set of data would be this:
set key off
set style line 2 lt 0 lc 1 lw 2
set yrange [0.3:1.2]
plot [0.04:0.15] "Data.csv" u 1:2 smooth csplines ls 1, \
"" u 1:(0.3):(0):(1.2) w vectors nohead ls 2, \
"" u ($1+0.005):(1):(sprintf("P %d", $0)) w labels
However, this will probably not look the way you want it to look. You could think of modifying your data file to also include some information about the labeling like:
#x-value y-value x-label y-label label
0.06 0.694821399177 0.65 0.1 Phoneme1
0.07 0.543022222222 0.75 0.1 Phoneme2
Then the labels line would simply look like:
"" u 3:4:5 w labels
The complete plot then looks like this:

Resources