How to display y-labels on top of histogram bars in gnuplot

I designed a histogram in gnuplot, but the y-scale needs to be log2 because of the huge differences between the values. To improve the readability of the plot, I intend to display the concrete value on top of each bar. The values represent bytes, so I would like these labels to be formatted with kB, MB, ... prefixes as well, as is already done on the y-axis.
How can I achieve this?
These are the commands I'm currently using:
set terminal postscript eps enhanced dash color "" 13
reset
set datafile separator ","
set title "Bytes per Protocol"
set xlabel "Protocol"
set ylabel "Bytes" rotate by 90
set yrange [0:1342177280]
set logscale y 2
set format y '%.0s%cB'
set style data histogram
set boxwidth 0.5
set style fill solid
set xtics format ""
set grid ytics
set style data histogram
set style histogram clustered gap 2
set grid ytics
set tic scale 0
set size 1,0.9
set size ratio 0.5
set key autotitle columnhead
set output "ex_a_1_BIG.eps"
plot "ex_a_1_BIG.csv" using ($3):xtic(1) title "IN", \
'' using ($5):xtic(1) title "OUT", \
'' using 0:($3):($3) with labels center offset -2,1 notitle, \
'' using 0:($5):($5) with labels center offset 2,1 notitle
This is the content of the csv I want to plot (I only want the bytes in and out):
protocol,packets in,bytes in,packets out,bytes out
ICMP,1833,141562,979,60334
IGMP,0,0,283,14006
TCP,158214,129221151,130101,47734355
UDP,68476,9571677,72530,24310734

Check help format_specifiers and help gprintf, and see the example below.
One slightly unfortunate detail: in gnuplot the prefix for values from 1 to 999 is apparently a single space instead of an empty string.
For example, the format '%.1s %cB' leads to two spaces for 1-999 B and one space for the others, e.g. 1 kB. If you use '%.1s%cB' instead, you get one space for 1-999 B and no space for the others, e.g. 100kB. As far as I know, the correct form is one space between the number and the unit. I'm not sure whether there is an easy fix for this.
Code:
### prefixes
reset session
$Data <<EOD
1 1
2 12
3 123
4 1234
5 12345
6 123456
7 1234567
8 12345678
9 123456789
10 1234567890
11 12345678901
12 123456789012
13 1234567890123
EOD
set boxwidth 0.7
set style fill solid 1.0
set xtics 1
set yrange [0.5:8e13]
set multiplot layout 2,1
set logscale y # base of 10
set format y '%.0s %cB'
plot $Data u 1:2 w boxes lc rgb "green" notitle, \
'' u 1:2:(gprintf('%.1s %cB',$2)) w labels offset 0,1 not
set logscale y 2 # base of 2
set format y '%.0b %BB'
plot $Data u 1:2 w boxes lc rgb "red" notitle, \
'' u 1:2:(gprintf('%.1b %BB',$2)) w labels offset 0,1 not
unset multiplot
### end of code
Result:
Addition:
A workaround for the number/unit spacing issue, at least for the labels in the graph, would be:
myFmt(c) = column(c)>=1 && column(c)<1000 ? \
gprintf('%.1s%cB',column(c)) : gprintf('%.1s %cB',column(c))
and
plot $Data u 1:2 w boxes lc rgb "green" notitle, \
'' u 1:2:(myFmt(2)) w labels offset 0,1 not
For the ytics labels, however, I still don't have a solution.
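One possible way to carry the same workaround over to the ytics labels is to set the tic labels explicitly with set ytics add and to build each label string with gprintf. This is only a sketch and not part of the original answer: the helper fmtVal() and the loop bounds are assumptions chosen to match the yrange of the example above, and it covers the base-10 case only.

```gnuplot
### explicit ytics labels with consistent number/unit spacing (sketch)
reset session

# fmtVal() is a hypothetical helper: like myFmt(), but it takes a value, not a column
fmtVal(v) = (v >= 1 && v < 1000) ? gprintf('%.0s%cB', v) : gprintf('%.0s %cB', v)

set logscale y            # base 10
set yrange [0.5:8e13]

# give every decade tic from 1 B up to 10 TB an explicit label
do for [i=0:13] {
    v   = 10.0**i         # floating point to avoid integer overflow
    lbl = fmtVal(v)
    set ytics add (lbl v)
}

plot [1:13] 10**x with lines notitle
### end of sketch
```

The label is stored in a string variable before it is passed to set ytics add; that keeps the tic list to the plain ("label" position) form.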

Related

GNUPLOT - Output Array of Stacked Images

At the bottom of this post is gnuplot sample code that plots an array of .dat files numbered 001 to 103 and turns them into an array of .png files. Below are the first and the last image.
The question is: how do I stack the plots 001 to 103 on top of each other and produce an output array of 103 images in the process? So far, I've managed to produce one image that stacks all the combined data from 001.dat to 103.dat. See below.
The bit of code that produces the single stacked image is commented out:
# Test.png - One merged data
#filename(n) = sprintf("ar-agn--DensT-%03.0f.dat", n)
#plot for [i=1:103] filename(i) using 1:2:3 with points pointtype 5 ps 0.3 palette notitle
But what I need is an array of output images stacked on top of each other.
Thank you all in advance!!!
#!/bin/bash
# Comment out the 3 lines below to produce all in one stacked image
for FILE in ar-agn--DensT*.dat; do
gnuplot -p << EOF
set output "${FILE}.png"
set terminal png
# uncomment line below for all in one merged data
#set output "TEST.png"
set datafile separator ","
set xlabel "x-units" font ",16"
set ylabel "y-units" font ",16"
set cblabel "y-units" font ",16"
set tics font ", 16"
set xzeroaxis
# Temp vs Density
set yr [0.0:8.0]
set xr [-3.0:6.0]
set xlabel "log (Number density/(cm^{-3}) )"
set ylabel "log (Temperature/ K )"
set cbrange [0.099949:10.2948]
set cblabel "Time (Myr)"
set palette defined ( \
0 '#0c0887' ,\
1 '#4b03a1' ,\
2 '#7d03a8' ,\
3 '#a82296' ,\
4 '#cb4679' ,\
5 '#e56b5d' ,\
6 '#f89441' ,\
7 '#fdc328' ,\
8 '#f0f921' )
# Series of subsequent plots
plot "${FILE}" u 1:2:3 with points pointtype 5 ps 0.3 palette notitle
# Test.png - One merged data
#filename(n) = sprintf("ar-agn--DensT-%03.0f.dat", n)
#plot for [i=1:103] filename(i) using 1:2:3 with points pointtype 5 ps 0.3 palette notitle
EOF
# insert comment into line below for all in one merged data
done
You were very close. You need two iterations: one outside the plot command that loops over the output images, and one inside it that accumulates the data files:
filename(n) = sprintf("ar-agn--DensT-%03.0f.dat", n)
outfile(n) = sprintf("ar-agn--DensT-%03.0f.png", n)
do for [N=1:103] {
set output outfile(N)
plot for [i=1:N] filename(i) using 1:2:3 with points pointtype 5 ps 0.3 palette notitle
}
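Put together with the settings from the question, a complete gnuplot-only script might look like the sketch below. It assumes that all 103 data files sit in the working directory, and it keeps the axis and color ranges fixed so that the frames stay comparable. The shell loop is then no longer needed.

```gnuplot
### cumulative frames 001 ... 103 (sketch)
set terminal png
set datafile separator ","

# fixed ranges taken from the question, so every frame uses the same scale
set xrange [-3.0:6.0]
set yrange [0.0:8.0]
set cbrange [0.099949:10.2948]
# (labels and palette definition from the question omitted for brevity)

filename(n) = sprintf("ar-agn--DensT-%03.0f.dat", n)
outfile(n)  = sprintf("ar-agn--DensT-%03.0f.png", n)

# outer loop: one output image per N
do for [N=1:103] {
    set output outfile(N)
    # inner loop: image N contains the data of files 1 ... N
    plot for [i=1:N] filename(i) using 1:2:3 with points pointtype 5 ps 0.3 palette notitle
}
set output    # close the last image file
### end of sketch
```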

gnuplot histogram chart with overlap

I would like to plot a bar chart or histogram like this in gnuplot.
I tried set style histogram rowstacked which is a start but it adds the columns on top of each other while I need them overlapped. Next is the issue of transparent color shading.
Thanks for your feedback.
UPDATE: user8153 asked for additional data.
The set style histogram clustered gap 0.0 is doing the cluster mode of the histogram bars. If you blur the eye it sort-of shows what I want but with overlap and transparent shading.
The only other histogram modes given in the docs are rowstacked and columnstacked. I never got a plot out of columnstacked so I discarded it. Now rowstacked stacks the histogram bars.
The overlay appearance is there but it is wrong. I don't want the stacked appearance. The histograms have to overlay.
Code :
set boxwidth 1.0 absolute
set style fill solid 0.5 noborder
set style data histogram
set style histogram clustered gap 0.0
#set style histogram rowstacked gap 0.0
set xtics in rotate by 90 offset first +0.5,0 right
set yrange [0:8000]
set xrange [90:180]
plot 'dat1.raw' using 3 lc rgb 'orange', \
'dat2.raw' using 3 lc rgb 'blue', \
'dat3.raw' using 3 lc rgb 'magenta'
Thanks for your feedback.
Given a sample datafile test.dat
-10 4.5399929762484854e-05
-9 0.0003035391380788668
-8 0.001661557273173934
-7 0.007446583070924338
-6 0.02732372244729256
-5 0.0820849986238988
-4 0.20189651799465538
-3 0.4065696597405991
-2 0.6703200460356393
-1 0.9048374180359595
0 1.0
1 0.9048374180359595
2 0.6703200460356393
3 0.4065696597405991
4 0.20189651799465538
5 0.0820849986238988
6 0.02732372244729256
7 0.007446583070924338
8 0.001661557273173934
9 0.0003035391380788668
10 4.5399929762484854e-05
you can use the following commands
set style fill transparent solid 0.7
plot "test.dat" with boxes, \
"test.dat" u ($1+4):2 with boxes
to get the following result (using the pngcairo terminal):
Using transparency as in user8153's solution is certainly the easiest way to visualize the overlap of two histograms.
This works even if the two histograms do not have identical bins or x-data ranges.
However, the color of the overlap is pretty much determined by the colors of the two histograms and the level of transparency. Furthermore, if you want to show the overlap in the key, you have to do it "manually".
Here is a solution where you can choose an independent color for the overlap area.
The overlap is basically the minimum of the two histograms' y-values at each x-value.
To get it, you need to compare the y-values for each x-value. In gnuplot this can be done with a little "trick": merging the two datasets line by line. This requires the data to be in datablocks (if your data is in files, load it into datablocks first). Since the merging procedure indexes datablock lines, it requires gnuplot>=5.2.0.
This assumes that both histograms have the same x-range and bins. If this is not the case, you have to implement some further steps.
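If your histograms are stored in files, one way to get them into datablocks is to plot each file "with table" into a datablock. The following is a minimal sketch of that step; the file names hist1.dat and hist2.dat are hypothetical, and the script writes them itself only so that the example is self-contained.

```gnuplot
### load two files into datablocks (sketch, needs gnuplot>=5.2.0)
reset session

# create two small example files; the names are hypothetical
set print "hist1.dat"
do for [i=-3:3] { print sprintf("%d %g", i, exp(-((i+1)**2)/4.)) }
set print "hist2.dat"
do for [i=-3:3] { print sprintf("%d %g", i, exp(-((i-1)**2)/4.)) }
set print

# read each file into its own datablock
set table $Data1
    plot "hist1.dat" u 1:2 w table
set table $Data2
    plot "hist2.dat" u 1:2 w table
unset table

print |$Data1|, |$Data2|    # number of lines in each datablock
### end of sketch
```

After this, $Data1 and $Data2 can be used in place of the generated test data.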
Script: (works with gnuplot>=5.2.0, Sept. 2017)
### plot overlap of two histograms
reset session
# create some random test data
set samples 21
f(x,a,b) = 1./(a*(x-b)**4+1)
set table $Data1
plot '+' u 1:(f(x,0.01,-2)) w table
set table $Data2
plot '+' u 1:(f(x,0.02,4)) w table
unset table
set boxwidth 1.0
set grid y
set ytics 0.2
set multiplot layout 2,1
set style fill transparent solid 0.3
plot $Data1 u 1:2 w boxes lc 1 ti "Data1", \
$Data2 u 1:2 w boxes lc 2 ti "Data2"
set print $Overlap
do for [i=1:|$Data1|] { print $Data1[i].$Data2[i] }
set print
set style fill solid 0.3
plot $Data1 u 1:2 w boxes lc 1 ti "Data1", \
$Data2 u 1:2 w boxes lc 2 ti "Data2", \
$Overlap u 1:($2>$4?$4:$2) w boxes lc "red" ti "Overlap"
unset multiplot
### end of script
Result:

gnuplot: string values xticlabel & adjusting fontsize

I have data I would like to plot in a histogram style with a "cumulated" curve on top. I have the following problem:
My data consists of one column with the categories ("discharge") and one column with the number of values ("probability") that belong to the respective category. The last value of the category column is ">100", summarizing all power plants that have a bigger discharge than the last numeric value ("100 m^3/s"). I have not found a way to plot this last category and its values with the command plot 'datafile.dat' using 1:2 with boxes ... because (I assume) only numerical values are read for the x-tic labels in this case, so the last category is missing. If I plot it with the command plot 'datafile.dat' using 2:xtics(1) with boxes ... the last category ">100" is plotted just fine.
BUT: if I use the latter command, the x-axis labels appear in the normal font size, even though I have the line set format x '\footnotesize \%10.0f' in my code.
I have read that explicit labels in the plot command override a format that was set before (see "Changing ytic font size in gnuplot epslatex (multiplot)"), but I was not able to adapt this to my code.
Do you have an idea how to do this?
Excel screenshot to visualize what I want to achieve
'datafile.dat'
discharge probability cumulated
10 20 20%
20 10 10%
30 5 5%
40 6 6%
50 4 4%
60 12 12%
70 8 8%
80 15 15%
90 20 20%
100 6 6%
>100 4 4%
[terminal=epslatex,terminaloptions={size 15cm, 8cm font ",10"}]
set xrange [*:*]
set yrange [0:20]
set y2range [0:100]
set xlabel 'Discharge$' offset 0,-1
set ylabel 'No. of power plants' offset 10.5
set y2label 'Cumulated probability' offset -10
set format xy '$\%g$'
set format x '\footnotesize \%10.0f'
set format y '\footnotesize \%10.0f'
set format y2 '\footnotesize \%10.0f'
set xtics rotate by 45 center offset 0,-1
set style fill pattern border -1
set boxwidth 0.3 relative
set style line 1 lt 1 lc rgb 'black' lw 2 pt 6 ps 1 dt 2
plot 'datafile.dat' using 1:2 with boxes axes x1y1 fs pattern 6 lc black notitle, \
'datafile.dat' using 1:3 with linespoints axes x1y2 ls 1 notitle
I am confused by your datafile; the numbers in the third column do not seem to be cumulative, and do not add up to 100%. Here is a solution that uses only the first two columns of your file:
set term epslatex standalone header "\\usepackage[T1]{fontenc}"
set output 'test.tex'
stats "datafile.dat" using 2
total = STATS_sum
set xlabel "Discharge" offset 0, 1.5
set xtics rotate
set ylabel "No. of power plants"
set ytics nomirror
set yrange [0:*]
set y2label "Cumulative probability"
set y2tics
set y2range [0:]
set boxwidth 0.3 relative
set style line 1 lt 1 lc rgb 'black' lw 2 pt 6 ps 1 dt 2
plot \
'datafile.dat' using 2:xtic("\\footnotesize " . stringcolumn(1)) with boxes axes x1y1 fs pattern 6 lc black notitle, \
'datafile.dat' using ($2/total) smooth cumulative with linespoints axes x1y2 ls 1 notitle
set output
The trick is to add the LaTeX command \footnotesize in front of each label in the using command. The script also first computes the total number of power plants so that it can compute probabilities, and it computes the cumulative values with the smooth cumulative option.
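To see what the stats and smooth cumulative steps produce without going through LaTeX, you can tabulate the values and print them. This is a sketch for checking the numbers only: it copies the first two columns of the question's datafile into an inline datablock (gnuplot 5 or later), and the names $Counts and $Cum are made up for the example.

```gnuplot
### check the total and the cumulative fractions (sketch)
reset session

# first two columns of the question's datafile, without the header line
$Counts <<EOD
10 20
20 10
30 5
40 6
50 4
60 12
70 8
80 15
90 20
100 6
>100 4
EOD

# sum of the second column
stats $Counts using 2 nooutput
total = STATS_sum
print sprintf("total = %g", total)

# tabulate row index vs. running sum of the fractions
set table $Cum
    plot $Counts using 0:($2/total) smooth cumulative
unset table
print $Cum
### end of sketch
```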

GNUPLOT - Two columns histogram with values on top of bars

Yesterday I asked a similar question (this one). I could not display the value on top of each bar in a gnuplot histogram. I lost a lot of time because I couldn't find really good documentation about it; I could only find similar issues on different websites.
Fortunately, someone gave me the solution. Now I am having a similar issue with a histogram with two bars per group, where I have to put each bar's value on top of it. I think I am quite close, but I can't make it work properly. I keep changing the script and regenerating the graph, but I am not sure what I am doing.
script.sh
#!/usr/bin/gnuplot
set term postscript
set terminal pngcairo nocrop enhanced size 600,400 font "Siemens Sans,8"
set termoption dash
set output salida
set boxwidth 0.8 absolute
set border 1
set style fill solid 1.00 border lt -1
set key off
set style histogram clustered gap 1 title textcolor lt -1
set datafile missing '-'
set style data histograms
set xtics border in scale 0,0 nomirror autojustify
set xtics norangelimit
set xtics ()
unset ytics
set title titulo font 'Siemens Sans-Bold,20'
set yrange [0.0000 : limite1] noreverse nowriteback
set y2range [0.0000 : limite2] noreverse nowriteback
show style line
set style line 1 lt 1 lc rgb color1 lw 1
set style line 2 lt 1 lc rgb color2 lw 1
## Last datafile plotted: "immigration.dat"
plot fuente using 2:xtic(1) ls 1 ti col axis x1y1, '' u 3 ls 2 ti col axis x1y2, '' u 0:2:2 with labels offset -3,1 , '' u 0:2:3 with labels offset 3,1
I am modifying the last line of code, because that is where I set the labels. I have been able to show both labels, but in the wrong positions; I have also been able to show one of the labels in the right position but not the other. I have been able to show almost everything except what I want. This is the graph that the script generates.
output.png
This is the source file that I use for generating the graph
source.dat
"Momento" "Torre 1" "Torre 2"
"May-16" 1500.8 787.8
"Jun-16" 1462.3 764.1
"Jul-16" 1311.2 615.4
"Ago-16" 1199.0 562.0
"Sep-16" 1480.0 713.8
"Oct-16" 1435.1 707.8
And that's the command that I execute with the parameters set
gnuplot -e "titulo='Energía consumida por torre (MWh)'; salida='output.png'; fuente='source.dat'; color1='#FF420E'; color2='#3465A4'; limite1='1800.96'; limite2='945.36'" script.sh
I think it is quite obvious what I am trying to do. Can someone help me?
Many thanks in advance.
Your script has several problems; the missing ti col is only one of them. (You can also use set key auto columnheader; then you don't have to give that option every time.)
Don't use both the y1 and the y2 axis if you want to compare the values! Otherwise the correct bar heights are only a matter of luck...
Understand how gnuplot positions the histogram bars; then you can locate the top center of each bar exactly. If you only use offset with char values (which is the case when you give only numbers), your script will break as soon as you add or remove a data row.
The histogram clusters start at x-position 0 and are centered at integer x values. Since you have two bars in each cluster and a gap of 1, the center of the first bar is at ($0 - 1/6.0), where 1/6.0 = 1/(2 * (numberOfTorres + gapCount)), and the center of the second one is at ($0 + 1/6.0):
set terminal pngcairo nocrop enhanced size 600,400 font ",8"
set output 'output.png'
set title 'Energía consumida por torre (MWh)' font ",20"
set boxwidth 0.8 absolute
set border 1
set style fill solid 1.00 border lt -1
set style histogram clustered gap 1 title textcolor lt -1
set style data histograms
set xtics border scale 1,0 nomirror autojustify norangelimit
unset ytics
set key off auto columnheader
set yrange [0:*]
set offset 0,0,graph 0.05,0
set linetype 1 lc rgb '#FF420E'
set linetype 2 lc rgb '#3465A4'
# dx = 1/(2 * (numberOfTorres + gap))
dx = 1/6.0
plot 'source.dat' using 2:xtic(1),\
'' u 3,\
'' u ($0 - dx):2:2 with labels,\
'' u ($0 + dx):3:3 with labels
Now, starting from each bar's center, you can safely use offset to specify only the offset relative to the bar's top center:
plot 'source.dat' using 2:xtic(1),\
'' u 3,\
'' u ($0 - dx):2:2 with labels offset -1,1 ,\
'' u ($0 + dx):3:3 with labels offset 1,1
A second option would be to use the labels' alignment: the labels of the red bars are right-aligned at the bars' right border, and the labels of the blue bars are left-aligned at the bars' left border:
absoluteBoxwidth = 0.8
dx = 1/6.0 * (1 - absoluteBoxwidth)/2.0
plot 'source.dat' using 2:xtic(1),\
'' u 3,\
'' u ($0 - dx):2:2 with labels right offset 0,1 ,\
'' u ($0 + dx):3:3 with labels left offset 0,1
In any case, both options make your script more robust against changes of the input data.
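The dx = 1/6.0 used above looks like the two-bar case of a more general rule. Assuming that gnuplot gives every bar and every gap unit the same slot width 1/(n + gap) within a cluster (which is consistent with the 1/6.0 above), the offset of the i-th of n bars from the cluster center can be written as a small function. The helper barOffset() and the variable names are made up for this sketch, and the data is a shortened copy of source.dat as an inline datablock (gnuplot 5 or later).

```gnuplot
### label positions for n bars per cluster (sketch)
reset session

$Data <<EOD
"Momento" "Torre 1" "Torre 2"
"May-16" 1500.8 787.8
"Jun-16" 1462.3 764.1
"Jul-16" 1311.2 615.4
EOD

n        = 2     # bars per cluster
gapWidth = 1     # must match "set style histogram clustered gap ..."

# hypothetical helper: x offset of bar i (1-based) from the cluster center
barOffset(i) = (i - (n + 1)/2.0) / (n + gapWidth)

set style data histograms
set style histogram clustered gap 1
set style fill solid 1.00 border lt -1
set boxwidth 0.8 absolute
set key off auto columnheader
set yrange [0:*]
set offset 0,0,graph 0.05,0

plot $Data using 2:xtic(1), \
     $Data u 3, \
     $Data u ($0 + barOffset(1)):2:2 with labels offset 0,1, \
     $Data u ($0 + barOffset(2)):3:3 with labels offset 0,1
### end of sketch
```

For n = 2 and a gap of 1, barOffset(1) and barOffset(2) evaluate to -1/6 and +1/6, i.e. the same positions as ($0 - dx) and ($0 + dx).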
This looks better:
plot fuente using 3:xtic(1) ls 1 ti col axis x1y1, '' u 3 ls 2 ti col axis x1y2, '' u ($0-1):3:3 with labels offset -3,1 , '' u ($0-1):2:2 with labels offset 3,1
You had two plot commands; only the first one was displayed.
Also, script.sh should be a bash script. This is a gnuplot script, so it should have another extension.
The problem is the ti col option. You need to add it to every plot element, including the labels and not only the bars. The correct code is:
plot fuente using 2:xtic(1) ls 1 ti col, '' u 3 ls 2 ti col, '' u 0:2:2 ti col with labels offset -3,1 , '' u 0:3:3 ti col with labels offset 3,1
And that's how the picture is displayed now:
You can also avoid ti col and that is how it would look:

gnuplot: Not enough columns for variable color

I am executing the following gnuplot script:
set title "Efficiency scatter plot"
set xlabel "perf_1"
set ylabel "secondary report"
set log x
set log y
set xrange [0.1:40.0]
set yrange [0.1:40.0]
set terminal png medium
set output "./graph1.png"
set size square
set multiplot
set pointsize 0.3
set style line 6 pt 6
set datafile separator ","
set border 3
set xtics nomirror
set ytics nomirror
plot '/tmp/data.csv' using 3:1 with points pt 1 lt 3 lc var title "perf_20140113131309", \
'/tmp/data.csv' using 3:2 with points pt 1 lt 1 lc var title "perf_1"
plot x notitle
plot 2*x notitle
plot 0.5*x notitle
and I get the following error message:
"script.gnuplot", line 20: Not enough columns for variable color
Could you please help me find out what I am doing wrong?
By the way, the gnuplot version is '4.6 patchlevel 3'. The data.csv file used is:
0.1,0.1,40.0
0.14,0.14,40.0
0.32,0.32,40.0
0.7,0.74,40.0
Thanks in advance!
That means that you need to specify one more column in your using statement: the first one is the x-coordinate and the second one is the y-coordinate, but the one for the variable line color is missing.
Use e.g.
plot '/tmp/data.csv' using 3:1:0 with points pt 1 lt 3 lc var
to use the row number (zeroth column) as the linetype index. You can also use e.g. linecolor palette to select the color from the currently defined color palette.
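Both variants side by side might look like the sketch below. It uses the data from the question as an inline datablock, which needs gnuplot 5.0 or later; with 4.6 you would point the plot commands at '/tmp/data.csv' instead. Using column 1 as the palette value is just an arbitrary choice for illustration.

```gnuplot
### variable color needs a third using column (sketch)
reset session
set datafile separator ","

# data from the question as an inline datablock
$Data <<EOD
0.1,0.1,40.0
0.14,0.14,40.0
0.32,0.32,40.0
0.7,0.74,40.0
EOD

set logscale xy
set xrange [0.1:40.0]
set yrange [0.1:40.0]

# third column = linetype index (row number + 1) or palette value (column 1)
plot $Data using 3:1:($0+1) with points pt 1 lc variable title "row number as linetype", \
     $Data using 3:2:1 with points pt 6 lc palette title "column 1 as palette value"
### end of sketch
```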
