I want to plot a histogram with a broken Y axis. Good tutorials have been given here and here, but they don't fit my needs. The data points are:
"Method" "Year1" "Year2"
M1 12 -40
M2 5 40
The code snippet for these data points is:
set ylabel "The Profit (%)"
set style data histogram
set style histogram cluster gap 1
# Draw a horizontal line at Y=0
set arrow 1 from -1,0 to 2,0 nohead
plot 'test_data.txt' using 2:xtic(1) ti col lc rgb "black", '' u 3 ti col lc rgb "grey"
And the output looks like
As you can see, the grey bars take extreme values. What I want is to limit the yrange to -20 to +20, put a ~~ symbol (rotated by 90 degrees) on the second bars, and label them -40 and +40. Something like this figure:
How is that possible?
You can do it, but it is very tedious:
Crop the y-values in the using statement of your histograms
Plot a label with the labels plotting style when the value is above or below a given limit.
Plot vectors that show that the boxes are truncated.
The following script works:
set ylabel "The Profit (%)"
set style histogram cluster gap 1
set boxwidth 0.9 relative
# Draw a horizontal line at Y=0
set xzeroaxis lt -1
ulim = 15
llim = -15
set yrange[-20:20]
sc = 0.333
set style fill solid noborder
plot 'test_data.txt' using ($2 > ulim ? ulim : ($2 < llim ? llim : $2)):xtic(1) ti col lc rgb "black" with histogram, \
'' u ($3 > ulim ? ulim : ($3 < llim ? llim : $3)) ti col lc rgb "grey" with histogram,\
for [c=2:3] '' u ($0-1+(c-2.5)*sc):(column(c) > ulim ? ulim : 1/0):(sprintf('+%d', ulim)) with labels offset 0, char 1.5 notitle,\
for [c=2:3] '' u ($0-1+(c-2.5)*sc):(column(c) < llim ? llim : 1/0):(sprintf('%d', llim)) with labels offset 0, char -1.5 notitle,\
for [c=2:3] for [ofs=0:1] '' u ($0-1+(c-2.5)*sc - 0.03 + ofs*0.02):\
(column(c) > ulim ? ulim - 1 : (column(c) < llim ? llim - 1 : 1/0)):(0.04):(2) with vectors lc rgb 'black' nohead notitle
and gives the following result with 4.6.3:
There is too much involved to explain everything, so here are some important remarks:
The histogram boxes are placed starting from 0 and are given a custom label. This is important for the placement of the labels and the vectors ($0-1 in the using statement).
The factor sc = 0.333 results from the three slots per xtic (Year1, Year2, and the gap of 1).
The method works for both columns 2 and 3.
The script gives some warnings because some plots are empty (no value of column 2 exceeds the limits, so the respective label and vector plots contain no points).
I think it's not practical to use curves to indicate the broken boxes.
If your boxes have borders, they would appear also on top of the broken boxes, which might be counterintuitive.
Use either set xzeroaxis to draw a line at y=0, or an arrow with graph coordinates (set arrow from graph 0,first 0 to graph 1, first 0 nohead).
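The cropping and positioning arithmetic from the script above can be checked outside gnuplot. The following is a minimal Python sketch, not part of the original answer: the helper names crop and bar_center are made up for illustration, while the limits and the factor sc are taken from the script.

```python
# Illustrative sketch only: crop() and bar_center() are hypothetical helpers
# that mirror the expressions in the gnuplot `using` statements above.
ULIM, LLIM, SC = 15, -15, 1 / 3


def crop(value, llim=LLIM, ulim=ULIM):
    """Clamp a bar height to [llim, ulim], like the nested ternary in `using`."""
    return max(llim, min(ulim, value))


def bar_center(row, column, sc=SC):
    """x position of the bar for data column 2 or 3 in the cluster at `row`.

    `row` is the 0-based cluster index (the script uses $0-1 for this).
    """
    return row + (column - 2.5) * sc


rows = [("M1", 12, -40), ("M2", 5, 40)]
for row, (name, year1, year2) in enumerate(rows):
    for column, value in ((2, year1), (3, year2)):
        state = "truncated" if crop(value) != value else "unchanged"
        print(name, column, round(bar_center(row, column), 3), crop(value), state)
```

Only the two Year2 values are reported as truncated, which is why the label and vector plots for column 2 are empty.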
My question is similar to this one:
vary point color based on column value for multiple data blocks gnuplot
Except that no explanation was given there for the syntax used and what it means.
The data looks like this, with columns separated by commas and rows by newlines:
0, 0F_0F_0F_0F_0F, 0_0_0_0_0_0_0_0_0_0, 1_0_0_0_0_0_0_0_0_0
4.046025985, 0F_2Fo_0F_2Fo_0F, 0_0_1_0_0_0_0_0_1_0, 1_1_0_0_0_0_1_0_0_0
2.941144083, 0F_0F_0F_0F_0F, 0_0_1_0_0_1_0_0_0_1, 1_0_0_0_1_0_0_0_0_0
1.836301245, 0F_0F_0F_2Fo_0F, 0_0_0_0_0_0_0_0_0_0, 1_0_0_0_0_0_0_0_0_0
0.90317579, 0F_0F_0F_2Fo_0F, 0_0_0_1_0_0_0_1_0_0, 1_0_1_0_0_1_0_0_1_0
3.826663156, 0F_0F_0F_0F_0F, 0_1_0_0_1_0_1_0_0_1, 1_0_1_0_0_0_0_0_0_0
In my datafile, there are 100 individual rows, where column 1 is to be used for the colour palette and columns 2-4 are labels for X,Y axes on two different plots
What I want is an X,Y scatter of columns 3 and 4, with column 1 used to colour each point on the plot.
Here is my script attempt:
set title "K and W Occupancy \n KcsA, Replica 0, 0 mV "
set xlabel "POT" font ",18"
set ylabel "Water" font ",18"
set cblabel "Free energy (kT)" font ",18"
set xtics rotate by -45
set xtics out font ", 13" nomirror
set ytics out font ", 13" nomirror
set pointsize 0.4
set xrange [0:100]
set yrange [0:100]
set cbrange [0:10]
# MATLAB jet color palette --> from https://github.com/Gnuplotting/gnuplot-palettes/blob/master/jet.pal
# palette
set palette defined (0 0.0 0.0 0.5, \
1 0.0 0.0 1.0, \
2 0.0 0.5 1.0, \
3 0.0 1.0 1.0, \
4 0.5 1.0 0.5, \
5 1.0 1.0 0.0, \
6 1.0 0.5 0.0, \
7 1.0 0.0 0.0, \
8 0.5 0.0 0.0 )
splot '$filename' using 3:4:($1 <= 10 ? 0 : 1) w p pointtype 5 pointsize 1 palette linewidth 10
I do not really know what this means:
($1 <= 10 ? 0 : 1)
Why does the script plot a 3D graph with the data incorrectly placed?
I expected a 2D plot with unique entries along the X and Y axes, with each point coloured along a colour scale.
The attempt described above results in a 3D plot and the points are incorrect.
Multiple answers to similar questions I have read do not explain what each term in the gnuplot script means, including:
Plotting style based on an entry in a data-file
gnuplot splot colors based on a fourth column of the data file
vary point color based on column value for multiple data blocks gnuplot
We don't have your data (if possible please always add minimized data) and we don't see your graph output.
I do not really know what this means: ($1 <= 10 ? 0 : 1)
This is the ternary operator; check help ternary. If the value in column 1 ($1) is less than or equal to 10, the expression returns 0; otherwise it returns 1.
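For readers more familiar with other languages, the same logic can be written as a conditional expression. This is a small Python sketch for illustration only; the function name color_value is made up, and the test values are chosen to straddle the threshold.

```python
def color_value(col1):
    """Python equivalent of the gnuplot expression ($1 <= 10 ? 0 : 1)."""
    return 0 if col1 <= 10 else 1


# 4.046025985 is taken from the sample data; 12.5 is an invented value above 10.
print([color_value(v) for v in (0, 4.046025985, 10, 12.5)])  # [0, 0, 0, 1]
```

In the sample rows shown in the question, every value in column 1 is below 10, so this expression would yield 0 for all of them and every point would get the same colour.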
Why does the script plot a 3D graph with the data incorrectly placed?
Because you told gnuplot to do so. Mind the difference between splot and plot; check help splot and help plot. splot requires x,y,z input, and your z is ($1 <= 10 ? 0 : 1).
So, without being able to test your case, your command probably should be something like this:
plot '$filename' u 3:4:1 w p pt 5 ps 1 lc palette
Addition:
If I understood your question correctly, I guess there is no off-the-shelf plotting style for this.
You need to:
create lists of unique elements (by (mis-)using stats, check help stats) for x and for y (in your case column 3 and 4). The list will be in the order of occurrence in the datafile. Unfortunately, gnuplot does not offer an internal alphanumerical sort of a list. If you want it sorted you need to either use external tools or a cumbersome gnuplot-only workaround.
define a function by (mis-)using sum (check help sum) which determines the index of a given item and use this index either as x- or y-coordinate
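The two steps above can be sketched in Python to make the idea explicit before reading the gnuplot version. This is an illustration under assumptions, not the answer's code: add_to_list and get_idx are hypothetical stand-ins for the gnuplot functions addToList and getIdx, and the rows are a subset of the test data used in the script.

```python
def add_to_list(uniq, item):
    """Append item to uniq only if unseen, keeping order of first occurrence."""
    if item not in uniq:
        uniq.append(item)
    return uniq


def get_idx(uniq, item):
    """1-based position of item in uniq (gnuplot's word() is 1-based as well)."""
    return uniq.index(item) + 1


rows = [
    (0.00, "0_0_0_0", "0_0_0_0"),
    (0.43, "0_1_1_1", "1_0_1_1"),
    (0.64, "0_1_1_1", "1_1_0_0"),
    (0.29, "0_1_0_1", "1_0_1_1"),
]

uniq_x, uniq_y = [], []
for _, x_label, y_label in rows:
    add_to_list(uniq_x, x_label)
    add_to_list(uniq_y, y_label)

# Each point becomes (x index, y index, colour value).
points = [(get_idx(uniq_x, x), get_idx(uniq_y, y), c) for c, x, y in rows]
print(points)
```

The indices serve as numeric coordinates, while the original strings are kept only as tic labels.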
Script:
### scatter plot with x,y strings
reset session
$Data <<EOD
0.00, 0F_0F_0F, 0_0_0_0, 0_0_0_0
0.43, 0F_0F_0F, 0_1_1_1, 1_0_1_1
0.64, 0F_0F_0F, 0_1_1_1, 1_1_0_0
0.73, 0F_0F_0F, 0_1_1_1, 0_1_1_1
0.29, 0F_0F_0F, 0_1_0_1, 1_0_1_1
0.34, 0F_0F_0F, 0_1_0_1, 1_1_1_1
0.45, 0F_0F_0F, 1_1_1_1, 1_0_1_1
0.10, 0F_0F_0F, 1_1_1_1, 0_1_1_1
0.99, 0F_0F_0F, 0_0_1_1, 1_1_0_0
EOD
uniqX = uniqY = ' '
addToList(uniq,col) = uniq.(strstrt(uniq,' '.strcol(col).' ') ? '' : strcol(col).' ' )
getIdx(list,s) = (_c=NaN, sum[_i=1:words(list)] (word(list,_i) eq s ? _c=_i : NaN) , _c)
set datafile separator comma
stats $Data u (uniqX=addToList(uniqX,3), uniqY=addToList(uniqY,4)) nooutput
set key noautotitle
set xtic noenhanced rotate by 90 right
set ytic noenhanced
set offsets 0.5,0.5,0.5,0.5
set bmargin 4
set size ratio -1
set grid x,y
set palette rgb 33,13,10
plot $Data u (getIdx(uniqX,strcol(3))):(getIdx(uniqY,strcol(4))):1:xtic(3):ytic(4) w p pt 5 ps 7 lc palette
### end of script
Result:
I haven't been able to find any example of what I'm trying to do in gnuplot after searching the docs and demos.
Essentially I want to plot the Blue, Green, and Red lines I manually drew on this output (for demonstration) at the 10/50/90% marks.
EDIT: For clarity, I'm looking to determine where the distribution lines hit the cumulative distribution at 0.1/0.5/0.9 to know which co-ordinates to draw the lines at. Thanks!
set terminal png size 1600,800 font "Consolas" 16
set output "test.png"
set title "PDF and CDF - 1000 Simulations"
set grid y2
set ylabel "Date Probability"
set y2range [0:1.00]
set y2tics 0.1
set y2label "Cumulative Distribution"
set xtics rotate by 90 offset 0,-5
set bmargin 6
plot "data.txt" using 1:3:xtic(2) notitle with boxes axes x1y1,'' using 1:4 notitle with linespoints axes x1y2
Depending on the number of points in your cumulative data curve you might need interpolation. The following example is chosen such that no original data point will be at your levels 10%, 50%, 90%. If your data is not steadily increasing, it will take the last value which matches your level(s).
The procedure is as follows:
plot your data to a dummy table.
check whether Level lies between two successive y-values (y0,y1).
remember the interpolated x-value in xp.
draw arrows from the borders of the graph to the point (xp,Level) (or instead use the partly outside rectangle "trick" from @Ethan).
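The interpolation step in this procedure can be sketched in Python. This is an illustration only: find_crossing is a made-up name, the sample points are invented values rather than the asker's data, and the guard against y0 == y1 is an addition that the gnuplot code does not have.

```python
def find_crossing(points, level):
    """Return the interpolated x where the curve reaches `level`.

    Walks successive pairs (x0, y0), (x1, y1) and, as in the gnuplot code,
    keeps the last segment that brackets the level.
    """
    xp = None
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= level <= y1 and y1 != y0:
            xp = x0 + (x1 - x0) * (level - y0) / (y1 - y0)
    return xp


# Invented cumulative values, chosen so that no point lies exactly on a level.
cdf = [(0, 0.0), (1, 0.2), (2, 0.6), (3, 1.0)]
for level in (0.1, 0.5, 0.9):
    print(level, find_crossing(cdf, level))
```

The returned x values are the coordinates at which the vertical and horizontal marker lines meet the curve.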
Code:
### linear interpolation of data
reset session
set colorsequence classic
set key left
# create some dummy data
set sample 10
set table $Data
plot [-2:2] '+' u 1:(norm(x)) with table
unset table
Interpolate(yi) = x0 + (x1-x0)*(yi-y0)/(y1-y0)
Levels = "0.1 0.5 0.9"
do for [i=1:words(Levels)] {
Level = word(Levels,i)
x0 = x1 = y0 = y1 = NaN
set table $Dummy
plot $Data u (x0=x1,x1=$1,y0=y1,y1=$2, (y0<=Level && Level<=y1)? (xp=Interpolate(Level)):NaN ): (Level) w table
unset table
set arrow i*2 from xp, graph 0 to xp,Level nohead lc i
set arrow i*2+1 from xp,Level to graph 1,Level nohead lc i
}
plot $Data u 1:2 w lp pt 7 lc 0 t "Original data"
### end code
Result:
It is not clear if you are asking how to find the x-coordinates at which your cumulative distribution line hits 0.1, 0.5, 0.9 (hard to do so I will leave that for now) or asking how to draw the lines once you know those x values. The latter part is easy. Think of the lines you want to draw as the unclipped portion of a rectangle that extends off the plot to the lower right:
set object 1 rectangle from x1, 0.1 to graph 2, -2 fillstyle empty border lc "blue"
set object 2 rectangle from x2, 0.5 to graph 2, -2 fillstyle empty border lc "green"
set object 3 rectangle from x3, 0.9 to graph 2, -2 fillstyle empty border lc "red"
plot ...
I am having some difficulty generating a plot of a data set that oscillates between negative and positive values (like a sin or cos). My goal is to fill the area under the curve with alternating colours: the negative region with blue and the positive with red. To be more precise, I want to fill the area between the curve and the x axis. So far I have managed to plot the curve with alternating colours (blue for negative, red for positive) using:
set palette model RGB defined ( 0 'red', 1 'blue' )
unset colorbox
plot 'data.set' u 1:2:( $2 < 0.0 ? 1 : 0 ) w lines lt 1 lw 4 palette
Unfortunately if I replace w lines with filledcurves I don't get an alternate fill. How can one accomplish this?
Cheers
If I understood the question correctly, you can try this:
plot '+' using 1:(0):(sin($1)) w filledc below, \
'+' using 1:(0):(sin($1)) w filledc above
which is telling gnuplot to fill the area between two curves (sin(x) and 0), using the above and below positions. There is another solution as well:
plot '+' using 1:(sin($1) > 0 ? sin($1):0) w filledcurves y1, \
'+' using 1:(sin($1) < 0 ? sin($1):0) w filledcurves y2
and the result would be:
The important part refers to the options part of filledcurves. See more details here and here.
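The second solution works by splitting the curve into a non-negative part and a non-positive part, each of which is then filled separately. A minimal Python sketch of that split follows; the sample grid is arbitrary and the variable names are made up for illustration.

```python
import math

# Arbitrary sample grid from -2 to 2 in steps of 0.5.
xs = [i * 0.5 for i in range(-4, 5)]

# Same logic as (sin($1) > 0 ? sin($1) : 0) and (sin($1) < 0 ? sin($1) : 0).
positive = [max(math.sin(x), 0.0) for x in xs]
negative = [min(math.sin(x), 0.0) for x in xs]

for x, p, n in zip(xs, positive, negative):
    print(f"{x:5.1f} {p:8.4f} {n:8.4f}")
```

At every x one of the two parts is zero, so the two filled regions never overlap and together reproduce the original curve.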
Yesterday I asked a similar question (this one). I could not display the value on top of each bar in a gnuplot histogram. I lost a lot of time because I couldn't find really good documentation about it, and I could only find similar issues on different websites.
Fortunately, someone gave me the solution. Now I am having a similar issue with a histogram with two bars per cluster, where I have to put the value on top of both bars. I am quite close, or so I think, but I can't make it work properly. I keep changing the script and regenerating the graph, but I am not sure what I am doing.
script.sh
#!/usr/bin/gnuplot
set term postscript
set terminal pngcairo nocrop enhanced size 600,400 font "Siemens Sans,8"
set termoption dash
set output salida
set boxwidth 0.8 absolute
set border 1
set style fill solid 1.00 border lt -1
set key off
set style histogram clustered gap 1 title textcolor lt -1
set datafile missing '-'
set style data histograms
set xtics border in scale 0,0 nomirror autojustify
set xtics norangelimit
set xtics ()
unset ytics
set title titulo font 'Siemens Sans-Bold,20'
set yrange [0.0000 : limite1] noreverse nowriteback
set y2range [0.0000 : limite2] noreverse nowriteback
show style line
set style line 1 lt 1 lc rgb color1 lw 1
set style line 2 lt 1 lc rgb color2 lw 1
## Last datafile plotted: "immigration.dat"
plot fuente using 2:xtic(1) ls 1 ti col axis x1y1, '' u 3 ls 2 ti col axis x1y2, '' u 0:2:2 with labels offset -3,1 , '' u 0:2:3 with labels offset 3,1
I am modifying the last line of the code, because that is where I set the labels. I have been able to show both labels, but in the wrong positions; I have also been able to show one of the labels in the right position but not the other. I have been able to show almost everything except what I want. This is the graph that the script generates:
output.png
This is the source file that I use for generating the graph
source.dat
"Momento" "Torre 1" "Torre 2"
"May-16" 1500.8 787.8
"Jun-16" 1462.3 764.1
"Jul-16" 1311.2 615.4
"Ago-16" 1199.0 562.0
"Sep-16" 1480.0 713.8
"Oct-16" 1435.1 707.8
And that's the command that I execute with the parameters set
gnuplot -e "titulo='Energía consumida por torre (MWh)'; salida='output.png'; fuente='source.dat'; color1='#FF420E'; color2='#3465A4'; limite1='1800.96'; limite2='945.36'" script.sh
I think it is quite obvious what I am trying to do; can someone help me?
Many thanks in advance.
Your script has several problems; the missing ti col is only one of them. (You can also use set key auto columnheader, then you don't have to give that option every time.)
Don't use both y1 and y2 axis if you want to compare the values! Otherwise the correct bar heights are only a matter of luck...
Understand how gnuplot positions the histogram bars; then you can locate the top center of each bar exactly. If you only use offset with char values (which is the case when you give only numbers), then your script will break as soon as you add or remove a data row.
The histogram clusters start at x-position 0, and are positioned centered at integer x values. Since you have two bars in each cluster and a gap of 1, the center of the first bar is at ($0 - 1/6.0), where 1/6.0 = 1/(2 * (numberOfTorres + gapCount)), and the second one is at ($0 + 1/6.0):
set terminal pngcairo nocrop enhanced size 600,400 font ",8"
set output 'output.png'
set title 'Energía consumida por torre (MWh)' font ",20"
set boxwidth 0.8 absolute
set border 1
set style fill solid 1.00 border lt -1
set style histogram clustered gap 1 title textcolor lt -1
set style data histograms
set xtics border scale 1,0 nomirror autojustify norangelimit
unset ytics
set key off auto columnheader
set yrange [0:*]
set offset 0,0,graph 0.05,0
set linetype 1 lc rgb '#FF420E'
set linetype 2 lc rgb '#3465A4'
# dx = 1/(2 * (numberOfTorres + gap))
dx = 1/6.0
plot 'source.dat' using 2:xtic(1),\
'' u 3,\
'' u ($0 - dx):2:2 with labels,\
'' u ($0 + dx):3:3 with labels
Now, starting at the bar's center, you can safely use offset to specify only the offset relative to the bar's top center:
plot 'source.dat' using 2:xtic(1),\
'' u 3,\
'' u ($0 - dx):2:2 with labels offset -1,1 ,\
'' u ($0 + dx):3:3 with labels offset 1,1
A second option would be to use the label's alignment: the labels of the red bars are right-aligned at the bar's right border, and the labels of the blue bars are left-aligned at the bar's left border:
absoluteBoxwidth = 0.8
dx = 1/6.0 * (1 - absoluteBoxwidth)/2.0
plot 'source.dat' using 2:xtic(1),\
'' u 3,\
'' u ($0 - dx):2:2 with labels right offset 0,1 ,\
'' u ($0 + dx):3:3 with labels left offset 0,1
In any case, both options make your script more robust against changes of the input data.
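The bar-center arithmetic used in this answer can be checked with a short Python sketch. The function bar_centers is hypothetical, and the generalisation to an arbitrary number of bars is an assumption that extends the two-bar case described above (each cluster has width 1 and is split into n_bars + gap equal slots).

```python
def bar_centers(cluster_index, n_bars, gap=1):
    """x positions of the bar centers in one clustered-histogram group.

    Assumes the cluster is centered on `cluster_index` and divided into
    (n_bars + gap) slots of equal width, as in the two-bar case above.
    """
    slot = 1.0 / (n_bars + gap)
    middle = (n_bars - 1) / 2.0
    return [cluster_index + (k - middle) * slot for k in range(n_bars)]


# Two bars and a gap of 1: centers at -1/6 and +1/6 around the cluster position.
print(bar_centers(0, 2))
print(bar_centers(3, 2))
```

For two bars and a gap of 1 this reproduces dx = 1/6.0 from the script, so the label positions stay correct when data rows are added or removed.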
This looks better :
plot fuente using 3:xtic(1) ls 1 ti col axis x1y1, '' u 3 ls 2 ti col axis x1y2, '' u ($0-1):3:3 with labels offset -3,1 , '' u ($0-1):2:2 with labels offset 3,1
You had 2 plot commands: only the first one was displayed.
Also, script.sh should be a bash script. This is a gnuplot script, so it should have another extension.
The problem is the ti col option. You need to add it to every plot element, including the labels, not only the bars. The correct code is:
plot fuente using 2:xtic(1) ls 1 ti col, '' u 3 ls 2 ti col, '' u 0:2:2 ti col with labels offset -3,1 , '' u 0:3:3 ti col with labels offset 3,1
And that's how the picture is displayed now:
You can also avoid ti col and that is how it would look:
I have a set of points "data" defining a curve that I want to plot with bezier smoothing.
I also want to fill the area below that curve between some pairs of x values.
If I only had one pair of x values it would not be that difficult, because I could define a new set of data and plot it with filledcu. Example:
The problem is that I want to do that several times in the same plot.
Edit: Minimal working example:
#!/usr/bin/gnuplot
set terminal wxt enhanced font 'Verdana,12'
set style fill transparent solid 0.35 noborder
plot 'data' using 1:2 smooth sbezier with lines ls 1
pause -1
Where the structure of 'data' is:
x_point y_point
And I realized that my problem is that in fact I can't fill even one curve; it only seems to be filled because the slope is almost constant there.
To fill parts below a curve, you must use the filledcurves style. With the option x1 you fill the part between the curve and the x-axis.
In order to fill only parts of the curve, you must filter your data, i.e. give the x-values a value of 1/0 (invalid data point) if they are outside of the desired range, and the correct value from the data file otherwise. At the end you plot the curve itself:
set style fill transparent solid 0.35 noborder
filter(x,min,max) = (x > min && x < max) ? x : 1/0
plot 'data' using (filter($1, -1, -0.5)):2 with filledcurves x1 lt 1 notitle,\
'' using (filter($1, 0.2, 0.8)):2 with filledcurves x1 lt 1 notitle,\
'' using 1:2 with lines lw 3 lt 1 title 'curve'
This fills the ranges [-1:-0.5] and [0.2:0.8].
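The behaviour of the filter function can be illustrated with a small Python sketch. Here filter_x is a made-up name and math.nan stands in for gnuplot's undefined value 1/0; this is an analogy, not how gnuplot represents invalid points internally.

```python
import math


def filter_x(x, lo, hi):
    """Keep x if it lies strictly inside (lo, hi); otherwise mark it invalid."""
    return x if lo < x < hi else math.nan


# Arbitrary sample x values from -1.5 to 1.0 in steps of 0.25.
xs = [-1.5 + 0.25 * i for i in range(11)]
kept = [x for x in xs if not math.isnan(filter_x(x, -1, -0.5))]
print(kept)
```

Only the points strictly between the limits survive, so the filled region covers just that part of the curve; the boundary values themselves are excluded because the comparison is strict.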
To give a working example, I use the special filename +:
set samples 100
set xrange [-2:2]
f(x) = -x**2 + 4
set linetype 1 lc rgb '#A3001E'
set style fill transparent solid 0.35 noborder
filter(x,min,max) = (x > min && x < max) ? x : 1/0
plot '+' using (filter($1, -1, -0.5)):(f($1)) with filledcurves x1 lt 1 notitle,\
'' using (filter($1, 0.2, 0.8)):(f($1)) with filledcurves x1 lt 1 notitle,\
'' using 1:(f($1)) with lines lw 3 lt 1 title 'curve'
With the result (with 4.6.4):
If you must use some kind of smoothing, the filter may affect the data curve differently, depending on the filtered part. You can first write the smoothed data to a temporary file and then use this for 'normal' plotting:
set table 'data-smoothed'
plot 'data' using 1:2 smooth bezier
unset table
set style fill transparent solid 0.35 noborder
filter(x,min,max) = (x > min && x < max) ? x : 1/0
plot 'data-smoothed' using (filter($1, -1, -0.5)):2 with filledcurves x1 lt 1 notitle,\
'' using (filter($1, 0.2, 0.8)):2 with filledcurves x1 lt 1 notitle,\
'' using 1:2 with lines lw 3 lt 1 title 'curve'