Box and whisker plot GNUPLOT - gnuplot

I need to visualize some data with box-and-whisker plots, and I'd like to do it in gnuplot. So far I have converted my data into what I understand gnuplot needs: the minimum, first quartile, median, third quartile, and maximum.
This is the data I have:
#x min Q1 median q3 max width label
1 9.9 10.25 10.7 10.975 11.3 0.3 100
2 23.5 25.525 26.05 27.85 29.1 0.3 200
3 37.5 40.8 43.65 44.35 45.7 0.3 300
4 55 58.25 58.65 61.875 65.9 0.3 400
5 71.3 73.65 75.25 77.4 80.1 0.3 500
6 73.6 83.85 86.05 88.775 97.5 0.3 600
7 85.8 89.45 97.3 103.75 106 0.3 700
8 102 111 112 115.5 119 0.3 800
9 116 127 128 134 141 0.3 900
10 126 134 136 140.25 146 0.3 1000
11 144 149 152 156.25 165 0.3 1100
12 144 151.25 154 158 166 0.3 1200
13 138 157.25 159 162 171 0.3 1300
14 155 161.25 165.5 170 173 0.3 1400
15 158 171 172.5 177.5 182 0.3 1500
I have made this graph in Excel
But I need to have more graphs in the same image, which is something I cannot do in Excel. I have been messing around with GNUPLOT for a couple of hours, trying to use candlesticks, but all the graphs I get are wrong!
I've uploaded my data-file to DROPBOX https://dl.dropboxusercontent.com/u/12340447/data.txt
Any help is greatly appreciated!
EDIT:
I should probably include the script I currently have
set bars 2.0
set style fill empty
plot 'data.txt' using 1:3:2:6:5:xticlabels(7) with candlesticks title 'Quartiles' whiskerbars, \
'' using 1:4:4:4:4 with candlesticks lt -1 notitle
This gives the output
There are a few things wrong with the picture. First of all, the labels are wrong: they all say 0.3, but that is supposed to be the width of the boxplots. I'd also like to add a line (as in the Excel chart) through each mean value, marked with a dot or a cross or something. Basically, I'd like it to look a little more like the Excel output.
Again, any help is greatly appreciated!

The labels were wrong because xticlabels(7) reads the width column. The labels are in column 8 of the data, so use xticlabels(8).
The last line of the plot command adds a blue line (lt 3) with diamond points (pt 13) through the values in column 4.
set bars 2.0
set style fill empty
plot 'data.txt' using 1:3:2:6:5:xticlabels(8) with candlesticks title 'Quartiles' whiskerbars, \
'' using 1:4:4:4:4 with candlesticks lt -1 notitle, \
'' using 1:4 with linespoints lt 3 pt 13 notitle
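The width column (column 7) does not have to go unused: candlesticks accepts an optional sixth `using` entry that sets the box width in x-axis units. A minimal sketch, assuming the same data.txt layout as in the question (x, min, Q1, median, Q3, max, width, label):

```gnuplot
# Sketch only: assumes data.txt has the columns
#   x  min  Q1  median  Q3  max  width  label
set bars 2.0
set style fill empty
# sixth entry (column 7) = box width, labels from column 8
plot 'data.txt' using 1:3:2:6:5:7:xticlabels(8) with candlesticks title 'Quartiles' whiskerbars, \
     '' using 1:4:4:4:4:7 with candlesticks lt -1 notitle, \
     '' using 1:4 with linespoints lt 3 pt 13 notitle
```

With a width of 0.3 and boxes spaced one unit apart, each box should cover roughly a third of the distance to its neighbour.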

Related

gnuplot fit extends beyond the assigned points. How to solve this?

I have two data files:
#veff S Pmax S Pmin
0.10 103 0.2135 152 -0.0505
0.11 104 0.2162 152 -0.0592
0.12 105 0.2177 152 -0.0669
and
#veff S Pmax S Pmin
0.13 106 0.2177 152 -0.0729
0.14 105 0.2162 152 -0.0778
0.15 105 0.2127 152 -0.0819
0.16 105 0.2078 152 -0.0858
0.17 105 0.2018 153 -0.0879
0.18 104 0.1959 153 -0.0889
0.19 104 0.1907 153 -0.0898
0.20 103 0.1860 153 -0.0921
When I try to fit them, the fitted curve extends beyond the data points. This is the code:
set terminal wxt
#set term postscript eps color enhanced
#set output "1.eps"
set xlabel "Pmax"
set ylabel "Pmin"
set title "Pmax vs Pmin"
unset key
FIT_LIMIT = 1e-6
f1(x)=a1*x*x+b1*x+c1
f2(x)=a2*x*x+b2*x+c2
fit [x=0.185:0.218] f1(x) "PmaxPminVEFF.txt" u 3:5 via a1,b1,c1
fit [x=0.212:0.218] f2(x) "PmaxPminVEFF1.txt" u 3:5 via a2,b2,c2
plot 'PmaxPminVEFF.txt' using 3:5 with p pt 7,f1(x) lc rgb "red",\
'PmaxPminVEFF1.txt' using 3:5 with p pt 8, f2(x) lc rgb "green"
The fitted line goes beyond the points. Please help me restrict the fitted curves to the range of the points only. The fit command is also not working.
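Check the following. The range given to fit only selects which data points enter the fit; it does not limit where the fitted function is drawn. To limit the drawn curve, put a sampling range in front of each function in the plot command.
Code: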
### Limit fitted plot to data
reset session
$Data1 <<EOD
#veff S Pmax S Pmin
0.13 106 0.2177 152 -0.0729
0.14 105 0.2162 152 -0.0778
0.15 105 0.2127 152 -0.0819
0.16 105 0.2078 152 -0.0858
0.17 105 0.2018 153 -0.0879
0.18 104 0.1959 153 -0.0889
0.19 104 0.1907 153 -0.0898
0.20 103 0.1860 153 -0.0921
EOD
$Data2 <<EOD
#veff S Pmax S Pmin
0.10 103 0.2135 152 -0.0505
0.11 104 0.2162 152 -0.0592
0.12 105 0.2177 152 -0.0669
EOD
set xlabel "Pmax"
set ylabel "Pmin"
set title "Pmax vs Pmin"
unset key
FIT_LIMIT = 1e-6
f1(x) = a1*x**2 + b1*x + c1
f2(x) = a2*x**2 + b2*x + c2
fit [x=0.185:0.218] f1(x) $Data1 u 3:5 via a1,b1,c1
fit [x=0.212:0.218] f2(x) $Data2 u 3:5 via a2,b2,c2
plot $Data1 using 3:5 with p pt 7, \
[x=0.185:0.218] f1(x) lc rgb "red",\
$Data2 using 3:5 with p pt 8, \
[x=0.212:0.218] f2(x) lc rgb "green"
### end of code
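If typing the limits by hand is inconvenient, the data extent can also be read with stats, and the function can be wrapped so that it returns NaN outside that extent. A rough sketch of that idea, using only the larger data set from the question; the names D1 and g1 are made up for this illustration, and the start values for the parameters are arbitrary:

```gnuplot
### Sketch: clip a fitted curve to the x-extent of its data
reset session
$Data1 <<EOD
#veff S Pmax S Pmin
0.13 106 0.2177 152 -0.0729
0.14 105 0.2162 152 -0.0778
0.15 105 0.2127 152 -0.0819
0.16 105 0.2078 152 -0.0858
0.17 105 0.2018 153 -0.0879
0.18 104 0.1959 153 -0.0889
0.19 104 0.1907 153 -0.0898
0.20 103 0.1860 153 -0.0921
EOD
f1(x) = a1*x**2 + b1*x + c1
a1 = 1; b1 = 1; c1 = 1                     # arbitrary start values
fit f1(x) $Data1 using 3:5 via a1,b1,c1
stats $Data1 using 3 name "D1" nooutput    # defines D1_min and D1_max
# NaN outside the data extent, so nothing is drawn there
g1(x) = (x >= D1_min && x <= D1_max) ? f1(x) : NaN
set samples 500                            # finer sampling, cleaner curve ends
unset key
plot $Data1 using 3:5 with points pt 7, g1(x) lc rgb "red"
### end of sketch
```

Because the curve is sampled at discrete x values, its ends land on the nearest sample points inside the data extent rather than exactly on D1_min and D1_max; raising `set samples` narrows that gap.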

Gnuplot pm3d: 'NaN value' removes all surrounding rectangles

I would like to plot a pm3d map, where data points are not equidistant on the axis.
Since the spacings for the x and y axis are identical, it is symmetrical, though.
The problem is whenever a value is "NaN", all of the four surrounding rectangles
are not plotted. In the data file below, this happens, for example, at (x,y)=(0.14,0.33) .
If the value is not 'NaN', then the four rectangles reappear.
I discovered this problem when I tried to plot only the values >0 or <0, where the same thing happens.
I tried to search the documentation and the internet, but couldn't find anything on this.
Are there any solutions to this?
Plotscript:
set view map
set pm3d at b
set style data pm3d
set pm3d corners2color c1
set size ratio 1
set autoscale fix
set cbrange [-25:25]
set palette defined (-25 "blue", 0 "white", 25 "red")
set term png
set output "test.png"
splot "data.txt" u 1:2:3 notitle
set output
Data file:
0.0 0.0 1
0.0 0.08 -2
0.0 0.14 3
0.0 0.33 -4
0.0 0.46 5
0.0 0.55 5
0.08 0.0 -6
0.08 0.08 7
0.08 0.14 -8
0.08 0.33 9
0.08 0.46 -10
0.08 0.55 -10
0.14 0.0 11
0.14 0.08 -12
0.14 0.14 13
0.14 0.33 NaN
0.14 0.46 15
0.14 0.55 15
0.33 0.0 -16
0.33 0.08 17
0.33 0.14 -18
0.33 0.33 19
0.33 0.46 -20
0.33 0.55 -20
0.46 0.0 21
0.46 0.08 -22
0.46 0.14 23
0.46 0.33 -24
0.46 0.46 25
0.46 0.55 25
0.55 0.0 21
0.55 0.08 -22
0.55 0.14 23
0.55 0.33 -24
0.55 0.46 25
0.55 0.55 25
Thanks to the comment by @theozh, I figured out a solution to this problem.
I adapted the script by @theozh from "Plotting Heatmap with different column/line widths" to the form below. For the following file, it yields
1 -6 11 -16 21
-2 7 -12 17 -22
3 -8 13 -18 23
-4 9 NaN 19 -24
5 -10 15 -20 25
this plot.
For me this is the best solution, because my data already has this format and the coordinates come from a separate file that I read in.
Plotscript:
file = "matrix.dat"   # assumed name; point this at the matrix file shown above
CoordsX = "0.04 0.11 0.24 0.40 0.51"
CoordsY = "0.04 0.11 0.24 0.40 0.51"
dimX = words(CoordsX)   # number of columns
dimY = words(CoordsY)   # number of rows
# half the distance between coordinate i and its predecessor
dx(i) = (word(CoordsX,i)-word(CoordsX,i-1))*0.5
dy(i) = (word(CoordsY,i)-word(CoordsY,i-1))*0.5
# lower (n) and upper (p) box edges; edge cells reuse the neighbouring spacing
ndx(i,j) = word(CoordsX,i) - (i-1<1 ? dx(i+1) : dx(i))
pdx(i,j) = word(CoordsX,i) + (i+1>dimX ? dx(i) : dx(i+1))
ndy(i,j) = word(CoordsY,j) - (j-1<1 ? dy(j+1) : dy(j))
pdy(i,j) = word(CoordsY,j) + (j+1>dimY ? dy(j) : dy(j+1))
set xrange[ndx(1,1):pdx(dimX,1)]
set yrange[ndy(1,1):pdy(1,dimY)]
set tic out
max = 25
set cbrange [-max:max]
set palette defined (-max "blue", 0 "white", max "red")
set term png
set output "test.png"
# one pass per matrix column i; the row index comes from $0
plot for [i=1:dimX] file u (real(word(CoordsX,i))):1:(ndx(i,int($0))):(pdx(i,int($0))):(ndy(i,int($0+1))):(pdy(i,int($0+1))):i with boxxyerror fs solid 1.0 palette notitle
set output
### end of code

Gnuplot giving an extra point in dispersion plot

I have the following data:
"ani_vs_16s.csv"
P_graminis_DSM_15220 P_jilunlii_ATCC_23019 93.02 99.2
P_graminis_DSM_15220 P_polymyxa_ATCC_842 69.03 94.5
P_jilunlii_ATCC_23019 P_polymyxa_ATCC_842 68.86 94.5
P_graminis_DSM_15220 P_riograndensis_SBR5 91.76 99
P_jilunlii_ATCC_23019 P_riograndensis_SBR5 92.76 98.5
P_polymyxa_ATCC_842 P_riograndensis_SBR5 68.57 94.2
P_graminis_DSM_15220 P_sonchi_X19-5 92.06 99.1
P_jilunlii_ATCC_23019 P_sonchi_X19-5 93.31 99.2
P_polymyxa_ATCC_842 P_sonchi_X19-5 68.88 94.8
P_riograndensis_SBR5 P_sonchi_X19-5 96.09 99
P_graminis_DSM_15220 P_sp._CAR114 91.38 99.4
P_jilunlii_ATCC_23019 P_sp._CAR114 92.45 99.3
P_polymyxa_ATCC_842 P_sp._CAR114 68.61 94.5
P_riograndensis_SBR5 P_sp._CAR114 96.31 99.2
P_sonchi_X19-5 P_sp._CAR114 95.61 99.4
P_graminis_DSM_15220 P_sp._CAS34 91.84 99.5
P_jilunlii_ATCC_23019 P_sp._CAS34 92.91 99
P_polymyxa_ATCC_842 P_sp._CAS34 68.63 94.7
P_riograndensis_SBR5 P_sp._CAS34 97.01 99.3
P_sonchi_X19-5 P_sp._CAS34 96.32 99.6
P_sp._CAR114 P_sp._CAS34 97.7 99.7
When I plot these points with gnuplot, an extra point appears on the plot (blue arrow). The table has 21 points, but 22 points are shown in the plot (note that there are 6 points in the lower left of the plot).
I checked the data, but I was not able to find the problem. When I plot with LibreOffice Calc, no extra point appears. Is there some problem in my code?
set terminal svg
set output "ani_vs_16S.svg"
set style rect fc lt -1 fs solid 0.15 noborder
set object rect from 95, graph 0 to 100, graph 1
set arrow from 0,98.5 to 100,98.5 nohead lw 8
plot "ani_vs_16s.csv" using 3:4 with points pt 7 ps 1
This point seems to be part of the legend (key): gnuplot draws a sample of the point symbol next to the plot title. Try adding notitle to the end of the last line of your script.
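Applied to the script from the question, the change would look roughly like this (a sketch that assumes the same file names; putting `unset key` before the plot command should have the same effect):

```gnuplot
set terminal svg
set output "ani_vs_16S.svg"
set style rect fc lt -1 fs solid 0.15 noborder
set object rect from 95, graph 0 to 100, graph 1
set arrow from 0,98.5 to 100,98.5 nohead lw 8
# notitle suppresses the key entry, and with it the sample point
plot "ani_vs_16s.csv" using 3:4 with points pt 7 ps 1 notitle
set output   # close the SVG file
```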

Colouring a pm3d surface using a column values

I am trying to colour a splot surface using pm3d and wanted to colour using values from another column instead of the z-axis.
The input file (test.file, tab separated) is :
atom_num residue_name X Y Z
288 1 45.3 36.6 79.3
301 1 38.9 197.4 72.5
314 1 118.2 53.8 76.5
327 1 58.2 139.1 78.5
353 1 1.9 14.4 71.9
366 1 156.9 180.0 72.1
379 1 183.2 5.4 69.5
392 1 71.7 155.4 75.8
457 1 83.4 11.8 74.8
613 1 97.1 180.7 77.5
626 1 145.2 160.3 71.7
678 2 73.1 76.3 81.0
704 3 30.3 46.5 79.3
717 2 216.0 130.7 85.5
743 2 55.0 137.2 74.4
756 2 23.4 67.3 78.3
769 2 46.9 156.1 77.3
821 2 145.4 143.9 80.7
990 2 7.8 119.3 79.8
1016 3 44.3 67.3 76.7
1042 3 12.8 44.4 74.3
1055 3 149.1 79.9 78.2
1068 3 100.8 35.8 76.1
1081 3 57.6 196.8 76.8
1094 3 214.7 122.8 79.5
1107 3 82.0 190.0 74.4
1120 3 150.9 39.4 71.3
1133 3 50.4 143.7 75.3
1146 1 42.9 104.7 74.3
1159 1 139.0 48.8 73.4
1172 1 66.8 165.3 71.5
1198 1 190.7 150.1 84.2
1211 1 92.1 5.1 75.8
1224 1 211.8 177.7 74.1
1237 1 131.6 0.2 73.6
1250 2 103.8 104.2 76.6
1276 2 132.4 5.0 70.0
1289 2 94.4 9.4 73.0
1302 2 72.6 33.7 74.3
1315 2 14.4 162.6 74.7
1406 2 171.4 143.6 86.1
1419 2 209.5 52.9 77.4
1445 2 11.6 14.7 72.3
1458 1 115.5 165.0 73.0
1549 1 147.1 45.5 76.1
1575 1 115.8 36.6 74.5
1588 1 35.8 37.3 76.2
1601 1 65.4 28.2 76.9
1614 1 13.4 199.9 76.5
The commands I am using are:
set dgrid3d 30,30
set hidden3d
set palette rgbformulae 33,13,10
splot "test.file" u 3:4:5 w pm3d
The image is appearing like this:
By default the plot is coloured according to the Z-axis value (column 5). I am stuck on colouring the plot using the values of Residue Name (column 2), which range from 1 to 3. Is there an option to define which column is used for colouring? Ideally I would like the same plot but coloured according to column 2, so that I can see which "Residue types" lie in which contours.
Any help would be hugely appreciated.
As your residue is an integer, it is unclear whether you want it interpolated onto the grid.
However, if that's what you want, you can use the solution in Plotting 3D surface from scatter points and a png on the same 3D graph but don't use with pm3d when writing tables. Here's a solution with a quick and somewhat dirty unix trick to merge the tables:
set terminal push #Save current terminal settings
set terminal unknown #dummy terminal
set table "surface.dat"
set dgrid3d
splot 'test.file' using 3:4:5
set table "residue.dat"
splot 'test.file' using 3:4:2
unset dgrid3d
unset table
set term pop #reset current terminal settings
!paste surface.dat residue.dat > test_grid.dat
splot "test_grid.dat" u 1:2:3:7 w pm3d

How to start with negative axis polarplot using gnuplot?

From How to get a radial(polar) plot using gnu plot?
My data:
theta dB
0 0.00
30 0.09
60 -0.26
90 -0.26
120 -0.35
150 -0.35
180 -0.35
210 -0.35
240 -0.26
270 -0.09
300 -0.26
330 0.00
360 0.00
The radial axis should not start at 0; it should run from -2 to 0. How can I fix this code?
The range in polar mode is controlled by set rrange:
set polar
set grid polar
set angles degree
set size ratio 1
unset border
unset xtics
unset ytics
set rrange [-2:0]
plot 'file.txt' with lines
Result with 4.6.3 is:
