Gnuplot giving an extra point in dispersion plot - gnuplot

I have the following data:
"ani_vs_16s.csv"
P_graminis_DSM_15220 P_jilunlii_ATCC_23019 93.02 99.2
P_graminis_DSM_15220 P_polymyxa_ATCC_842 69.03 94.5
P_jilunlii_ATCC_23019 P_polymyxa_ATCC_842 68.86 94.5
P_graminis_DSM_15220 P_riograndensis_SBR5 91.76 99
P_jilunlii_ATCC_23019 P_riograndensis_SBR5 92.76 98.5
P_polymyxa_ATCC_842 P_riograndensis_SBR5 68.57 94.2
P_graminis_DSM_15220 P_sonchi_X19-5 92.06 99.1
P_jilunlii_ATCC_23019 P_sonchi_X19-5 93.31 99.2
P_polymyxa_ATCC_842 P_sonchi_X19-5 68.88 94.8
P_riograndensis_SBR5 P_sonchi_X19-5 96.09 99
P_graminis_DSM_15220 P_sp._CAR114 91.38 99.4
P_jilunlii_ATCC_23019 P_sp._CAR114 92.45 99.3
P_polymyxa_ATCC_842 P_sp._CAR114 68.61 94.5
P_riograndensis_SBR5 P_sp._CAR114 96.31 99.2
P_sonchi_X19-5 P_sp._CAR114 95.61 99.4
P_graminis_DSM_15220 P_sp._CAS34 91.84 99.5
P_jilunlii_ATCC_23019 P_sp._CAS34 92.91 99
P_polymyxa_ATCC_842 P_sp._CAS34 68.63 94.7
P_riograndensis_SBR5 P_sp._CAS34 97.01 99.3
P_sonchi_X19-5 P_sp._CAS34 96.32 99.6
P_sp._CAR114 P_sp._CAS34 97.7 99.7
When I plot these points with gnuplot, an extra point appears on the plot (blue arrow). The table has 21 points, but 22 points are shown in the plot (note that there are 6 points in the lower left of the plot).
I checked the data, but I was not able to find the problem. When I plot with LibreOffice Calc, no extra point appears. Is there some problem in my code?
set terminal svg
set output "ani_vs_16S.svg"
set style rect fc lt -1 fs solid 0.15 noborder
set object rect from 95, graph 0 to 100, graph 1
set arrow from 0,98.5 to 100,98.5 nohead lw 8
plot "ani_vs_16s.csv" using 3:4 with points pt 7 ps 1

This point seems to be part of the legend (the sample symbol drawn in the key). Try adding notitle to the end of the last line of your script.
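A minimal sketch of that suggestion, reusing the file name and plot style from the question. The stats check is an addition that is not part of the original answer: it only reports how many records gnuplot reads from columns 3 and 4, so the count can be compared with the table.

```gnuplot
# Optional check (an addition, not from the answer): count the records
# that gnuplot actually reads from columns 3 and 4 of the data file.
stats "ani_vs_16s.csv" using 3:4 nooutput
print STATS_records

# Same plot command as in the question, with notitle appended so that
# no key entry, and hence no sample point in the key, is drawn.
plot "ani_vs_16s.csv" using 3:4 with points pt 7 ps 1 notitle
```

If no key is wanted at all, unset key before the plot command should have the same visible effect.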

Related

gnuplot: how to plot color squares for each month's temperature?

I would like to plot the following figure (from Fundamentals of Data Visualization) using gnuplot:
I expect the data for each location is something like:
# month temperature
01 60.0
02 78.0
03 90.0
...
12 78.0
Here is what I tried. For simplicity, I have transposed the data into a matrix.
$data << EOD
1.50 1.57 1.85 2.15 1.87 1.05 1.70 1.65 1.97 1.71 1.53 1.15
4.44 4.71 4.74 3.50 3.43 4.98 4.29 4.55 3.93 3.34 3.74 4.88
8.55 9.59 5.65 0.13 9.33 4.70 8.94 7.74 4.49 6.26 0.96 1.20
EOD
unset border
unset ytics
set xlabel 'month'
set palette rgbformula -7,2,-7
set cbrange [0:10]
set cblabel "precipitation"
set xrange [-0.5:11.5]
set yrange [-0.5:2.5]
set xtics ("Jan" 0, "Feb" 1, "Mar" 2, "Apr" 3, "May" 4, \
"Jun" 5, "Jul" 6, "Aug" 7, "Sep" 8, "Oct" 9, "Nov" 10, "Dec" 11)
plot $data matrix with image
But the effect is far from satisfactory. For example, how can I generate clear borders between the squares?
The plotting style with image probably cannot draw borders, so I would use the versatile style with boxxyerror.
Furthermore, instead of the matrix format I would add a header line and loop through the columns (since there will always be 12 months).
Check the following example as well as help boxxyerror, help size, help xticlabels and help strftime for further reading.
If you have your data in separate files you would have to modify the script accordingly.
Script:
### plotting style boxxyerror as replacement for "with image"
reset session
$Data << EOD
# location Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
Atlantis 1.50 1.57 1.85 2.15 1.87 1.05 1.70 1.65 1.97 1.71 1.53 1.15
Mordor 4.44 4.71 4.74 3.50 3.43 4.98 4.29 4.55 3.93 3.34 3.74 4.88
Wonderland 8.55 9.59 5.65 0.13 9.33 4.70 8.94 7.74 4.49 6.26 0.96 1.20
EOD
unset border
set xlabel 'month'
set xrange [0.5:12.5]
set yrange [:] reverse
set ytics
set palette rgbformula -7,2,-7
set cbrange [0:10]
set cblabel "precipitation"
MonthName(n) = strftime("%b",24*3600*28*n) # get the month name from 1..12
set key noautotitle
set style fill solid 1.0 border rgb "white"
set size ratio -1 # make the boxes squares
plot for [i=1:12] $Data u (i):0:(0.5):(0.5):i+1: \
xtic(MonthName(i)):ytic(1) w boxxy fc palette lw 2
### end of code
Result:

How to shift origin in gnuplot?

B x(cm)
24.5 4.2
25.5 4.5
26.5 5.0
27.5 5.4
28.5 5.9
29.5 6.6
30.5 7.2
31.5 7.9
32.5 8.6
33.5 9.3
34.5 10.0
35.5 10.5
36.5 10.9
37.5 11.1
38.5 11.1
39.5 10.8
40.5 10.3
41.5 9.8
42.5 9.2
43.5 8.4
44.5 7.7
45.5 7.1
46.5 6.4
47.5 5.9
48.5 5.4
49.5 5.0
50.5 4.6
51.5 4.2
This is my data.
I want to fit the data above to the equation y(x) = a/(b**2 + x**2)**3/2, but the fitted value of b comes out negative. How can I shift the origin of the graph to get the right result?
A few things:
- Are you sure the function is f(x) = a/(b**2 + x**2)**3/2 and not f(x) = a/(b**2 + x**2)**(3/2)? Mind the parentheses around (3/2).
- gnuplot uses integer division (a common pitfall that leads to unexpected results), so (3/2) evaluates to 1 instead of the expected 1.5. Write (3./2) to get 1.5.
- Why not let gnuplot find the offset? Just introduce a variable c that accounts for the x-offset and include it in the fit.
- Depending on your model, i.e. if the exponent is variable, you could also add a variable d for the exponent and let the gnuplot fitting algorithm determine it.
- Sometimes it helps to give the fit good starting values.
Then you have to judge whether the fitted values make sense or not, e.g. b<0 or d=0.794 ...
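The points about parentheses and integer division can be checked directly at the gnuplot prompt. The following lines are only an illustration with the number 2 as a stand-in base; they are not part of the fit itself.

```gnuplot
# Division of two integers is integer division in gnuplot.
print 3/2          # 1
print 3./2         # 1.5

# ** binds more tightly than /, so x**3/2 means (x**3)/2.
print 2**3/2       # 4, i.e. (2**3)/2
print 2**(3/2)     # 2, because (3/2) evaluates to 1
print 2**(3./2)    # about 2.83, the intended exponent of 1.5
```

So the expression exactly as written in the question raises the bracket to the power 3 and then divides by 2, which is a different model from one with exponent 1.5.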
Code:
### fitting with finding x-offset automatically
reset session
$Data <<EOD
B x(cm)
24.5 4.2
25.5 4.5
26.5 5.0
27.5 5.4
28.5 5.9
29.5 6.6
30.5 7.2
31.5 7.9
32.5 8.6
33.5 9.3
34.5 10.0
35.5 10.5
36.5 10.9
37.5 11.1
38.5 11.1
39.5 10.8
40.5 10.3
41.5 9.8
42.5 9.2
43.5 8.4
44.5 7.7
45.5 7.1
46.5 6.4
47.5 5.9
48.5 5.4
49.5 5.0
50.5 4.6
51.5 4.2
EOD
f1(x) = a1/(b1**2 + (x-c1)**2)**(3/2)
f2(x) = a2/(b2**2 + (x-c2)**2)**(3./2)
f3(x) = a3/(b3**2 + (x-c3)**2)**d3
set fit quiet nolog
fit f1(x) $Data u 1:2 via a1,b1,c1
fit f2(x) $Data u 1:2 via a2,b2,c2
a3=11; b3=1; c3=40; d3=1.5 # sometimes it's better to help the fitting with some good starting values
fit f3(x) $Data u 1:2 via a3,b3,c3,d3
print sprintf("% 9s% 9s% 9s% 9s","a","b","c","d")
print sprintf("%9.3g %9.3g %9.3g",a1,b1,c1)
print sprintf("%9.3g %9.3g %9.3g",a2,b2,c2)
print sprintf("%9.3g %9.3g %9.3g %9.3g",a3,b3,c3,d3)
plot $Data u 1:2 w p pt 7,\
f1(x) w l lc "red",\
f2(x) w l lc "web-green", \
f3(x) w l lc "web-blue"
### end of code
Result:
        a         b         c         d
 1.17e+03      10.3      37.9
 2.73e+04     -13.6      37.9
      343      8.66      37.9     0.794
I'm not sure, but I think the equation that you're trying to fit to may be inappropriate for the data. Perhaps you could rewrite your equation such that it's clearer.
Here's an example using the quadratic equation y(x) = a*x**2 + b*x + c to fit:
test.dat
24.5 4.2
25.5 4.5
26.5 5.0
27.5 5.4
28.5 5.9
29.5 6.6
30.5 7.2
31.5 7.9
32.5 8.6
33.5 9.3
34.5 10.0
35.5 10.5
36.5 10.9
37.5 11.1
38.5 11.1
39.5 10.8
40.5 10.3
41.5 9.8
42.5 9.2
43.5 8.4
44.5 7.7
45.5 7.1
46.5 6.4
47.5 5.9
48.5 5.4
49.5 5.0
50.5 4.6
51.5 4.2
quad_fit.gp
set term pos col
set out 'xy_fit.ps'
set title 'Quadratic Regression Example Scatterplot'
set ylabel 'Y'
set xlabel 'X'
set style line 1 ps 1.5 pt 7 lc 'red'
set style line 2 lw 1.5 lc 'blue'
set grid
f(x) = a*(x**2) + b*x + c
fit f(x) 'test.dat' using 1:2 via a, b, c
p 'test.dat' ls 1 t 'Datapoints', f(x) ls 2 t 'Quadratic Regression'
set out
Running gnuplot quad_fit.gp produces:

Fitting a sinc function with gnuplot

I am trying to fit a sinc function with gnuplot but it fails with the message:
'Undefined value during function evaluation'.
First my data:
27 9.3
27.2 9.3
27.8 9.3
29 9.4
32 9.5
34 9.6
34.2 9.7
34.4 9.7
34.6 9.8
34.8 10.1
35 10.9
35.2 12.9
35.4 16.1
35.6 21.1
35.8 26.5
36 31.8
36.2 34.7
36.4 36.6
36.6 36.3
36.8 32.3
37 26.4
37.2 20.6
37.4 15.4
37.6 11.6
37.8 9.9
38 9.6
38.5 10
39 9.5
39.5 9.5
40 9.6
What I am trying to do in Gnuplot:
sinc(x)=sin(pi*x)/pi/x
f(x)=a*(sinc((b*(x-c))))**2+d
fit f(x) '4_temp.txt' via a,b,c,d
I set a, b, c and d close to the values that are needed (see picture), but it won't fit.
Can somebody help?
Thanks in advance.
I can reproduce your error message. You are trying to fit a function of the type sin(x)/x, which gives 0/0 at x=0. gnuplot has no problem plotting sin(x)/x, but fitting apparently does.
Only after adding a small offset, e.g. 1e-9, does it seem to work and find reasonable parameters.
As @Ethan says, you need to choose starting values that are not too far from the final values.
You will get the fitted values:
Final set of parameters            Asymptotic Standard Error
=======================            ==========================
a               = 27.5271          +/- 0.2822       (1.025%)
b               = 0.608263         +/- 0.006576     (1.081%)
c               = 36.3954          +/- 0.00657      (0.01805%)
d               = 9.21346          +/- 0.127        (1.379%)
Code:
### fitting type of sin(x)/x function
reset session
$Data <<EOD
27 9.3
27.2 9.3
27.8 9.3
29 9.4
32 9.5
34 9.6
34.2 9.7
34.4 9.7
34.6 9.8
34.8 10.1
35 10.9
35.2 12.9
35.4 16.1
35.6 21.1
35.8 26.5
36 31.8
36.2 34.7
36.4 36.6
36.6 36.3
36.8 32.3
37 26.4
37.2 20.6
37.4 15.4
37.6 11.6
37.8 9.9
38 9.6
38.5 10
39 9.5
39.5 9.5
40 9.6
EOD
a=25
b=1
c=36
d=10
sinc(x)=sin(pi*x)/pi/(x)
f(x)=a*(sinc((b*(x-c+1e-9))))**2+d
set fit nolog
fit f(x) $Data via a,b,c,d
plot $Data u 1:2 w p pt 7, f(x) w l lc rgb "red"
### end of code
Result:
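A possible alternative to the 1e-9 offset in the script above, offered as a sketch and not as part of the original answer: guard the sinc definition with the ternary operator so that it returns the limit value 1 at x = 0 instead of evaluating 0/0. The snippet assumes the $Data block and the starting values from the script above; whether it converges to exactly the same parameters has not been checked here.

```gnuplot
# Guarded sinc: return the limit value 1 at x = 0 instead of evaluating 0/0.
sinc(x) = (x == 0) ? 1.0 : sin(pi*x)/(pi*x)
f(x) = a*sinc(b*(x-c))**2 + d

# Starting values taken from the script above.
a = 25; b = 1; c = 36; d = 10

# $Data is the inline datablock defined in the script above.
set fit nolog
fit f(x) $Data using 1:2 via a,b,c,d
plot $Data using 1:2 with points pt 7, f(x) with lines lc rgb "red"
```

The guard matters here because the data contain a point at x = 36, which coincides with the starting value c = 36, so the argument of sinc is exactly zero on the first evaluation.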

Colouring a pm3d surface using a column values

I am trying to colour a splot surface using pm3d and wanted to colour using values from another column instead of the z-axis.
The input file (test.file, tab separated) is :
atom_num residue_name X Y Z
288 1 45.3 36.6 79.3
301 1 38.9 197.4 72.5
314 1 118.2 53.8 76.5
327 1 58.2 139.1 78.5
353 1 1.9 14.4 71.9
366 1 156.9 180.0 72.1
379 1 183.2 5.4 69.5
392 1 71.7 155.4 75.8
457 1 83.4 11.8 74.8
613 1 97.1 180.7 77.5
626 1 145.2 160.3 71.7
678 2 73.1 76.3 81.0
704 3 30.3 46.5 79.3
717 2 216.0 130.7 85.5
743 2 55.0 137.2 74.4
756 2 23.4 67.3 78.3
769 2 46.9 156.1 77.3
821 2 145.4 143.9 80.7
990 2 7.8 119.3 79.8
1016 3 44.3 67.3 76.7
1042 3 12.8 44.4 74.3
1055 3 149.1 79.9 78.2
1068 3 100.8 35.8 76.1
1081 3 57.6 196.8 76.8
1094 3 214.7 122.8 79.5
1107 3 82.0 190.0 74.4
1120 3 150.9 39.4 71.3
1133 3 50.4 143.7 75.3
1146 1 42.9 104.7 74.3
1159 1 139.0 48.8 73.4
1172 1 66.8 165.3 71.5
1198 1 190.7 150.1 84.2
1211 1 92.1 5.1 75.8
1224 1 211.8 177.7 74.1
1237 1 131.6 0.2 73.6
1250 2 103.8 104.2 76.6
1276 2 132.4 5.0 70.0
1289 2 94.4 9.4 73.0
1302 2 72.6 33.7 74.3
1315 2 14.4 162.6 74.7
1406 2 171.4 143.6 86.1
1419 2 209.5 52.9 77.4
1445 2 11.6 14.7 72.3
1458 1 115.5 165.0 73.0
1549 1 147.1 45.5 76.1
1575 1 115.8 36.6 74.5
1588 1 35.8 37.3 76.2
1601 1 65.4 28.2 76.9
1614 1 13.4 199.9 76.5
The commands I am using are:
set dgrid3d 30,30
set hidden3d
set palette rgbformulae 33,13,10
splot "test.file" u 3:4:5 w pm3d
The image is appearing like this:
By default the plot is coloured according to the Z-axis value (column 5). I am stuck on colouring the plot using the values of Residue Name (column 2), which range from 1 to 3. Is there an option to define which column to use for colouring? Ideally I would like to have the same plot but coloured according to column 2, so that I can see which "Residue types" lie in which contours.
Any help would be hugely appreciated.
As your residue is an integer, it is unclear whether you want it interpolated onto the grid.
However, if that's what you want, you can use the solution in "Plotting 3D surface from scatter points and a png on the same 3D graph", but don't use with pm3d when writing the tables. Here's a solution with a quick and somewhat dirty Unix trick to merge the tables (the script reads 'test.dat'; substitute your own file name, test.file in the question):
set terminal push #Save current terminal settings
set terminal unknown #dummy terminal
set table "surface.dat"
set dgrid3d
splot 'test.dat' using 3:4:5
set table "residue.dat"
splot 'test.dat' using 3:4:2
unset dgrid3d
unset table
set term pop #reset current terminal settings
!paste surface.dat residue.dat > test_grid.dat
splot "test_grid.dat" u 1:2:3:7 w pm3d

Box and whisker plot GNUPLOT

I need to visualize some data I have with box and whisker plots, and I'd like to do it in GNUPLOT. So far I have converted my data into what I understand GNUPLOT needs: the minimum, first quartile, median, third quartile and maximum.
This is the data I have:
#x min Q1 median q3 max width label
1 9.9 10.25 10.7 10.975 11.3 0.3 100
2 23.5 25.525 26.05 27.85 29.1 0.3 200
3 37.5 40.8 43.65 44.35 45.7 0.3 300
4 55 58.25 58.65 61.875 65.9 0.3 400
5 71.3 73.65 75.25 77.4 80.1 0.3 500
6 73.6 83.85 86.05 88.775 97.5 0.3 600
7 85.8 89.45 97.3 103.75 106 0.3 700
8 102 111 112 115.5 119 0.3 800
9 116 127 128 134 141 0.3 900
10 126 134 136 140.25 146 0.3 1000
11 144 149 152 156.25 165 0.3 1100
12 144 151.25 154 158 166 0.3 1200
13 138 157.25 159 162 171 0.3 1300
14 155 161.25 165.5 170 173 0.3 1400
15 158 171 172.5 177.5 182 0.3 1500
I have made this graph in Excel
But I need to have more graphs in the same image, which is something I cannot do in Excel. I have been messing around with GNUPLOT for a couple of hours, trying to use candlesticks, but all the graphs I get are wrong!
I've uploaded my data-file to DROPBOX https://dl.dropboxusercontent.com/u/12340447/data.txt
Any help is greatly appreciated!
EDIT:
I should probably include the script I currently have
set bars 2.0
set style fill empty
plot 'data.txt' using 1:3:2:6:5:xticlabels(7) with candlesticks title 'Quartiles' whiskerbars, \
'' using 1:4:4:4:4 with candlesticks lt -1 notitle
This gives the output
There are a few things wrong with the picture. First of all, the labels are wrong: they all say 0.3, but that's supposed to be the width of the boxplots. I'd also like to add a line (as in the Excel chart) through each mean value, marked with a dot or cross or something. Basically, I'd like it to look a little more like the Excel output.
Again, any help is greatly appreciated!
The labels were wrong because they need to come from column 8 of the data, i.e. xticlabels(8); column 7 holds the box width (0.3).
The last line of the script adds a blue line (lt 3) with diamond points (pt 13) through the values in column 4.
set bars 2.0
set style fill empty
plot 'data.txt' using 1:3:2:6:5:xticlabels(8) with candlesticks title 'Quartiles' whiskerbars, \
'' using 1:4:4:4:4 with candlesticks lt -1 notitle, \
'' using 1:4 with linespoints lt 3 pt 13 notitle
