Fitting a sub-range of time data in gnuplot

Let me start by saying that I am working on:
$ gnuplot --version
gnuplot 5.2 patchlevel 2
I would like to plot and fit date/time data in gnuplot and have the fit only performed and subsequently displayed on a sub-range of the plot.
The example data I used can be found here.
EDIT: I realized that the data in the file don't match the timefmt signature, so I added a /06 to each line. Each point is then drawn in the middle of the year, which also lets me plot it nicely together with monthly data from the same source.
I can get the desired result with the code below, where I plot three functions: one over the full range of the plot and two others that each cover only part of the date range.
set key left
set yrange[-0.75:1.0]
set xdata time
set timefmt '%Y/%m'
r=10e-10
e(x) = r*x+s
fit e(x) 'HadCRUT.4.6.0.0.annual_ns_avg_smooth.txt' using 1:2 via r,s
a=10e-10
f(x) = a * x + b
set xrange ["1970/06":"2018/06"]
fit f(x) 'HadCRUT.4.6.0.0.annual_ns_avg_smooth.txt' using 1:2 via a,b
g(x) = ( x > "1970/06" ) ? f(x) : 1/0
set xrange ["1850/06":"1970/06"]
c=9.24859e-11
h(x) = c * x + d
fit h(x) 'HadCRUT.4.6.0.0.annual_ns_avg_smooth.txt' using 1:2 via c,d
i(x) = ( x < "1970/06" ) ? h(x) : 1/0
set xrange ["1849/06":"2018/06"]
set term png size 1500,1000
set output 'annual_average_with_fit.png'
plot 'HadCRUT.4.6.0.0.annual_ns_avg_smooth.txt' using 1:2 with lp lw 2 t'annual avg (decadally smoothed)', e(x) t'full range fit' lw 2, i(x) t'1850-1970 fit' lw 2, g(x) t'1970-2018 fit' lw 2
which yields this plot
This is all well and good, but (and this is where the question comes in) in principle I should be able to achieve the same result by other means.
First: I restrict the range of the file data to a certain range to fit it only on that range. In principle I should be able to do the same using this (type of) syntax:
fit ["1970/06":"2018/06"] f(x) 'HadCRUT.4.6.0.0.annual_ns_avg_smooth.txt' using 1:2 via a,b
which however gives
Read 168 points
Skipped 168 points outside range [x=1970:2018]
[...] No data to fit
which seems weird given that the set xrange clearly has the desired effect.
Secondly, trying to restrict the plotting of the curve to the fit range with
plot 'HadCRUT.4.6.0.0.annual_ns_avg_smooth.txt' using 1:2 with lp lw 2 t'annual avg (decadally smoothed)', ["1970/06":"2018/06"] f(x) t''
does not plot the function at all.
I might be overlooking something very basic, but having tried various things, I don't see what it is.

The following (slightly cleaned-up) code should do what you want (tested with gnuplot 5.2.5).
I guess the problem is that you tried to fit the range ["1970/06":"2018/06"], but your data only extends to 2017. So it is better to leave the upper end open, e.g. ["1970/06":] or ["1970/06":*].
edit: added a limited range fit i(x)
reset session
set term png size 1500,1000
set output 'annual_average_with_fit.png'
set key left
set yrange[-0.75:1.0]
set xdata time
set timefmt '%Y/%m'
set format x '%Y'
FILE = 'HadCRUT.4.6.0.0.annual_ns_avg_smooth.txt'
r=10e-10
f(x) = r*x+s
fit [*:*] f(x) FILE using 1:2 via r,s
c=9.24859e-11
g(x) = c * x + d
fit [*:"1970/06"] g(x) FILE using 1:2 via c,d
a=10e-10
h(x) = a * x + b
fit ["1970/06":*] h(x) FILE using 1:2 via a,b
p=1e-9
i(x) = p * x + q
fit [strptime("%Y/%m", "1910/06"):strptime("%Y/%m", "1945/06")] i(x) FILE using 1:2 via p,q
set xrange [*:*]
plot FILE using 1:2 with lp lw 2 t'annual avg (decadally smoothed)', \
f(x) t 'full range fit' lw 2, \
[:"1970/06"] g(x) t '1850-1970 fit' lw 2, \
["1970/06":] h(x) t '1970-2018 fit' lw 2,\
[strptime("%Y/%m", "1910/06"):strptime("%Y/%m", "1945/06")] i(x) t '1910-1945 fit' lw 2
set output
Output:
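As a minimal sketch of the same idea on inline data: the limit can be converted once with strptime() and then reused as a plain number for both the fit range and the plot range. The values in $Demo are made up for illustration, and the names $Demo and t0 are assumptions that do not appear in the script above.

```gnuplot
### sketch: reuse one strptime() limit for fit range and plot range
reset session
# made-up demo values, NOT the HadCRUT data
$Demo <<EOD
1960/06 -0.10
1965/06 -0.12
1970/06 -0.05
1975/06 0.02
1980/06 0.10
1985/06 0.15
EOD
set xdata time
set timefmt '%Y/%m'
set format x '%Y'
# convert the limit once; the result is a number (seconds since the epoch)
t0 = strptime('%Y/%m', '1970/06')
a = 1e-9; b = 0.1    # initial guesses
h(x) = a*x + b
set fit nolog quiet
fit [t0:*] h(x) $Demo using 1:2 via a,b
plot $Demo using 1:2 with lp title 'demo data', \
     [t0:] h(x) title 'fit from 1970/06 on' lw 2
### end of sketch
```

Because t0 is an ordinary number, it could also be used inside a conditional such as x > t0 ? h(x) : 1/0 without relying on how a date string is interpreted in a comparison.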

Related

In gnuplot show only the maximum point of the graph and highlight it

In gnuplot I wrote the code below:
set xlabel "Time in Seconds"
set ylabel "Resistance in Ohms"
while(1){
set multiplot layout 2, 1 title " " font ",12"
set tmargin 1.5
set title "MQ7 Gas Sensor Data"
unset key
plot 'putty2.log' using 0:1 with lines ,'' using 0:2:2 with labels center boxed bs 1 notitle column
set title "MQ9 Gas Sensor Data"
unset key
plot 'putty2.log' using 0:3 with lines
pause 1;
reread;
}
This code draws a multiplot of the data file 'putty2.log'. After running it I got this:
but I want to show only the maximum point in the first plot of the multiplot.
Any help will be appreciated.
As a starting point, the following script is a simple way to identify maxima in noisy curves. In fact, generating the random test data takes almost more lines than extracting the maxima.
On the smoothed curve you simply check whether three consecutive y-values y0, y1, y2 fulfil y0<y1 && y1>y2; if so, you have a maximum at y1.
Smoothing via smooth bezier might not be suitable for all types of data. Averaging combined with smoothing might lead to better results.
For instance, in the example below the human eye would also detect maxima at 35 and 42.
Furthermore, if you also want to display the y-values of the maxima, Bezier smoothing will probably return values that are too low compared to what averaging would give.
I hope you can optimize the script for your data and special needs.
Script:
### find maxima on smoothened data
reset session
# create some random test data
set table $Backbone
set samples 30
plot [0:100] '+' u 1:(rand(0)*10+10) w table
set table $CSpline
set samples 1000
plot $Backbone u 1:2 smooth cspline
set table $Data
noise(h) = (rand(0)*2-1)*h
spike(p,h) = rand(0) < p ? (rand(0)*2-1)*h : 0
plot $CSpline u 1:($2 + noise(1) + spike(0.2,3)) w table
unset table
# smooth the data to facilitate identification of maxima
set table $Smooth
set samples 200
plot $Data u 1:2 smooth bezier
unset table
# simple maxima extraction
set table $Maxima
plot x2=x1=y2=y1=NaN $Smooth u (x0=x1,x1=x2,x2=$1,y0=y1,y1=y2,y2=$2, y0<y1 && y1>y2 ? x1 : NaN):(y1) w table
unset table
set yrange[0:]
set key noautotitle
plot $Data u 1:2 w l lc "red", \
$Smooth u 1:2 w l lc "blue", \
$Maxima u 1:2 w impulses lc "black", \
'' u 1:(0):(sprintf("%.2f",$1)) w labels left offset 1,0.5 rotate by 90 tc "blue"
### end of script
Result:
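If only the single highest point is wanted, as asked in the question, the stats command may already be enough. The following is a minimal sketch on made-up inline data; $Demo, xmax and ymax are names chosen here for illustration. For a two-column analysis, stats sets STATS_max_y and STATS_pos_max_y (the x position of the largest y value).

```gnuplot
### sketch: mark only the global maximum
reset session
# made-up demo values
$Demo <<EOD
0 1.0
1 3.5
2 2.0
3 9.0
4 4.0
EOD
stats $Demo using 1:2 nooutput
xmax = STATS_pos_max_y   # x position of the largest y value
ymax = STATS_max_y       # largest y value
set yrange [0:ymax*1.2]
set label 1 sprintf("max = %g", ymax) at xmax, ymax offset 1,1
# second plot element: keep only the point whose y equals the maximum
plot $Demo using 1:2 with lines notitle, \
     $Demo using ($2 == ymax ? $1 : NaN):2 with points pt 7 lc "red" notitle
### end of sketch
```

For a log file like the one in the question, the row number would presumably serve as x (using 0:1), and the stats call would have to be repeated on every refresh.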

Gnuplot smoothing data in loglog plot

I would like to plot a smoothed curve based on a dataset which spans over 13 orders of magnitude [1E-9:1E4] in x and 4 orders of magnitude [1E-6:1e-2] in y.
MWE:
set log x
set log y
set xrange [1E-9:1E4]
set yrange [1E-6:1e-2]
set samples 1000
plot 'data.txt' u 1:3:(1) smooth csplines not
The smooth curve looks nice above x=10. Below, it is just a straight line down to the point at x=1e-9.
When increasing samples to 1e4, smoothing works well above x=1. For samples 1e5, smoothing works well above x=0.1 and so on.
Any idea on how to apply smoothing to lower data points without setting samples to 1e10 (which does not work anyway...)?
Thanks and best regards!
JP
To my understanding, sampling in gnuplot is linear. There may be a logarithmic sampling option in gnuplot, but I haven't found one yet.
Here is a suggestion for a workaround which is not yet perfect but may act as a starting point.
The idea is to split your data, for example into decades, and to smooth each part separately.
The drawback is that there might be some overlaps between the ranges. You can minimize or hide these by playing with set samples and every ::n, or maybe there is another way to eliminate the overlaps.
Code:
### smoothing over several orders of magnitude
reset session
# create some random test data
set print $Data
do for [p=-9:3] {
do for [m=1:9:3] {
print sprintf("%g %g", m*10**p, (1+rand(0))*10**(p/12.*3.-2))
}
}
set print
set logscale x
set logscale y
set format x "%g"
set format y "%g"
set samples 100
pMin = -9
pMax = 3
set table $Smoothed
myFilter(col,p) = (column(col)/10**p-1) < 10 ? column(col) : NaN
plot for [i=pMin:pMax] $Data u (myFilter(1,i)):2 smooth cspline
unset table
plot $Data u 1:2 w p pt 7 ti "Data", \
$Smoothed u 1:2 every ::3 w l ti "cspline"
### end of code
Result:
Addition:
Thanks to @maij, who pointed out that this can be simplified by mapping the whole range into linear space. In contrast to @maij's solution, I would let gnuplot handle the logarithmic axes and keep the actual plot command as simple as possible, at the cost of some extra table plots.
Code:
### smoothing in loglog plot
reset session
# create some random test data
set print $Data
do for [p=-9:3] {
do for [m=1:9:3] {
print sprintf("%g %g", m*10**p, (1+rand(0))*10**(p/12.*3.-2))
}
}
set print
set samples 500
set table $SmoothedLog
plot $Data u (log10($1)):(log10($2)) smooth csplines
set table $Smoothed
plot $SmoothedLog u (10**$1):(10**$2) w table
unset table
set logscale x
set logscale y
set format x "%g"
set format y "%g"
set key top left
plot $Data u 1:2 w p pt 7 ti "Data", \
$Smoothed u 1:2 w l lc "red" ti "csplines"
### end of code
Result:
Using a logarithmic scale basically means to plot the logarithm of a value instead of the value itself. The set logscale command tells gnuplot to do this automatically:
read the data, still linear world, no logarithm yet
calculate the splines on an equidistant grid (smooth csplines), still linear world
calculate and plot the logarithms (set logscale)
The key point is the equidistant grid. Let's say one chooses set xrange [1E-9:10000] and set samples 101. In the linear world 1e-9 compared to 10000 is approximately 0, and the resulting grid will be 1E-9 ~ 0, 100, 200, 300, ..., 9800, 9900, 10000. The first grid point is at 0, the second one at 100, and gnuplot is going to draw a straight line between them. This does not change when afterwards logarithms of the numbers are plotted.
This is what you already have noted in your question: you need 10 times more points to get a smooth curve for smaller exponents.
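The grid argument can be checked with a few lines of arithmetic. The sketch below only prints the first sample positions of an equidistant grid and, for comparison, of a grid that is equidistant in log10(x). The helper names lin() and logg() are made up here; this illustrates the reasoning and is not gnuplot's internal sampling code.

```gnuplot
### sketch: linear vs. logarithmic sample positions
xmin = 1e-9; xmax = 1e4; n = 101
# i-th point of an equidistant grid in x
lin(i)  = xmin + (xmax - xmin) * i / (n - 1.0)
# i-th point of a grid that is equidistant in log10(x)
logg(i) = 10**(log10(xmin) + (log10(xmax) - log10(xmin)) * i / (n - 1.0))
do for [i=0:3] {
    print sprintf("i=%d   linear: %g   logarithmic: %g", i, lin(i), logg(i))
}
### end of sketch
```

The second linear point already sits near x = 100, so everything between 1e-9 and 100 is joined by a single straight segment, whereas the logarithmic grid spreads its points evenly over the decades.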
As a solution, I would suggest swapping the order of the two calculations: take the logarithms first, then calculate the splines.
# create some random test data, code "stolen" from @theozh (https://stackoverflow.com/a/66690491)
set print $Data
do for [p=-9:3] {
do for [m=1:9:3] {
print sprintf("%g %g", m*10**p, (1+rand(0))*10**(p/12.*3.-2))
}
}
set print
# this makes the splines smoother
set samples 1000
# manually account for the logarithms in the tic labels
set format x "10^{%.0f}" # for example this format
set format y "1e{%+03.0f}" # or this one
set xtics 2 # logarithmic world, tic distance in orders of magnitude
set ytics 1
# just "read logarithm of values" from file, before calculating splines
plot $Data u (log10($1)):(log10($2)) w p pt 7 ti "Data" ,\
$Data u (log10($1)):(log10($2)) ti "cspline" smooth cspline
This is the result:

Gnuplot with Errorbars and line of regression

I want to plot some values with error bars, but somehow it doesn't work. Can you help me, please?
431.00E12 0.69 47.00E5
567.00E12 1.10 58.00E5
662.00E12 1.75 67.00E5
I watched a lot of videos and tutorials and did exactly what they did, but it doesn't work. The part with the regression worked fine, but now I want the error bars to be horizontal. My text file is in this order:
x-Value y-Value DeltaX
DeltaX should be the error bar, so the error bar should look like this: at point x, the error bar extends from x-DeltaX to x+DeltaX.
Could you please tell me the code that combines the regression line and the error bars?
plot "/Users/amar/Desktop/dgd.txt" using 1:2:3 with errorbars, f(x)
Check help xerrorbars.
A delta x which is 8 orders of magnitude smaller than the x-value will be difficult to see as an error bar. Just to demonstrate xerrorbars, I changed it to a similar order of magnitude as the x-values.
With the following code:
### xerrorbars
reset session
$Data <<EOD
431.00E12 0.69 47.00E12
567.00E12 1.10 58.00E12
662.00E12 1.75 67.00E12
EOD
set key left
f(x) = a*x + b
a = 1e-15 # some initial guesses
b = -1
set fit nolog brief
fit f(x) $Data u 1:2 via a,b
plot $Data u 1:2:3 with xerrorbars pt 7 lc rgb "red", \
f(x) title sprintf("f(x) = %g * x + %g",a,b)
### end of code
You'll get:

Fitting graph and draw lines by selecting ranges in horizontal axis

I am trying to plot a graph and fit it with straight lines.
f1(x)=a1+b1*x
fit [0:80] f1(x) 'diff-xy-bcmLyo25perS.dat' via a1,b1
f2(x)=a2+b2*x
fit [100:220] f2(x) 'diff-xy-bcmLyo25perS.dat' via a2,b2
Then I tried to plot both fits in the same graph using the command:
f(x) = x < 60 ? f1(x) : f2(x)
plot 'diff-xy-bcmLyo25perS.dat' using 1:2 with lines linestyle 1 title "{/Symbol b}BCMal-C_{12}C_{8}", f(x) lw 3.0 lc rgb 'black'
I get a plot as above.
In that plot one can see two lines intersecting at 80 (horizontal scale), forming a 'v' shape.
I would like to eliminate that 'v'-shaped intersection and get two separate lines, one from 0 to 80 and the other from 100 to 220.
How could I get this?
Appreciate any help.
Thanks in advance.
You could exploit the fact that gnuplot doesn't plot infinite and NaN values (such as 1.0/0).
Using
plot_if_in_range(y,x,lower,upper) = (x>=lower && x<=upper)?(y):(1.0/0)
you could easily plot any function on a given domain:
plot plot_if_in_range(exp(x) , x, -5, 2), \
plot_if_in_range(sin(x)+x, x, -2, 5)
With gnuplot 5.0 you can specify a different range for each function:
set style data lines
plot 'diff-xy-bcmLyo25perS.dat' using 1:2 ls 1, \
[0:80] f1(x) lw 3.0 lc rgb 'black',\
[100:220] f2(x) lw 3.0 lc rgb 'black'
Note that this works only because you plot the data file first. Plotting only
plot [0:80] f1(x), [100:220] f2(x)
wouldn't work, since the first range setting is equivalent to a global set xrange [0:80] (this has always been the case), so the second function wouldn't be visible at all.
However, in your case it should work fine.
Edit:
Sorry, this is basically the same idea as Sergei Izmailov's answer which I missed.
Answer:
Use the special file "+", which provides x values for your plot that you can then sample using a function of your choice, including one that ignores input if it's outside of range. Then you can use your f1(x) and f2(x) directly:
plot "+" using ($1):(0 < $1 && $1 < 80 ? f1($1) : 1/0), \
"+" using ($1):(100 < $1 && $1 < 220 ? f2($1) : 1/0)

How to restrict yrange for fit in gnuplot

I used the following scripts for plotting and fitting.
Data set:
2.474 2.659
0.701 2.637
0.582 2.643
0.513 2.666
0.403 2.639
0.308 2.615
0.218 2.561
0.137 2.537
Script:
reset
set key bottom right
f(x) = a*atan(x/b); a = 2.65; b = 2.5
fit f(x) 'test.txt' u 1:2 via a,b
plot 'test.txt' u 1:2 w p not, f(x) t 'f(x)'
The plot looks like this:
I am trying to restrict it between min_y and max_y. The following intuitive code failed horribly,
fit [y=2.537:2.659] f(x) 'test.txt' u 1:2 via a,b
Any suggestion on restriction would be highly appreciated! Thanks!
The range option only specifies which input points should be used, not restricting the output. So far as I can see from the manual, restrictions on the output value of f(x) aren't really possible (and so far as I can see from the problem, not really desirable).
You should also be able to do this simply by specifying a fit range for both x and y, i.e. [][].
The following code also works with gnuplot 4.6, which was the current version in 2014.
"Data.dat":
1 2
2 3
3 4
1 9
2 8
3 7
Code:
### fit with limited y-range
reset
set xrange[0:10]
set yrange[0:10]
f(x) = a*x + b
set multiplot layout 3,1
fit [*:*][0:5] f(x) "Data.dat" u 1:2 via a,b
plot "Data.dat" u 1:2 w p pt 7 lc rgb "red" not,\
f(x) t sprintf("Fitrange: [*:*][0:5]\nf(x) = %g*x + %g",a,b)
fit [*:*][5:10] f(x) "Data.dat" u 1:2 via a,b
plot "Data.dat" u 1:2 w p pt 7 lc rgb "red" not,\
f(x) t sprintf("Fitrange: [*:*][5:10]\nf(x) = %g*x + %g",a,b)
fit [*:*][0:10] f(x) "Data.dat" u 1:2 via a,b
plot "Data.dat" u 1:2 w p pt 7 lc rgb "red" not,\
f(x) t sprintf("Fitrange: [*:*][0:10]\nf(x) = %g*x + %g",a,b)
unset multiplot
### end of code
Result:
This is an old question, but I arrived here looking for a solution to a similar problem. The answer is to use the stats command:
stats 'test.txt'
This will analyze, by default, the y data and set a bunch of STATS_* variables, which you can use in your fit statement along with the ternary operator:
fit f(x) 'test.txt' u 1:($2 >= STATS_min && $2 <= STATS_max ? $2 : NaN) via a,b
You can also add a using clause to the stats statement to further filter the data to match your fit statement, if needed.
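The ternary filter itself can be tried on inline data. This is a minimal sketch, assuming made-up values in $Demo and explicit limits ylo and yhi (both names are chosen here for illustration) rather than STATS_* variables. Points outside the y window evaluate to NaN and are skipped by fit.

```gnuplot
### sketch: fit only the points inside a y window
reset session
# made-up demo values: a lower and an upper branch
$Demo <<EOD
1 2.1
2 2.9
3 4.0
1 9.0
2 8.0
3 7.0
EOD
ylo = 0; yhi = 5       # hypothetical limits of the y window
f(x) = a*x + b
a = 1; b = 1           # initial guesses
set fit nolog quiet
# y values outside [ylo:yhi] become NaN, so only the lower branch is fitted
fit f(x) $Demo using 1:($2 >= ylo && $2 <= yhi ? $2 : NaN) via a,b
print sprintf("a = %g, b = %g", a, b)
### end of sketch
```

With these limits only the three points of the lower branch contribute; changing ylo and yhi to 5 and 10 would select the upper branch instead.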

Resources