I want to fit the following data:
70 0.0429065
100 0.041212
150 0.040117
200 0.035018
250 0.024366
300 0.02017
350 0.018255
400 0.015368
to the following function, which is a combination of an exponential and a Gaussian:
$ f(x)= a1*(a2* exp(-x/T2e)+ exp(-(x/T2g)**2))
$ fit f(x) 'data' via a1,a2,T2e,T2g
But it keeps giving me the following results:
a1 = 0.0720021 +/- 0.04453 (61.84%)
a2 = 0.310022 +/- 0.9041 (291.6%)
T2e = 63291.7 +/- 2.029e+07 (3.206e+04%)
T2g = 252.79 +/- 32.36 (12.8%)
However, when I try to fit it separately to
$ g(x)=b* exp(-(x/T2g)**2)
$ fit g(x) 'data' via b,T2g
I get
b = 0.0451053 +/- 0.001598 (3.542%)
T2g = 359.359 +/- 16.89 (4.701%)
and
$ S(x)=S0* exp(-x/T2e)
$ fit S(x) 'data' via S0,T2e
gives:
S0 = 0.057199 +/- 0.003954 (6.913%)
T2e = 319.257 +/- 38.17 (11.96%)
I already tried to set the initial values but it didn't change the results.
Does anybody know what is wrong?
Thank you,
OK, you can see an exponential decay with a hump, which could be a Gaussian.
Here is how I arrived at a fit: first, exclude the data points at 100 and 150 and fit the exponential, then place a Gaussian at approximately x = 170.
You probably don't get a good fit because, at the very least, the Gaussian peak is shifted by some offset x1.
With the code:
### fitting
reset session
$Data <<EOD
70 0.0429065
100 0.041212
150 0.040117
200 0.035018
250 0.024366
300 0.02017
350 0.018255
400 0.015368
EOD
a = 0.055
T2e = 310
b = 0.008
x1 = 170
T2g = 54
Exponential(x) = a*exp(-x/T2e)
Gaussian(x) = b*exp(-((x-x1)/T2g)**2)
f(x) = Exponential(x) + Gaussian(x)
fit f(x) $Data u 1:2 via a,b,x1,T2e,T2g
plot $Data u 1:2 w lp pt 7, f(x) lc rgb "red"
### end of code
You'll get:
a = 0.0535048 +/- 0.00183 (3.42%)
b = 0.00833589 +/- 0.001006 (12.06%)
x1 = 170.356 +/- 5.664 (3.325%)
T2e = 315.114 +/- 12.94 (4.106%)
T2g = 54.823 +/- 12.13 (22.12%)
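As a cross-check outside gnuplot, the same model and starting values can be fed to SciPy's curve_fit (a sketch; assumes NumPy/SciPy are available and mirrors the script above):

```python
import numpy as np
from scipy.optimize import curve_fit

# the data from the question
x = np.array([70, 100, 150, 200, 250, 300, 350, 400], dtype=float)
y = np.array([0.0429065, 0.041212, 0.040117, 0.035018,
              0.024366, 0.02017, 0.018255, 0.015368])

# exponential decay plus a Gaussian shifted by x1, as in the gnuplot script
def model(x, a, T2e, b, x1, T2g):
    return a*np.exp(-x/T2e) + b*np.exp(-((x - x1)/T2g)**2)

# same starting values as above: a, T2e, b, x1, T2g
p0 = [0.055, 310.0, 0.008, 170.0, 54.0]
popt, pcov = curve_fit(model, x, y, p0=p0)
a, T2e, b, x1, T2g = popt
print(popt)
```

With good starting values both fitters converge to essentially the same minimum; without them, the broad exponential and the shifted Gaussian can trade amplitude, which is exactly the degeneracy seen in the original fit.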
So here is what I'm trying to do.
The values on the x axis run 10000, 20000, 30000, ..., 100000, and I want to write them as 10, 20, 30, 40, ..., 100 (x axis only).
Is there some way to do this in Gnuplot?
I have this so far:
(data.dat - example of data)
# x y
10000 +1.24241522E-04
11000 +1.28623514E-04
12000 +1.35229020E-04
13000 +1.43767741E-04
14000 +1.53409148E-04
15000 +1.63788695E-04
16000 +1.75429485E-04
17000 +1.88827813E-04
18000 +2.02984785E-04
19000 +2.20830420E-04
...
(my gnuplot script)
set term png
set out 'example.png'
U0 = 0.00732 #parameters for this particular problem
v1 = 68000
b1 = 6550
v2 = 59600
b2 = 6050
I = sqrt(-1)
A(w, w0, b) = ((w0)**2)/(((w0)**2) - ((w)**2) + 2*I*w*b)
f(x) = U0*abs(A(2*pi*x, 2*pi*v1, b1) - A(2*pi*x, 2*pi*v2, b2))
set xlabel "x"
set ylabel "y"
fit f(x) 'data.dat' u 1:2 via U0, v1, b1, v2, b2
plot 'data.dat' u 1:2 t "Title1" w p, f(x) t "Title2"
set out
But how do I do this?
I've tried this example
How to scale the axes in Gnuplot
but it doesn't work.
See below.
# I modified things a little bit
f(x) = (.... ... ....)/1000
fit f(x) 'data.dat' u ($1/1000.):2 via U0, v1, b1, v2, b2
plot 'data.dat' u ($1/1000.):2 t "Title1" w p, f(x) t "Title2"
But now the fitted function disappears!
How can I modify x-axis without other function disappearing?
Is there a gnuplot command for this? I'm sure there has to be a more elegant way of doing this instead of dividing each function by the desired factor.
Two possible ways come to my mind:
If you want to avoid too many zeros in the xtic labels, simply set the xtic label format to engineering notation:
set format x "%.0s%c"
This will show, e.g. 10000 and 100000 as 10k and 100k, respectively.
If you scale (in your case: divide) the x values of the data by a factor of 1000, gnuplot will use this x range for plotting the function f(x). Since this gives x values which are a factor of 1000 too small for the function, you have to scale the function's argument by a factor of 1000 accordingly (in your case: multiply), i.e. plot f(x*1000).
Code:
### avoid too many zeros in xtic labels
reset session
# create some random test data
set print $Data
A = rand(0)*10+5
B = rand(0)*50000+25000
C = rand(0)*5000+5000
do for [i=10000:100000:500] {
print sprintf("%g %g",i,A*exp(-((real(i)-B)/C)**2))
}
set print
a=1; b=50000; c=5000 # give some reasonable starting values
f(x) = a*exp(-((x-b)/c)**2)
set fit quiet nolog
fit f(x) $Data u 1:2 via a,b,c
set multiplot layout 1,2
set format x "%.0s%c" # set xtics to engineering
plot $Data u 1:2 w p, \
f(x) w l lc "red"
set format x "%g" # set xtics to default
plot $Data u ($1/1000):2 w p, \
f(x*1000) w l lc "red"
unset multiplot
### end of code
Result:
I have data (data can be downloaded here: gauss_data) and need to find the area of a particular peak. From my data set, the peak seems to have some contribution from another peak. I made the fit on my data with 3 Gaussians using this code:
# Gaussian fit
reset
set terminal wxt enhanced
# Set fitting function
f(x) = g1(x)+g2(x)+g3(x)
g1(x) = p1*exp(-(x-m1)**2/(2*s**2))
g2(x) = p2*exp(-(x-m2)**2/(2*s2**2))
g3(x) = p3*exp(-(x-m3)**2/(2*s3**2))
# Estimation of each parameter
p1 = 100000
p2 = 2840
p3 = 28000
m1 = 70
m2 = 150
m3 = 350
s = 25
s2 = 100
s3 = 90
# Fitting & Plotting data
fit [0:480] f(x) 'spectrum_spl9.txt' via p1, m1, s, p2, m2, s2, p3, m3, s3
plot [0:550] 'spectrum_spl9.txt' lc rgb 'blue', f(x) ls 1, g1(x) lc rgb 'black', g2(x) lc rgb 'green' , g3(x) lc rgb 'orange'
and the result is shown in fig below
I need to calculate the area under the peak i.e. area f(x) - area g3(x). Is there any way to find/calculate the area of each function in Gnuplot?
Your data are equidistant in x with a step width of 1, so you can simply sum up the intensity values multiplied by the step width (which is 1). With irregularly spaced data this would be a bit more complicated.
Code:
### determination of area below curve
reset session
FILE = "SO/spectrum_spl9.txt"
# fitting function
f(x) = g1(x)+g2(x)+g3(x)
g1(x) = p1*exp(-(x-m1)**2/(2*s1**2))
g2(x) = p2*exp(-(x-m2)**2/(2*s2**2))
g3(x) = p3*exp(-(x-m3)**2/(2*s3**2))
# Estimation of each parameter
p1 = 100000
p2 = 2840
p3 = 28000
m1 = 70
m2 = 150
m3 = 350
s1 = 25
s2 = 100
s3 = 90
set fit quiet nolog
fit [0:480] f(x) FILE via p1, m1, s1, p2, m2, s2, p3, m3, s3
set table $Difference
plot myIntegral=0, FILE u 1:(myIntegral=myIntegral+f($1)-g3($1), f($1)-g3($1)) w table
unset table
set samples 500 # set samples to plot the functions
plot [0:550] FILE u 1:2 w p lc 'blue' ti FILE noenhanced, \
f(x) ls 1, \
g1(x) lc rgb 'black', \
g2(x) lc rgb 'green', \
g3(x) lc rgb 'orange', \
$Difference u 1:2 w filledcurves lc rgb 0xddff0000 ti sprintf("Area: %.3g",myIntegral)
### end of code
Result:
Can you use the analytic integral under a Gaussian function?
y(x) = 1/(s*sqrt(2*pi)) * exp(-(x-m1)**2/(2*s**2))
integral(y) [-inf:inf] = 1
This would mean that:
I1 = integral(g1) = p1 * s1 * sqrt(2.0*pi)
I2 = integral(g2) = p2 * s2 * sqrt(2.0*pi)
area f(x) - area g3(x) = I1 + I2
Please double check the math :)
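The math checks out; a quick numerical sketch (the parameter values are just examples) confirms the closed form p*s*sqrt(2*pi):

```python
import numpy as np

# example amplitude, center and width (hypothetical values)
p1, m1, s1 = 100000.0, 70.0, 25.0

# Gaussian in the same form as g1(x): p1*exp(-(x-m1)**2/(2*s1**2))
x = np.linspace(m1 - 10*s1, m1 + 10*s1, 200001)
g = p1*np.exp(-(x - m1)**2/(2*s1**2))

numeric  = g.sum() * (x[1] - x[0])    # Riemann sum over +/-10 sigma
analytic = p1*s1*np.sqrt(2*np.pi)     # closed-form area
print(numeric, analytic)
```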
I am new to gnuplot. I have a non-linear data set and want to fit only within its linear range. I normally specify the fit range with the following commands and redo the fit, changing the range manually until I get the optimum range:
fit [0.2:0.6]f(x) "data.txt" u 2:3:6 yerror via m1,m2
plot "<(sed -n '15,500p' data.txt)" u 2:3:6 w yerr title 'Window A',[0:.6] f(x) notitle lc rgb 'black'
Is it possible to iteratively run the fit within some data range to obtain the optimum data range for the fit in Gnuplot?
The data is typically like this one:
data
Your data (I named the file 'mas_data.txt') looks like the following (please always show/provide relevant data in your question).
Data: (how to plot with zoom-in)
### plotting data with zoom-in
reset session
FILE = 'mas_data.txt'
colX = 2
colY = 3
set key top left
set multiplot
plot FILE u colX:colY w lp pt 7 ps 0.3 lc rgb "red" ti "Data"
set title "Zoom in"
set origin 0.45,0.1
set size 0.5, 0.6
set xrange [0:1.0]
plot FILE u colX:colY w lp pt 7 ps 0.3 lc rgb "red" ti "Data"
unset multiplot
### end of code
Regarding the "optimum" fitting range, you could try the following procedure:
1. Find the absolute y-minimum of your data using stats (see help stats).
2. Limit the x-range from this minimum to the maximum x-value.
3. Do a linear fit with f(x)=a*x+b and remember the standard error of the slope (here: a_err).
4. Reduce the x-range by a factor of 2.
5. Go back to step 3 until you have reached the number of iterations (here: N=10).
6. Find the minimum of Aerr[i] and get the corresponding x-range.
The assumption is that if the relative error (Aerr[i]) has a minimum, then you will have the "best" fitting range for a linear fit starting from the minimum of your data.
However, I'm not sure if this procedure will be robust for all of your datasets. Maybe there are smarter procedures. Of course, you can also decrease the xrange in different steps. This procedure could be a starting point for further adaptions and optimizations.
Code:
### finding "best" fitting range
reset session
FILE = 'mas_data.txt'
colX = 2
colY = 3
stats FILE u colX:colY nooutput # do some statistics
MinY = STATS_min_y # minimum y-value
MinX = STATS_pos_min_y # x position of minimum y-value
Xmax = STATS_max_x # maximum x-value
XRangeMax = Xmax-MinX
f(x,a,b) = a*x + b
set fit quiet nolog
N = 10
array A[N]
array B[N]
array Aerr[N]
array R[N]
set print $myRange
do for [i=1:N] {
XRange = XRangeMax/2**(i-1)
R[i] = MinX+XRange
fit [MinX:R[i]] f(x,a,b) FILE u colX:colY via a,b
A[i] = a
Aerr[i] = a_err/a*100 # asymptotic standard error in %
B[i] = b
print sprintf("% 9.3g % 9.3f %g",MinX,R[i],Aerr[i])
}
set print
print $myRange
set key bottom right
set xrange [0:1.5]
plot FILE u colX:colY w lp pt 7 ps 0.3 lc rgb "red" ti "Data", \
for [i=1:N] [MinX:R[i]] f(x,A[i],B[i]) w l lc i title sprintf("%.2f%%",Aerr[i])
stats [*:*] $myRange u 2:3 nooutput
print sprintf('"Best" fitting range %.3f to %.3f', MinX, STATS_pos_min_y)
### end of code
Result:
Zoom-in xrange[0:1.0]
0.198 19.773 1.03497
0.198 9.985 1.09066
0.198 5.092 1.42902
0.198 2.645 1.53509
0.198 1.421 1.81259
0.198 0.810 0.659631
0.198 0.504 0.738046
0.198 0.351 0.895321
0.198 0.274 2.72058
0.198 0.236 8.50502
"Best" fitting range 0.198 to 0.810
I want to reproduce this effect in gnuplot:
How can I achieve it? If it can't be done, what software can I use to reproduce it?
Applying a 2D kernel to every point can be done inside gnuplot. That way, denser accumulations get brighter than single points. Check show palette rgbformulae and the respective chapter in the help to change the colours.
set term wxt size 300,300 background rgb 0
set view map
set samp 140
set dgrid3d 180,180, gauss kdensity2d 0.2,0.2
set palette rgbform 4,4,3
splot "+" us 1:(sin($1/3)**2*20):(1) with pm3d notitle
Disclaimer: It can be done with gnuplot as instructed in this answer but you should probably consider a different tool to draw this particular type of plot.
There is at least one way to do it, with preprocessing of the data. The idea is to mimic the glow effect by using a Gaussian kernel to smear the data points. Consider the following data, contained in a file called data:
1 2
1 2.1
1.1 2.2
2 3
3 4
I have purposely placed the first 3 points close to each other to be able to observe the intensified glow of neighboring points. These data look like this:
Now we smear the data points using a 2D Gaussian kernel. I have written the following python code to help with this. The code has a cutoff of 4 standard deviations (sx and sy) around each point. If you want the glow to be a circle, you should choose the standard deviations so that the sx / sy ratio is the same as the ratio of the x/y axes lengths in gnuplot. Otherwise the points will look like ellipses. This is the code:
import numpy as np
import sys

filename = str(sys.argv[1])
sx = float(sys.argv[2])
sy = float(sys.argv[3])

def f(x, y, x0, y0, sx, sy):
    return np.exp(-(x-x0)**2/2./sx**2 - (y-y0)**2/2./sy**2)

data = []
with open(filename, 'r') as datafile:
    for datapoint in datafile:
        a, b = datapoint.split()
        data.append([float(a), float(b)])

xmin = min(p[0] for p in data)
xmax = max(p[0] for p in data)
ymin = min(p[1] for p in data)
ymax = max(p[1] for p in data)

xmin -= 4.*sx
xmax += 4.*sx
ymin -= 4.*sy
ymax += 4.*sy
dx = (xmax - xmin) / 250.
dy = (ymax - ymin) / 250.

for i in np.arange(xmin, xmax + dx, dx):
    for j in np.arange(ymin, ymax + dy, dy):
        s = 0.
        for k in range(len(data)):
            d2 = (i - data[k][0])**2 + (j - data[k][1])**2
            if d2 < (4.*sx)**2 + (4.*sy)**2:
                s += f(i, j, data[k][0], data[k][1], sx, sy)
        print(i, j, s)
It is used as follows:
python script.py data sx sy
where script.py is the name of the file where the code is located, data is the name of the data file, and sx and sy are the standard deviations.
Now, back to gnuplot, we define a palette that mimics a glowing pattern. For isolated points, the summed Gaussians yield 1 at the position of the point; for overlapping points it yields values higher than 1. You must consider that when defining the palette. The following is just an example:
set cbrange [0:3]
unset colorbox
set palette defined (0 "black", 0.5 "blue", 0.75 "cyan", 1 "white", 3 "white")
plot "< python script.py data 0.05 0.05" w image
You can see that the points are actually ellipses, because the ratio of the axes lengths is not the same as that of the standard deviations along the different directions. This can be easily fixed:
plot "< python script.py data 0.05 0.06" w image
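As a side note, the triple loop in the Python script can be vectorized with NumPy broadcasting; a minimal sketch with the same toy data and kernel (the 4-sigma cutoff is omitted for brevity):

```python
import numpy as np

# the five toy data points from above
data = np.array([[1.0, 2.0], [1.0, 2.1], [1.1, 2.2], [2.0, 3.0], [3.0, 4.0]])
sx = sy = 0.05

def smear(X, Y):
    # broadcast every grid point against all data points and sum the Gaussians
    dx2 = (np.asarray(X)[..., None] - data[:, 0])**2
    dy2 = (np.asarray(Y)[..., None] - data[:, 1])**2
    return np.exp(-dx2/(2.0*sx**2) - dy2/(2.0*sy**2)).sum(axis=-1)

# an isolated point sums to ~1; overlapping neighbours push it above 1
iso = float(smear(3.0, 4.0))
overlap = float(smear(1.0, 2.05))
print(iso, overlap)
```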
Set a black background, and then plot your dataset several times in different colours with decreasing pointsize.
set term wxt backgr rgb "black"
plot sin(x) w p pt 7 ps 2 lc rgb 0x00003f not, \
sin(x) w p pt 7 ps 1.5 lc rgb 0x00007f not, \
sin(x) w p pt 7 ps 1 lc rgb 0x0000af not, \
sin(x) w p pt 7 ps .5 lc rgb 0x0000ff
Alternatively, some combination of splot with pm3d, set dgrid3d gauss kdensity2d, and set view map, combined with a suitable palette, can be used; see my other answer.
I have a function f(x) = a/x and I have a set of data containing values for f(x) +- df(x) and x +- dx. How do I tell gnuplot to do a weighted fit for a with that?
I know that fit accepts the using qualifier, and this works for df(x), but it does not work for dx. It seems gnuplot treats the error I supply for x as the error for the whole RHS of my function (a/x +- dx).
How do I do a weighted fit that fits f(x) +- df(x) = a/(x +- dx) to find the optimal a?
Since version 5.0, gnuplot has an explicit provision for taking uncertainty in the independent variable into account:
fit f(x) datafile using 1:2:3:4 xyerror
using "Orear's effective variance method".
(The above command expects data in the form x y dx dy.)
You're fitting an equation like:
z = a/(x +- dx)
This can be equivalently written as:
z = a/x +- dz
for an appropriate dz.
I think (if my calculus and statistics serve me correctly) that you can calculate dz from x and dx by
dz = (partial z / partial x) * dx
provided that dx is small.
For this case, that yields:
dz = -a/x**2*dx
So now you have a function of 2 variables (z = a/x - (a/x**2)*dx) that you want to fit for 1 constant. Of course, I could be wrong about this ... It's been a while since I've played with this stuff.
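That propagation can be sketched numerically (synthetic data; the sigma values and the one-pass scheme are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
a_true = 5.0
x_true = np.linspace(1.0, 10.0, 50)
dx, dy = 0.05, 0.02                      # assumed uncertainties on x and y

x = x_true + rng.normal(0.0, dx, x_true.size)
y = a_true/x_true + rng.normal(0.0, dy, x_true.size)

# first pass: unweighted least squares for z = a/x
a0 = np.sum(y/x) / np.sum(1.0/x**2)

# propagate dx into an effective dz: dz = |dz/dx|*dx = (a/x**2)*dx
sigma_eff = np.sqrt(dy**2 + (a0/x**2 * dx)**2)

# second pass: weighted least squares with weights 1/sigma_eff**2
w = 1.0/sigma_eff**2
a_hat = np.sum(w*y/x) / np.sum(w/x**2)
print(a_hat)
```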
Here a simple example will suffice to prove that gnuplot is doing what you want:
Construct a flat text file data.dat with the following toy model data:
#f df x dx
1 0.1 1 0.1
2 0.1 2 0.1
3 0.1 3 0.1
4 0.1 4 0.1
10 1.0 5 0.1
Just by eyeing the data, we expect a good model to be a direct proportionality f = x, with one obvious outlier at x = 5, f = 10. We may tell gnuplot to fit this data in two very different ways. The following script weightedFit.gp demonstrates the difference:
# This file is called weightedFit.gp
#
# Gnuplot script file for demonstrating the difference between a
# weighted least-squares fit and an unweighted fit, using mock data in "data.dat"
#
# columns in the .dat are
# 1:f, 2:d_f, 3: x, 4: d_x
# x is the independent variable and f is the dependent variable
# you have to play with the terminal settings based on your system
# set term wxt
#set term xterm
set autoscale # scale axes automatically
unset log # remove any log-scaling
unset label # remove any previous labels
set xtic auto # set xtics automatically
set ytic auto # set ytics automatically
set key top left
# change plot labels!
set title "Weighted and Un-weighted fits"
set xlabel "x"
set ylabel "f(x)"
#set key 0.01,100
# start with these commented for auto-ranges, then zoom where you want!
set xr [-0.5:5.5]
#set yr [-50:550]
#this allows you to access ASE values of var using var_err
set fit errorvariables
## fit syntax is x:y:Delta_y column numbers from data.dat
#Fit data as linear, allowing intercept to float
f(x)=m*x+b
fW(x)=mW*x+bW
# Here's the important difference. First fit with no uncertainty weights:
fit f(x) 'data.dat' using 3:1 via m, b
chi = sprintf("chiSq = %.3f", FIT_WSSR/FIT_NDF)
t = sprintf("f = %.5f x + %.5f", m, b)
errors = sprintf("Delta_m = %.5f, Delta_b = %.5f", m_err, b_err)
# Now, weighted fit by properly accounting for uncertainty on each data point:
fit fW(x) 'data.dat' using 3:1:2 via mW, bW
chiW = sprintf("chiSqW = %.3f", FIT_WSSR/FIT_NDF)
tW = sprintf("fW = %.5f x + %.5f", mW, bW)
errorsW = sprintf("Delta_mW = %.5f, Delta_bW = %.5f", mW_err, bW_err)
# Pretty up the plot
set label 1 errors at 0,8
set label 2 chi at 0,7
set label 3 errorsW at 0,5
set label 4 chiW at 0,4
# Save fit results to disk
save var 'fit_params'
## plot using x:y:Delta_x:Delta_y column numbers from data.dat
plot "data.dat" using 3:1:4:2 with xyerrorbars title 'Measured f vs. x', \
f(x) title t, \
fW(x) title tW
set term jpeg
set output 'weightedFit.jpg'
replot
set term wxt
The generated plot weightedFit.jpg tells the story: the green fit does not take data point uncertainties into account and is a bad model for the data. The blue fit accounts for the large uncertainty in the outlier and properly fits the proportionality model, obtaining slope 1.02 +/- 0.13 and intercept -0.05 +/- 0.35.
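The two fits are easy to reproduce outside gnuplot with closed-form (weighted) least squares, which makes the difference explicit (a sketch in NumPy using the same toy data):

```python
import numpy as np

# the toy data from data.dat: f, df, x (dx plays no role in a y-error-only fit)
f  = np.array([1.0, 2.0, 3.0, 4.0, 10.0])
df = np.array([0.1, 0.1, 0.1, 0.1, 1.0])
x  = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

def linfit(x, y, w):
    # closed-form weighted least squares for y = m*x + b
    S, Sx, Sy = w.sum(), (w*x).sum(), (w*y).sum()
    Sxx, Sxy = (w*x*x).sum(), (w*x*y).sum()
    D = S*Sxx - Sx**2
    return (S*Sxy - Sx*Sy)/D, (Sxx*Sy - Sx*Sxy)/D

m, b   = linfit(x, f, np.ones_like(x))   # unweighted: every point counts equally
mW, bW = linfit(x, f, 1.0/df**2)         # weighted by 1/sigma**2
print(m, b)    # 2.0, -2.0: dragged toward the outlier
print(mW, bW)  # ~1.0246, ~-0.0493: matches the weighted gnuplot fit
```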
As I just joined today, I lack the '10 reputation' needed to post images so you'll just have to run the script yourself to see the fit. Once you have the script and data file in your working directory, do:
gnuplot> load 'weightedFit.gp'
Your fit.log should look like this:
*******************************************************************************
Thu Aug 20 14:09:57 2015
FIT: data read from 'data.dat' using 3:1
format = x:z
x range restricted to [-0.500000 : 5.50000]
#datapoints = 5
residuals are weighted equally (unit weight)
function used for fitting: f(x)
f(x)=m*x+b
fitted parameters initialized with current variable values
iter chisq delta/lim lambda m b
0 1.0000000000e+01 0.00e+00 4.90e+00 2.000000e+00 -2.000000e+00
1 1.0000000000e+01 0.00e+00 4.90e+02 2.000000e+00 -2.000000e+00
After 1 iterations the fit converged.
final sum of squares of residuals : 10
rel. change during last iteration : 0
degrees of freedom (FIT_NDF) : 3
rms of residuals (FIT_STDFIT) = sqrt(WSSR/ndf) : 1.82574
variance of residuals (reduced chisquare) = WSSR/ndf : 3.33333
Final set of parameters Asymptotic Standard Error
======================= ==========================
m = 2 +/- 0.5774 (28.87%)
b = -2 +/- 1.915 (95.74%)
correlation matrix of the fit parameters:
m b
m 1.000
b -0.905 1.000
*******************************************************************************
Thu Aug 20 14:09:57 2015
FIT: data read from 'data.dat' using 3:1:2
format = x:z:s
x range restricted to [-0.500000 : 5.50000]
#datapoints = 5
function used for fitting: fW(x)
fW(x)=mW*x+bW
fitted parameters initialized with current variable values
iter chisq delta/lim lambda mW bW
0 2.4630541872e+01 0.00e+00 1.78e+01 1.024631e+00 -4.926108e-02
1 2.4630541872e+01 0.00e+00 1.78e+02 1.024631e+00 -4.926108e-02
After 1 iterations the fit converged.
final sum of squares of residuals : 24.6305
rel. change during last iteration : 0
degrees of freedom (FIT_NDF) : 3
rms of residuals (FIT_STDFIT) = sqrt(WSSR/ndf) : 2.86534
variance of residuals (reduced chisquare) = WSSR/ndf : 8.21018
p-value of the Chisq distribution (FIT_P) : 1.84454e-005
Final set of parameters Asymptotic Standard Error
======================= ==========================
mW = 1.02463 +/- 0.1274 (12.43%)
bW = -0.0492611 +/- 0.3498 (710%)
correlation matrix of the fit parameters:
mW bW
mW 1.000
bW -0.912 1.000
See http://gnuplot.info/ for documentation. Cheers!