Fit more than one block of data from the same file - gnuplot

I have these two blocks of data in the same file. Both represents a set of measurements that I want to fit then using a single script to compare each other. I know that it would be easier separate in two files and than fit each one separately but I'll have more than two blocks and it would be boring. Someone know how should I do it?.
I tried to use:
f(x) = a*x^b
f1(x) = a1*x^b1
fit f(x) "temp.dat" i 0 u 1:2:4 via a,b, f1(x) "temp.dat" i 1 u 1:2:4 via a1,b1
p f(x), "temp.dat" i 0 u 1:2:4 w yer, f1(x), "temp.dat" i 1 u 1:2:4
Thks
1 100 2.13048e-09 0.2 2.4178e-11
2 140 1.51668e-09 0.2 1.69698e-11
3 180 1.18001e-09 0.2 1.35081e-11
4
5 100 1.41599e-09 0.3 1.62087e-11
6 140 1.02526e-09 0.3 1.16511e-11
7 180 8.1794e-10 0.3 9.50745e-12

Note that your data file blocks should be separated by two blank lines in order to use the index option. Otherwise, with only one blank line, you need to use every.
That said, what you want to achieve can be done with eval and a do for loop:
do for [i=0:1] {
eval sprintf("f%i(x) = a%i + b%i * x", i, i, i)
eval sprintf("fit f%i(x) 'temp.dat' i %i via a%i, b%i", i, i, i, i)
}
plot "temp.dat" i 0, f0(x), "temp.dat" i 1, f1(x)

Related

Plotting Average curve for points in gnuplot

[Current]
I am importing a text file in which the first column has simulation time (0~150) the second column has the delay (0.01~0.02).
1.000000 0.010007
1.000000 0.010010
2.000000 0.010013
2.000000 0.010016
.
.
.
149.000000 0.010045
149.000000 0.010048
150.000000 0.010052
150.000000 0.010055
which gives me the plot:
[Desired]
I need to plot an average line on it like shown in the following image with red line:
Here is a gnuplot only solution with sample data:
set table "test.data"
set samples 1000
plot rand(0)+sin(x)
unset table
You should check the gnuplot demo page for a running average. I'm going to generalize this demo in terms of dynamically building the functions. This makes it much easier to change the number of points include in the average.
This is the script:
# number of points in moving average
n = 50
# initialize the variables
do for [i=1:n] {
eval(sprintf("back%d=0", i))
}
# build shift function (back_n = back_n-1, ..., back1=x)
shift = "("
do for [i=n:2:-1] {
shift = sprintf("%sback%d = back%d, ", shift, i, i-1)
}
shift = shift."back1 = x)"
# uncomment the next line for a check
# print shift
# build sum function (back1 + ... + backn)
sum = "(back1"
do for [i=2:n] {
sum = sprintf("%s+back%d", sum, i)
}
sum = sum.")"
# uncomment the next line for a check
# print sum
# define the functions like in the gnuplot demo
# use macro expansion for turning the strings into real functions
samples(x) = $0 > (n-1) ? n : ($0+1)
avg_n(x) = (shift_n(x), #sum/samples($0))
shift_n(x) = #shift
# the final plot command looks quite simple
set terminal pngcairo
set output "moving_average.png"
plot "test.data" using 1:2 w l notitle, \
"test.data" using 1:(avg_n($2)) w l lc rgb "red" lw 3 title "avg\\_".n
This is the result:
The average lags quite a bit behind the datapoints as expected from the algorithm. Maybe 50 points are too many. Alternatively, one could think about implementing a centered moving average, but this is beyond the scope of this question.
And, I also think that you are more flexible with an external program :)
Here's some replacement code for the top answer, which makes this also work for 1000+ points and much much faster. Only works in gnuplot 5.2 and later I guess
# number of points in moving average
n = 5000
array A[n]
samples(x) = $0 > (n-1) ? n : int($0+1)
mod(x) = int(x) % n
avg_n(x) = (A[mod($0)+1]=x, (sum [i=1:samples($0)] A[i]) / samples($0))
Edit
The updated question is about a moving average.
You can do this in a limited way with gnuplot alone, according to this demo.
But in my opinion, it would be more flexible to pre-process your data using a programming language like python or ruby and add an extra column for whatever kind of moving average you require.
The original answer is preserved below:
You can use fit. It seems you want to fit to a constant function. Like this:
f(x) = c
fit f(x) 'S1_delay_120_LT100_LU15_MU5.txt' using 1:2 every 5 via c
Then you can plot them both.
plot 'S1_delay_120_LT100_LU15_MU5.txt' using 1:2 every 5, \
f(x) with lines
Note that this is technique can be used with arbitrary functions, not just constant or lineair functions.
I wanted to comment on Franky_GT, but somehow stackoverflow didn't let me.
However, Franky_GT, your answer works great!
A note for people plotting .xvg files (e.g. after doing analysis of MD simulations), if you don't add the following line:
set datafile commentschars "##&"
Franky_GT's moving average code will result in this error:
unknown type in imag()
I hope this is of use to anyone.
For gnuplot >=5.2, probably the most efficient solution is using an array like #Franky_GT's solution.
However, it uses the pseudocolumn 0 (see help pseudocolumns). In case you have some empty lines in your data $0 will be reset to 0 which eventually might mess up your average.
This solution uses an index t to count up the datalines and a second array X[] in case a centered moving average is desired. Datapoints don't have to be equidistant in x.
At the beginning there will not be enough datapoints for a centered average of N points so for the x-value it will use every second point and the other will be NaN, that's why set datafile missing NaN is necessary to plot a connected line at the beginning.
Code:
### moving average over N points
reset session
# create some test data
set print $Data
y = 0
do for [i=1:5000] {
print sprintf("%g %g", i, y=y+rand(0)*2-1)
}
set print
# average over N values
N = 250
array Avg[N]
array X[N]
MovAvg(col) = (Avg[(t-1)%N+1]=column(col), n = t<N ? t : N, t=t+1, (sum [i=1:n] Avg[i])/n)
MovAvgCenterX(col) = (X[(t-1)%N+1]=column(col), n = t<N ? t%2 ? NaN : (t+1)/2 : ((t+1)-N/2)%N+1, n==n ? X[n] : NaN) # be aware: gnuplot does integer division here
set datafile missing NaN
plot $Data u 1:2 w l ti "Data", \
t=1 '' u 1:(MovAvg(2)) w l lc rgb "red" ti sprintf("Moving average over %d",N), \
t=1 '' u (MovAvgCenterX(1)):(MovAvg(2)) w l lw 2 lc rgb "green" ti sprintf("Moving average centered over %d",N)
### end of code
Result:

How to draw a filledcurve graph with GNUPlot by using 2 csv files

If I have 2 csv files ("CSV1.csv" dataname_1 and "CSV2.csv" dataname_2), how can I draw filled curve graph from the data of 2 csv files. The formats of these CSV files are identical, where 2 is timestamps and 5 is the value thus the using 2:5
I am trying this:
plot dataname_2 using 2:5 title "Above" with filledcurves above lc rgb 'blue',\
dataname_1 using 2:5 title "Below" with filledcurves below lc rgb 'red',\
dataname_2 using 2:5 title "Engine Starts" with lines lc rgb "#1E90FF",\
dataname_1 using 2:5 title "Engine Hours" with lines lc rgb "#FF1493"
I need to modify the code above so that the output is:
A solution which will probably always work is to prepare the data with whatever external tool in such a way that gnuplot can handle and plot it. I'm aware that the philosophy of gnuplot is to concentrate on plotting and not necessarily on data preparation for plotting. However, it is always good to have a minimum set of features to do some basic data preparation.
In your case you have several problems, well, let's call it challenges ;-)
with filledcurves requires data within the same file or datablock
however, gnuplot cannot easily merge datafiles line by line (it can with some workaround, see: https://stackoverflow.com/a/61559658/7295599). Simply appending would be no problem with gnuplot.
the latter doesn't help you since with filledcurves needs identical x for upper and lower curve, i.e. x,y1,y2 and your data has x1,y1 and x2,y2
however, gnuplot cannot easily resample data (it can with some workaround, see: Resampling data with gnuplot)
with filledcurves cannot directly fill curves with non-monotonic increasing x (not the case with your data. Here just for illustration purposes) (it can with some workaround see: https://stackoverflow.com/a/53769446/7295599 or https://stackoverflow.com/a/56176717/7295599)
So a workaround for all this could be the following (works with gnuplot 5.2, maybe can be tweaked to work with earlier versions):
Assumptions:
Data x1,y1 and x2,y2 in two files or datablocks
Data has not necessarily identical x, i.e. x1,y1 and x2,y2
Data may contain non-monotonic x
the two curves have only one intersection (well, the workaround below will just take the first one)
Procedure:
if not already, get the data into a datablock.
Find the intersection of the curves.
Create new datablocks: Filled1 using Data1 from the beginning to the intersection point and using Data2 backwards from the intersection point to the beginning. Filled2 using Data1 from the end backwards to the intersection point and using Data2 from the intersection point to the end.
Plot $Data1 and $Data2 with lines and $Filled1 and $Filled2 with filledcurves
Steps 2 and 3, probably will not be much shorter in another programming language unless there are dedicated functions.
Get files to datablock: (see also here gnuplot: load datafile 1:1 into datablock)
# get files to datablocks
set table $Data1
plot 'myFile1.dat' u 1:2 w table
set table $Data2
plot 'myFile2.dat' u 1:2 w table
unset table`
Code: (copy&paste for gnuplot >=5.2)
### fill intersecting curves from two files not having identical x
reset session
$Data1 <<EOD
1 1
2 0
4 1
3 3
5 5
6 6
8 8
9 9
EOD
$Data2 <<EOD
1 3
3.5 5
7.5 1
9 7
EOD
# orientation of 3 points a,b,c: -1=clockwise, +1=counterclockwise
Orientation(a,b,c) = sgn((word(b,1)-word(a,1))*(word(c,2)-word(a,2)) - \
(word(c,1)-word(a,1))*(word(b,2)-word(a,2)))
# check for intersection of segment a-b with segment c-d,
# 0=no intersection, 1=intersection
IntersectionCheck(a,b,c,d) = \
Orientation(a,c,b) == Orientation(a,d,b) || \
Orientation(c,a,d) == Orientation(c,b,d) ? 0 : 1
# coordinate of intersection
M(a,b) = real(word(a,1)*word(b,2) - word(a,2)*word(b,1))
N(a,b,c,d) = (word(a,1)-word(b,1))*(word(c,2)-word(d,2)) - \
(word(a,2)-word(b,2))*(word(c,1)-word(d,1))
Px(a,b,c,d) = (M(a,b)*(word(c,1)-word(d,1)) - (word(a,1)-word(b,1))*M(c,d))/N(a,b,c,d)
Py(a,b,c,d) = (M(a,b)*(word(c,2)-word(d,2)) - (word(a,2)-word(b,2))*M(c,d))/N(a,b,c,d)
Intersection(a,b,c,d) = sprintf("%g %g", Px(a,b,c,d), Py(a,b,c,d))
stop = 0
do for [i=1:|$Data1|-1] {
a = $Data1[i]
b = $Data1[i+1]
do for [j=1:|$Data2|-1] {
c = $Data2[j]
d = $Data2[j+1]
if (IntersectionCheck(a,b,c,d)) {
i0 = i; j0 = j
stop=1; break }
}
if (stop) { break }
}
# create the datablocks for the outline to be filled
set print $Filled1
do for [k=1:i0] { print $Data1[k] }
print Intersection(a,b,c,d)
do for [k=j0:1:-1] { print $Data2[k] }
set print $Filled2
do for [k=|$Data1|:i0+1:-1] { print $Data1[k] }
print Intersection(a,b,c,d)
do for [k=j0+1:|$Data2|] { print $Data2[k] }
set print
set key top left
plot $Filled1 u 1:2 w filledcurves lc rgb 0x3f48cc, \
$Filled2 u 1:2 w filledcurves lc rgb 0xed1c24, \
$Data1 u 1:2 w lp pt 7 lw 5 lc rgb 0x99d9ea, \
$Data2 u 1:2 w lp pt 7 lw 5 lc rgb 0xff80c0
### end of code
Result:

Loopsteps are getting skipped when if statement is inserted

I'm puzzled about the behaviour of my plotting script. I want to plot multiple files. In the last file i want different border settings. So i came up with an if statement. The script looks like:
labeltitles = "0.01 0.02 0.05 0.1 0.15 0.1885 0.2 0.25"
outnames = "0p01 0p02 0p05 0p1 0p15 0p1885 0p2 0p25"
do for [i=1:2] {
set border 4 + 1 ## top (4) + bottom (1)
if (i = words(labeltitles)) {set border 8 + 4 + 1} ## right (8) + top (4) + bottom (1)}
set xlabel 'z = '.word(labeltitles,i)
set out word(outnames,i).'.eps'
plot 'data.dat' u (column(i+1)):1 w l lt 1 lw 7 lc rgb '#444444'
}
When i run this script, only the last plot gets outputted. If i comment the if statement all plots get outputted. I also tried to add the else statement, but nothing changes.
A rather classical programming error: You have an assignment inside the if-condition instead of a comparison (==). It must be
if (i == words(labeltitles)) {set border 8 + 4 + 1}

GNUPLOT: Show a x value given a y value

i'm having some problems with gnuplot
I have to draw a cdf function and i'm interested in the values of variable x when F(x) is equal to 0.1 and 0.9
How can I tell Gnuplot to show me on the x axis the value corresponding to a given value on the y value (in my example those values are 0.1 and 0.9)
thanks
You're basically asking gnuplot to solve an equation. In your particular case, actually two equations: F(x)=0.1 and F(x)=0.9. As far as I know this cannot be done, but I might be wrong. What you can do if you simply want a graphical solution, is make a conditional plot, and ask that when F(x) is very close to 0.1 0.9, gnuplot plots something other than the function.
For example, assume f(x)=x^2 and you want to know "graphically" for which x f(x)=0.1. Then you can request the value abs(f(x) - 0.1) be small, for example < 0.01. Then tell gnuplot to go to zero (just an example!) if this is the case, otherwise plot f(x)=x^2:
f(x)=x**2
set xrange [-2:2]
set samples 1000
plot abs(f(x)-1) < 0.01 ? 0 : f(x)
Which yields:
The two peaks that go to zero mark graphically on the x axis the solution to the equation f(x)=0.1. Of course, you need gnuplot to sample this point in order to see a peak. Thus you need to play with set samples and set xrange.
From your question it is not clear whether you have a function F(x) as expression or just a x,y-data file. I assume that your function is monotonic increasing in x and y.
Two solutions come to my mind:
via simple linear interpolation
via curve fitting
Let's create some test data. For this, let's assume your function is known (as expression) and something like this (check help norm): F(x) = a*norm(b*x + c)
Let's take a = 1; b = 0.8; c = -4. In the example below, sampling will be only 8, just for illustration purpose.
You can easily set samples 200 and you will get the same results as for the curve fitting method below. From gnuplot 5.0 on, you could write the data into a datablock instead of a file on disk.
Data: SO22276755.dat
0 3.16712e-05
1.42857 0.002137
2.85714 0.043238
4.28571 0.283855
5.71429 0.716145
7.14286 0.956762
8.57143 0.997863
10 0.999968
Script 1: (basically works for gnuplot 4.6.0, March 2012)
### interpolate x-values
reset
FILE = "SO22276755.dat"
yis = '0.10 0.90'
yi(n) = real(word(yis,n))
xis = ''
xi(n) = real(word(xis,n))
Interpolate(yi) = (x1-x0)/(y1-y0)*(yi-y0) + x0
getXis(xis) = xis.(n=words(xis), n<words(yis) ? yi=real(word(yis,n+1)) : 0, \
y0<=yi && y1>=yi ? sprintf(" %g",Interpolate(yi)) : '')
set key left top noautotitle
set grid x,y
plot x1=y1=NaN FILE u (x0=x1,x1=$1):(y0=y1,y1=$2,xis=getXis(xis),y1) \
w l lc rgb "blue" ti "data", \
'+' u (xi=xi(int($0+1))):(yi=yi(int($0+1))):\
(sprintf("(%.4g|%.4g)",xi,yi)) every ::0::1 \
w labels point pt 7 lc rgb "red" right offset -1,0 ti "interpolated"
### end of script
Result:
Script 2: (basically works for gnuplot>=4.6.0, March 2012)
With this approach you are fitting your known function F(x) to constant lines, i.e. your desired values 0.1 and 0.9. For this, a file will be created (could be a datablock for gnuplot>=5.0) and it will basically look like this SO22276755.fit:
0 0.1
1 0.1
0 0.9
1 0.9
### interpolate x-values
reset
F(x) = a*norm(b*x+c) # function
a = 1
b = 0.8
c = -4
yis = '0.10 0.90'
yi(n) = real(word(yis,n))
xis = ''
xi(n) = real(word(xis,n))
set key left top noautotitle
set grid x,y
# create fit levels file
LEVELS = "SO22276755.fit"
set table LEVELS
set samples 2
plot for [i=1:words(yis)] '+' u (yi(i))
unset table
xmin = 0
xmax = 10
set xrange[xmin:xmax]
set samples 100
xis = ''
do for [i=1:words(yis)] {
xi = (xmin+xmax)*0.5 # set start value
fit F(xi) LEVELS u 1:2 index i-1 via xi
xis = xis.sprintf(" %g",xi)
}
plot F(x) w l lc rgb "web-green" ti "F(x)", \
'+' u (xi=xi(int($0+1))):(yi=yi(int($0+1))):(sprintf("(%.4g|%.4g)",xi,yi)) \
every ::0::1 w labels point pt 7 lc rgb "red" righ offset -1,0 ti "fitted"
### end of script
Result:

Different number of samples for different functions

plot x+3 , x**2+5*x+12
Is it possible to set x+3 to have only 2 samples and x**2+5*x+12 to have say 1000 samples in the same plot?
It can be done, but not out-of-the-box.
The first variant uses a temporary file to save one function with a low sampling rate and plotting it later together with the high-resolution function:
set samples 2
set table 'tmp.dat'
plot x+3
unset table
set samples 1000
plot 'tmp.dat' w lp t 'x+3', x**2 + 5*x + 12
This has the advantage, that you can use any sampling rates for both functions.
For you special case of 2 samples for one function, it can be done without an external file, but it involves quite some tricking:
set xrange [-10:10]
s = 1000
set samples s
f1(x) = x + 3
set style func linespoints
set style data linespoints
plot '+' using (x0 = (($0 == 0 || $0 == (s-1) )? $1 : x0), \
($0 < (s-2) ? 1/0 : x0)):(f1(x0)) t 'x+3',\
x**2 + 5*x + 12
What I did here is:
Use the special filename + to generate a set of coordinates in the current xrange. This must be set, no autoscaling is possible.
Skipping all points but the first and the last by giving them the value 1/0 doesn't work, because the two remaining points aren't connected.
So I store the first x-value (when $0, or column(0) equals 0) and use it when I encountered the second last points. For the last points, the usual values are used.
That works for your special case of 2 samples.
You must keep in mind, that the first function is treated as data, so you must use both set style data and set style func (just to show it).
The result with 4.6.4 is:
I am not sure if different samplings (as opposed to different ranges) are possible with gnuplot 5.x. If I missed that please let me know.
Here is a suggestion to have two different samplings in the same plot command without temporary files (or datablocks from gnuplot 5.0 on).
A requirement is a known xrange, i.e. it will work with autoscale only if you plot and replot the graph to automatically get xmin and xmax. For the second function you could also use '+' u 1:(f2($1)) w lp.
Script: (works for gnuplot>=4.4.0, March 2010)
### different samplings in one plot command
reset
set xrange[xmin=-10:xmax=10]
f1(x) = x+3
f2(x) = x**2 + 5*x + 12
s1 = 3 # sampling 1
s2 = 101 # sampling 2
set samples (s1>s2?s1:s2) # the higher value
dx1 = real(xmax-xmin)/(s1-1) # determine dx1 for f1
plot '+' u (x0=xmin+$0*dx1):(f1(x0)) every ::0::s1-1 w lp pt 7 ti sprintf("%d samples",s1), \
f2(x) w lp pt 7 ti sprintf("%d samples",s2)
### end of script
Result:

Resources