I would like to draw a straight line that represents the average of a curve. I am plotting my data like this:
plot 'dataset' u 2:4 w p smooth bezier
My data consists of multiple columns and I get something like this:
Any ideas on how to do it? I guess it is more an interpolation than an average. The ups and downs of the curve are not relevant, and it would be much better to have a straight line interpolating the curve...
Fitting a straight line is fairly easy with fit. However, how could I fit a curve that does not look like a well-known curve? Let me show you an example. How could I fit a smooth curve through the main group of points? Please note that there is some noise in the lower part of the graph that I would not like to represent.
If you want to do some basic statistics on your data, gnuplot has a builtin command stats which may do what you want. Gnuplot offers some internal variables after plotting that contain data about min, max, etc. To see what these are, type show variables all after plotting your data.
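As a rough sketch of the stats route, assuming the same file and columns as in the question ('dataset', columns 2 and 4): when stats is given two columns it also stores a least-squares slope and intercept, so both a mean level and a straight line can be drawn without calling fit.

```gnuplot
# Assumption: 'dataset' with x in column 2 and y in column 4, as in the question.
stats 'dataset' using 2:4 nooutput

# STATS_mean_y is the average of the y column;
# STATS_slope and STATS_intercept describe the least-squares line y = slope*x + intercept.
plot 'dataset' using 2:4 with points title 'data', \
     STATS_mean_y with lines title 'mean of y', \
     STATS_slope*x + STATS_intercept with lines title 'least-squares line'
```

The horizontal line answers the "average" reading of the question, while the sloped line answers the "straight line through the curve" reading.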
Otherwise if you want to fit your data to a line, gnuplot does that as well:
f(x) = a*x + b
fit f(x) 'data.dat' using 2:4 via a,b
plot 'data.dat' using 2:4, f(x)
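For the follow-up about a smooth curve through the main group of points, one possible sketch (not something stated in the answer above) is to drop the unwanted low points inside the using expression and let a smoothing option draw the curve. The threshold ymin is a hypothetical value that would have to be chosen by looking at the data.

```gnuplot
# Assumption: points with y below ymin are the noise to be ignored.
ymin = 0.5   # hypothetical threshold, adjust to the data

# 1/0 marks a point as undefined, so it is skipped by the smoothing.
plot 'dataset' using 2:4 with points title 'all points', \
     'dataset' using 2:($4 > ymin ? $4 : 1/0) smooth acsplines with lines title 'smoothed main group'
```

Other smoothing options such as smooth bezier can be substituted in the same place.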
Related
Let's say I have a fitted curve in gnuplot (or simply the sin(x) function) and a file with data points near the function. How can I compute the distance of the points from the curve and write them to the data file in gnuplot? Is it possible to easily implement a sum of squares in gnuplot? Thank you very much.
Your question seems to mix two different concepts. If the curve was fitted to the points then the component term in the sum-of-squares uses the difference in y values. I.e. for a point [xi, yi] the term is (func(xi) - yi)**2.
But this is not the same thing as "distance of the point from the curve", since the nearest point on the curve may be at some different x value. The answer to that question in general requires calculus and is not something that gnuplot is designed to help you with, although if you work out the relevant equation you could use gnuplot's "fit" to find the minimum by approximation rather than by solving the differential equation analytically.
To plot the residuals after fitting
Assume data points [xi, yi] in columns 1 and 2 of file "data".
Assume fit(x) is the function you got from fitting. Then you can plot the squared residual for each point:
plot 'data' using 1:( (fit($1)-$2)**2 ) with linespoints
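The question also asked for the sum of squares and for writing the differences to a file. A minimal sketch, assuming the same two-column file "data", using sin(x) as a stand-in for the fitted function, and a gnuplot version recent enough to support "with table" (5.2 or later, as far as I know). The output name 'residuals.dat' is made up for this example.

```gnuplot
# Stand-in for the fitted function; replace with your own.
f(x) = sin(x)

# Sum of squared residuals: stats on a single computed column sets STATS_sum.
stats 'data' using ((f($1) - $2)**2) nooutput
print "sum of squares: ", STATS_sum

# Write x, y and the residual f(x) - y to a new file (hypothetical name).
set table 'residuals.dat'
plot 'data' using 1:2:(f($1) - $2) with table
unset table
```

This writes the vertical difference in y, not the geometric distance to the curve discussed above.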
I am new to Gnuplot and unfortunately have to start with a (for me) nontrivial problem. I have X-Y-Z-Temperature data. So I have for every spatial coordinate a temperature value.
This comes closest to what I want:
http://pgfplots.net/tikz/examples/contour-and-surface/
However, I would like to create a heat map (not a contour plot) on the XY, XZ and YZ planes to visualise the 4D data better (in the link it is just 3D).
So each plane would show just a heat map, all using the same color code so that the temperatures can be compared.
Many thanks!
Toby
You can make a '4d' plot with a palette, e.g.:
splot '3d.dat' u 1:2:3:4 palette pt 9
So you mean, e.g., plotting a trihedron T(x,y,z=0), T(x=0,y,z) and T(x,y=0,z)? This should be possible with multiplot and rotating the view between each plot. This will be a fair amount of hacking, so the first question would be why you don't use other visualization software like paraview or mayavi. These are better suited to this type of data, unless you need the flexibility of gnuplot either in terms of scripting or in terms of plotting analytical functions on the same graph.
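A simpler variant of the multiplot idea, sketched under several assumptions: the file '3d.dat' has columns x, y, z, T, the data actually contain points lying exactly on the planes z=0, y=0 and x=0, and three flat 2D panels are acceptable instead of a rotated 3D view. A common cbrange keeps the colors comparable between panels.

```gnuplot
# Use one color range for all panels, taken from the temperature column.
stats '3d.dat' using 4 nooutput
set cbrange [STATS_min:STATS_max]

set multiplot layout 1,3

# XY plane: keep only points with z == 0 (others become undefined via 1/0).
set xlabel 'x'; set ylabel 'y'
plot '3d.dat' using ($3 == 0 ? $1 : 1/0):2:4 with points pt 5 palette notitle

# XZ plane: keep only points with y == 0.
set xlabel 'x'; set ylabel 'z'
plot '3d.dat' using ($2 == 0 ? $1 : 1/0):3:4 with points pt 5 palette notitle

# YZ plane: keep only points with x == 0.
set xlabel 'y'; set ylabel 'z'
plot '3d.dat' using ($1 == 0 ? $2 : 1/0):3:4 with points pt 5 palette notitle

unset multiplot
```

If the slice coordinates are not exactly zero in the file, the equality tests would need to be replaced by a small tolerance.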
gnuplot will plot a data file with vectors.
I would like to set something like isosamples=40 and
have gradient vectors plotted for a 2D function.
I know I could write a python program to generate
the data file for the vectors but
I would prefer to do the entire operation within gnuplot.
Any advice?
Would this be a worthy improvement to gnuplot if not yet implemented?
You can use the vectors style and do the computations in gnuplot using the special filename ++.
Suppose that I wish to graph a direction field for the differential equation y'=y*sin(x). I could do this with the following:
set xrange[-2:2]
set yrange[-2:2]
set samples 30
set isosamples 30
unset key
f(x,y) = y*sin(x)
lf(x,y) = sqrt(1+f(x,y)**2)
lyf(x,y) = f(x,y)/lf(x,y)
plot '++' u 1:2:(0.1/lf($1,$2)):(0.1*lyf($1,$2)) with vectors
All of my calculations are done in gnuplot. I compute the direction of the vectors, and scale them to have a length of 0.1, all in the plot command.
I use the special filename '++' which generates a set of points equally spaced over the x and y plot ranges.
See 'help special-filenames' for more details on that.
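The same approach can be adapted to the gradient field asked about in the question. A sketch, assuming an example function g(x,y) and a central-difference step h that are both made up here; an analytic gradient could be substituted if it is known.

```gnuplot
set xrange [-2:2]
set yrange [-2:2]
set samples 20
set isosamples 20
unset key

# Hypothetical example function; replace with your own.
g(x,y) = sin(x)*cos(y)

# Numerical gradient by central differences (h is an assumed step size).
h = 1e-4
gx(x,y) = (g(x+h,y) - g(x-h,y)) / (2*h)
gy(x,y) = (g(x,y+h) - g(x,y-h)) / (2*h)

# s only scales the arrows so that neighbouring vectors do not overlap.
s = 0.15
plot '++' using 1:2:(s*gx($1,$2)):(s*gy($1,$2)) with vectors
```

Unlike the direction field above, the arrows here are not normalized, so their length reflects the magnitude of the gradient.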
I am trying to fit this plot. As you can see, the fit does not match the data well.
My code is:
clear
reset
set terminal pngcairo size 1000,600 enhanced font 'Verdana,10'
set output 'LocalEnergyStepZoom.png'
set ylabel '{/Symbol D}H/H_0'
set xlabel 'n_{step}'
set format y '%.2e'
set xrange [*:*]
set yrange [1e-16:*]
f(x) = a*x**b
fit f(x) "revErrEnergyGfortCaotic.txt" via a,b
set logscale
plot 'revErrEnergyGfortCaotic.txt' w p,\
'revErrEnergyGfortRegular.txt' w p,\
f(x) w l lc rgb "black" lw 3
exit
So the question is: what mistake am I making here? I would expect that, in a log-log plane, a fit of the form I put in the code should represent the data very well.
Thanks a lot
In the end I was able to solve the problem using the suggestion in the answer of Christop, modified just a bit.
I found the approximate slope of the function (something near -4). Keeping this parameter fixed, I fitted the curve with only a; once a was found, I fixed it and fitted only b. After that, using these results as the starting solution, I ran the full fit and found the best fit.
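The staged procedure described above might look roughly like this. The starting value for a is an assumption for illustration; only the slope near -4 comes from the description.

```gnuplot
f(x) = a*x**b

# Starting guesses: b from the approximate slope, a is an assumed placeholder.
a = 1
b = -4

# Stage 1: keep b fixed, fit only a.
fit f(x) "revErrEnergyGfortCaotic.txt" via a

# Stage 2: keep the a just found, fit only b.
fit f(x) "revErrEnergyGfortCaotic.txt" via b

# Stage 3: use both results as the starting point for the full fit.
fit f(x) "revErrEnergyGfortCaotic.txt" via a,b
```

Parameters that are not listed after via keep their current value, which is what makes the staging work.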
You must find appropriate starting values to get a correct fit, because that kind of fitting doesn't have one global solution.
If you don't define a and b, both are set to 1 which might be too far away. Try using
a = 100
b = -3
for a better start. Maybe you need to tweak those values a bit more; I couldn't because I don't have the data file.
Also, you might want to restrict the region of the fitting to the part above 10:
fit [10:] f(x) "revErrEnergyGfortCaotic.txt" via a,b
Of course, only if that is appropriate.
This is a common issue in data analysis, and I'm not certain if there's a nice Gnuplot way to solve it.
The issue is that the penalty functions in standard fitting routines are typically the sum of squares of errors, and try as you might, if your data have a lot of dynamic range, the errors for the smallest y-values come out to essentially zero from the point of view of the algorithm.
I recently taught a course to students where they needed to fit such data. Lots of them beat their (matlab) fitting routines into submission by choosing very stringent convergence criteria, but even this did not help too much.
What you really need to do, if you want to fit this power-law tail well, is to convert the data into log-log form and run a linear regression on that log-log representation.
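In gnuplot this could be sketched as follows, assuming x and y are in columns 1 and 2 of the file from the question and that all values are strictly positive (otherwise the logarithm is undefined). The names g, m and c are introduced only for this example.

```gnuplot
# Straight line in log-log space: log(y) = m*log(x) + c
g(x) = m*x + c
fit g(x) "revErrEnergyGfortCaotic.txt" using (log($1)):(log($2)) via m,c

# Convert back to the power law y = a*x**b.
a = exp(c)
b = m

set logscale
plot "revErrEnergyGfortCaotic.txt" with points, a*x**b with lines
```

Because the fit is linear in m and c, it does not depend on good starting values in the way the direct power-law fit does.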
The main problem here is that the residual errors of the function values of the higher x are very small compared to the residuals at lower x values. After all, you almost span 20 orders of magnitude on the y axis.
Just weight the y values with 1/y**2, or even better: if you have the standard deviations of your data points weight the values with 1/std**2. Then the fit should converge much much better.
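In gnuplot, weighting is done through a third data column. Note that gnuplot interprets this column as the standard deviation s of each point and weights the point with 1/s**2, so a weight of 1/y**2 corresponds to passing y itself:
fit f(x) 'data' using 1:2:2 via ...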
Or you can follow Raman Shah's advice: linearize the data by taking logarithms and do a linear regression.
You need to use weights for your fit (currently the low values are not treated as important) and a better starting guess (via "pars_file.pars").
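A sketch of the parameter-file part, assuming the starting values suggested in the other answer (a = 100, b = -3) and writing the file from within gnuplot for convenience; it could equally be created in a text editor.

```gnuplot
# Write the starting values, one "name = value" line per parameter.
set print "pars_file.pars"
print "a = 100"
print "b = -3"
set print            # restore normal print output

f(x) = a*x**b

# Read the starting values from the file instead of listing variables.
fit f(x) "revErrEnergyGfortCaotic.txt" via "pars_file.pars"
```

Weights are not included in this sketch; they would be added through an extra column in a using clause.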
I have been wondering about this for a while, and it might already be implemented in gnuplot but I haven't been able to find info online.
When you have a data file, it is possible to exchange the axes and assign the "dummy variable", say x, (in gnuplot's help terminology) to the vertical axis:
plot "data" u 1:2 # x goes to horizontal axis, standard
plot "data" u 2:1 # x goes to vertical axis, exchanged axes
However, when you have a function, you need to resort to a parametric function to do this. Imagine you want to plot x = y² (as opposed to y = x²); then (as far as I know) you need to do:
set parametric
plot t**2,t
which works nicely in this case. I think however that a more flexible approach would be desirable, something like
plot x**2 axes y1x1 # this doesn't work!
Is something like the above implemented, or is there an easy way to use y as dummy variable without the need to set parametric?
So here is another ugly, but gnuplot-only variant: Use the special filename '+' to generate a dynamic data set for plotting:
plot '+' using ($1**2):1
The development version contains a new feature, which allows you to use dummy variables instead of column numbers for plotting with '+':
plot sample [y=-10:10] '+' using (y**2):(y)
I guess that's what comes closest to your request.
From what I have seen, parametric plots are the usual way to achieve this.
If you really hate parametric plots and you are not afraid of a VERY ugly solution, I can give you my method...
My trick is to use a data file filled with a sequence of numbers. To fit your example, let's make a file sq with a sequence of reals from -10 to 10 :
seq -10 .5 10 > sq
And then you can do the magic you want using gnuplot :
plot 'sq' u ($1**2):($1)
And if you use Linux, you can also put the command directly in the plot command:
plot '< seq -10 .5 10' u ($1**2):($1)
I want to add that I'm not proud of this solution and I'd love the "axes y1x1" functionality too.
As far as I know there is no way to simply invert or exchange the axes in gnuplot when plotting a function.
The reason comes from the way functions are plotted in the normal plotting mode. There is a set of points at even intervals along the x axis which are sampled (frequency set by set samples) and the function value computed. This only allows for well-behaved functions; one y-value per x-value.
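To illustrate the one-y-per-x restriction with the x = y² example from the question: in normal (non-parametric) mode the curve has to be split into two single-valued branches. The range and sample count below are arbitrary choices for this sketch.

```gnuplot
set xrange [0:100]
set samples 500     # number of x positions at which each function is evaluated

# x = y**2 has two y-values per x, so each branch is plotted as its own function.
plot sqrt(x) title 'upper branch', -sqrt(x) title 'lower branch'
```

This only works when the inverse can be written down explicitly, which is why the parametric form is usually the more general route.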