Gnuplot smoothing data in loglog plot - gnuplot

I would like to plot a smoothed curve based on a dataset which spans over 13 orders of magnitude [1E-9:1E4] in x and 4 orders of magnitude [1E-6:1e-2] in y.
MWE:
set log x
set log y
set xrange [1E-9:1E4]
set yrange [1E-6:1e-2]
set samples 1000
plot 'data.txt' u 1:3:(1) smooth csplines not
The smooth curve looks nice above x=10. Below, it is just a straight line down to the point at x=1e-9.
When increasing samples to 1e4, smoothing works well above x=1. For samples 1e5, smoothing works well above x=0.1 and so on.
Any idea on how to apply smoothing to lower data points without setting samples to 1e10 (which does not work anyway...)?
Thanks and best regards!
JP

As far as I understand, gnuplot samples on a linear (equidistant) grid. There may be a logarithmic sampling option, but I have not found one yet.
Here is a suggestion for a workaround which is not yet perfect but may act as a starting point.
The idea is to split your data for example into decades and to smooth them separately.
The drawback is that there might be some overlaps between the ranges. You can minimize or hide these by playing with set samples and every ::n, or there may be another way to eliminate the overlaps.
Code:
### smoothing over several orders of magnitude
reset session
# create some random test data
set print $Data
do for [p=-9:3] {
do for [m=1:9:3] {
print sprintf("%g %g", m*10**p, (1+rand(0))*10**(p/12.*3.-2))
}
}
set print
set logscale x
set logscale y
set format x "%g"
set format y "%g"
set samples 100
pMin = -9
pMax = 3
set table $Smoothed
myFilter(col,p) = (column(col)/10**p-1) < 10 ? column(col) : NaN
plot for [i=pMin:pMax] $Data u (myFilter(1,i)):2 smooth cspline
unset table
plot $Data u 1:2 w p pt 7 ti "Data", \
$Smoothed u 1:2 every ::3 w l ti "cspline"
### end of code
Result:
Addition:
Thanks to @maij, who pointed out that this can be simplified by mapping the whole range into log space before smoothing. In contrast to @maij's solution, I would let gnuplot handle the logarithmic axes and keep the actual plot command as simple as possible, at the cost of a few extra table plots.
Code:
### smoothing in loglog plot
reset session
# create some random test data
set print $Data
do for [p=-9:3] {
do for [m=1:9:3] {
print sprintf("%g %g", m*10**p, (1+rand(0))*10**(p/12.*3.-2))
}
}
set print
set samples 500
set table $SmoothedLog
plot $Data u (log10($1)):(log10($2)) smooth csplines
set table $Smoothed
plot $SmoothedLog u (10**$1):(10**$2) w table
unset table
set logscale x
set logscale y
set format x "%g"
set format y "%g"
set key top left
plot $Data u 1:2 w p pt 7 ti "Data", \
$Smoothed u 1:2 w l lc "red" ti "csplines"
### end of code
Result:

Using a logarithmic scale basically means plotting the logarithm of a value instead of the value itself. The set logscale command tells gnuplot to do this automatically:
1. read the data, still linear world, no logarithm yet
2. calculate the splines on an equidistant grid (smooth csplines), still linear world
3. calculate and plot the logarithms (set logscale)
The key point is the equidistant grid. Let's say one chooses set xrange [1E-9:10000] and set samples 101. In the linear world 1e-9 compared to 10000 is approximately 0, and the resulting grid will be 1E-9 ~ 0, 100, 200, 300, ..., 9800, 9900, 10000. The first grid point is at 0, the second one at 100, and gnuplot is going to draw a straight line between them. This does not change when afterwards logarithms of the numbers are plotted.
This is what you already noted in your question: for every additional decade towards smaller x, you need 10 times more samples to get a smooth curve.
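The grid argument above can be checked with a few lines outside gnuplot. The following is only a sketch in Python, assuming the sample positions are spread evenly between the range limits as described; the function names linear_grid and log_grid are made up for this illustration and are not gnuplot features.

```python
import math

def linear_grid(xmin, xmax, samples):
    """Equidistant grid in linear space (assumed model of the sampling described above)."""
    step = (xmax - xmin) / (samples - 1)
    return [xmin + i * step for i in range(samples)]

def log_grid(xmin, xmax, samples):
    """Equidistant grid in log10 space, mapped back to linear values."""
    lo, hi = math.log10(xmin), math.log10(xmax)
    step = (hi - lo) / (samples - 1)
    return [10 ** (lo + i * step) for i in range(samples)]

lin = linear_grid(1e-9, 1e4, 101)
log = log_grid(1e-9, 1e4, 101)

# Count how many grid points fall below x = 10 in each case.
lin_below = sum(1 for x in lin if x < 10)
log_below = sum(1 for x in log if x < 10)
print("linear grid points below 10:", lin_below)
print("log grid points below 10:   ", log_below)
```

On the linear grid only the very first point lies below x = 10, so ten decades of data share a single straight segment. On the log-spaced grid most of the points fall into that region.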
As a solution, I would suggest swapping the order of the two calculations: take the logarithms first, then calculate the splines.
# create some random test data, code "stolen" from @theozh (https://stackoverflow.com/a/66690491)
set print $Data
do for [p=-9:3] {
do for [m=1:9:3] {
print sprintf("%g %g", m*10**p, (1+rand(0))*10**(p/12.*3.-2))
}
}
set print
# this makes the splines smoother
set samples 1000
# manually account for the logarithms in the tic labels
set format x "10^{%.0f}" # for example this format
set format y "1e{%+03.0f}" # or this one
set xtics 2 # logarithmic world, tic distance in orders of magnitude
set ytics 1
# just "read logarithm of values" from file, before calculating splines
plot $Data u (log10($1)):(log10($2)) w p pt 7 ti "Data" ,\
$Data u (log10($1)):(log10($2)) ti "cspline" smooth cspline
This is the result:

Related

In gnuplot show only the maximum point of the graph and highlight it

I wrote the code below in gnuplot:
set xlabel "Time in Seconds"
set ylabel "Resistance in Ohms"
while(1){
set multiplot layout 2, 1 title " " font ",12"
set tmargin 1.5
set title "MQ7 Gas Sensor Data"
unset key
plot 'putty2.log' using 0:1 with lines ,'' using 0:2:2 with labels center boxed bs 1 notitle column
set title "MQ9 Gas Sensor Data"
unset key
plot 'putty2.log' using 0:3 with lines
pause 1;
reread;
}
This code draws a multiplot of the data file 'putty2.log'. After running it I got this:
but I want to show only the maximum point in the 1st multigraph.
Any help will be appreciated.
As a starting point, the following script is a simple way to identify maxima in noisy curves. In fact, generating the random test data takes almost more lines than the maxima extraction itself.
On the smoothened curve you simply check if the 3 consecutive y-values y0,y1,y2 fulfil y0<y1 && y1>y2, then you have a maximum at y1.
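The three-point check can be written out in a few lines. This is just a sketch in Python to illustrate the condition; local_maxima is a hypothetical helper name, and the small data set is invented for the example.

```python
def local_maxima(xs, ys):
    """Return (x, y) pairs where y is strictly larger than both neighbours."""
    maxima = []
    # The first and last point have only one neighbour and are skipped.
    for i in range(1, len(ys) - 1):
        y0, y1, y2 = ys[i - 1], ys[i], ys[i + 1]
        if y0 < y1 and y1 > y2:
            maxima.append((xs[i], y1))
    return maxima

# Invented example data, assumed to be already smoothed.
xs = [0, 1, 2, 3, 4, 5, 6]
ys = [1, 3, 2, 2, 5, 4, 6]
print(local_maxima(xs, ys))
```

Because the comparison is strict, a flat top (two equal neighbouring values) is not reported, and the rising last point is ignored since it has no right neighbour.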
Smoothing via smooth bezier might not be suitable for all types of data. Maybe some averaging together with smoothing would lead to better results.
For example, in the example below the human eye would also detect maxima at 35 and 42.
Furthermore, if you also want to display the y-values of the maxima, the Bezier smoothing will probably return values that are too low compared to what averaging would give.
I hope you can optimize the script for your data and special needs.
Script:
### find maxima on smoothened data
reset session
# create some random test data
set table $Backbone
set samples 30
plot [0:100] '+' u 1:(rand(0)*10+10) w table
set table $CSpline
set samples 1000
plot $Backbone u 1:2 smooth cspline
set table $Data
noise(h) = (rand(0)*2-1)*h
spike(p,h) = rand(0) < p ? (rand(0)*2-1)*h : 0
plot $CSpline u 1:($2 + noise(1) + spike(0.2,3)) w table
unset table
# smooth the data to facilitate identification of maxima
set table $Smooth
set samples 200
plot $Data u 1:2 smooth bezier
unset table
# simple maxima extraction
set table $Maxima
plot x2=x1=y2=y1=NaN $Smooth u (x0=x1,x1=x2,x2=$1,y0=y1,y1=y2,y2=$2, y0<y1 && y1>y2 ? x1 : NaN):(y1) w table
unset table
set yrange[0:]
set key noautotitle
plot $Data u 1:2 w l lc "red", \
$Smooth u 1:2 w l lc "blue", \
$Maxima u 1:2 w impulses lc "black", \
'' u 1:(0):(sprintf("%.2f",$1)) w labels left offset 1,0.5 rotate by 90 tc "blue"
### end of script
Result:

Gnuplot: oscilloscope-like line style?

Is it possible in Gnuplot to emulate the drawing style of an analogue oscilloscope, meaning thinner and dimmer lines at larger amplitudes, like this?
The effect you see in the oscilloscope trace is not due to amplitude, it is due to the rate of change as the trace is drawn. If you know that rate of change and can feed it to gnuplot as a third column of values, then you could use it to modulate the line color as it is drawn:
plot 'data' using 1:2:3 with lines linecolor palette z
I don't know what color palette would work best for your purpose, but here is an approximation using a function with an obvious, known, derivative.
set palette gray
set samples 1000
plot '+' using ($1):(sin($1)):(abs(cos($1))) with lines linecolor palette
For thickness variations, you could shift the curve slightly up and down, and fill the area between them.
f(x) = sin(2*x) * sin(30*x)
dy = 0.02
plot '+' u 1:(f(x)+dy):(f(x)-dy) w filledcurves ls 1 notitle
This does not allow variable colour, but the visual effect is similar.
Another approach:
As @Ethan already stated, the intensity is roughly proportional to the speed of movement, i.e. the derivative. If you have sin(x) as waveform, the derivative is cos(x). But what if you only have data points? Then you have to calculate the derivative numerically.
Furthermore, depending on the background the line should fade from white (minimal derivative) to fully transparent (maximum derivative), i.e. you should change the transparency with the derivative.
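The two steps just described (numerical derivative, then derivative mapped to transparency) can be sketched as follows. This is an illustration in Python under the assumption that the colour is a 32-bit ARGB value in which an alpha byte of 0 means opaque and 0xff means fully transparent; the function names are hypothetical.

```python
def finite_differences(xs, ys):
    """Approximate dy/dx between each pair of consecutive data points."""
    return [(ys[i] - ys[i - 1]) / (xs[i] - xs[i - 1]) for i in range(1, len(xs))]

def alpha_from_derivative(d, dmin, dmax):
    """Scale |d| linearly to 0..255: slow trace -> 0 (opaque), fast trace -> 255 (transparent)."""
    return int((abs(d) - dmin) / (dmax - dmin) * 0xFF)

def argb_white(alpha):
    """Put the alpha byte in front of white (0xffffff), giving 0xAARRGGBB."""
    return (alpha << 24) + 0xFFFFFF

# Invented test data: y = x**2 sampled at integer x.
xs = [0, 1, 2, 3]
ys = [x * x for x in xs]

slopes = [abs(d) for d in finite_differences(xs, ys)]
dmin, dmax = min(slopes), max(slopes)
colors = [argb_white(alpha_from_derivative(d, dmin, dmax)) for d in slopes]
print([hex(c) for c in colors])
```

The slowest segment gets plain white and the fastest one becomes fully transparent, which is the fading behaviour described above.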
Code:
### oscilloscope "imitation"
reset session
set term wxt size 500,400 butt # option butt, otherwise you will get overlap points
set size ratio 4./5
set samples 1000
set xrange[-5:5]
# create some test data
f(x) = 1.5*sin(15*x)*(cos(1.4*x)+1.5)
set table $Data
plot '+' u 1:(f($1)) w table
unset table
set xtics axis 1 format ""
set mxtics 5
set grid xtics ls -1
set yrange[-4:4]
set ytics axis 1 format ""
set mytics 5
set grid ytics ls -1
ColorScreen = 0x28a7e0
set obj 1 rect from screen 0,0 to screen 1,1 behind
set obj 1 fill solid 1.0 fc rgb ColorScreen
x0=y0=NaN
Derivative(x,y) = (dx=x-x0,x0=x,x-dx/2,dy=y-y0,y0=y,dy/dx) # approx. derivative
# get min/max derivative
set table $Dummy
plot n=0 $Data u (d=abs(Derivative($1,$2)),n=n+1,n<=2? (dmin=dmax=d) : \
(dmin>d ? dmin=d:dmin), (dmax<d?dmax=d:dmax)) w table
unset table
myColor(x,y) = (int((abs(Derivative(column(x),column(y)))-dmin)/(dmax-dmin)*0xff)<<24) +0xffffff
plot $Data u 1:2:(myColor(1,2)) w l lw 1.5 lc rgb var not
### end of code
Result:

Gnuplot: Scatter plot and density

I have x- and y-data points representing a star cluster. I want to visualize the density using Gnuplot and its scatter function with overlapping points.
I used the following commands:
set style fill transparent solid 0.04 noborder
set style circle radius 0.01
plot "data.dat" u 1:2 with circles lc rgb "red"
The result:
However I want something like that
Is that possible in Gnuplot? Any ideas?
(edit: revised and simplified)
Probably a much better way than my previous answer is the following:
For each data point, check how many other data points are within a radius of R. You need to play with the value of R to get a reasonable graph.
Indexing the datalines requires gnuplot>=5.2.0 and the data in a datablock (without empty lines). You can either first plot your file into a datablock (check help table) or see here:
gnuplot: load datafile 1:1 into datablock
The time for creating this graph grows quadratically with the number of points, i.e. O(N^2), because each point has to be checked against all others. I'm not sure if there is a smarter and faster method. The example below with 1200 datapoints takes about 4 seconds on my laptop. You can basically apply the same principle in 3D.
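To make the counting idea and its O(N^2) cost explicit, here is a small sketch in Python. It is not a replacement for the gnuplot script; neighbour_density is a hypothetical name, and, as an assumption made here, each point counts itself as its own neighbour.

```python
import math

def neighbour_density(points, radius):
    """For each point, count the points within `radius` and divide by the circle area."""
    area = math.pi * radius ** 2
    result = []
    for x0, y0 in points:              # outer loop over N points
        count = 0
        for x1, y1 in points:          # inner loop over N points -> N*N distance checks
            if math.hypot(x1 - x0, y1 - y0) <= radius:
                count += 1             # note: includes the point itself
        result.append((x0, y0, count / area))
    return result

# Invented example: two close points and one isolated point.
pts = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0)]
for x, y, d in neighbour_density(pts, 0.5):
    print(x, y, round(d, 3))
```

The two nearby points end up with twice the density of the isolated one, which is what the colour scale in the plot encodes.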
Script: works with gnuplot>=5.2.0
### 2D density color plot
reset session
t1 = time(0.0)
# create some random test data
set table $Data
set samples 700
plot '+' u (invnorm(rand(0))):(invnorm(rand(0))) w table
set samples 500
plot '+' u (invnorm(rand(0))+2):(invnorm(rand(0))+2) w table
unset table
print sprintf("Time data creation: %.3f s",(t0=t1,t1=time(0.0),t1-t0))
# for each datapoint: how many other datapoints are within radius R
R = 0.5 # Radius to check
Dist(x0,y0,x1,y1) = sqrt((x1-x0)**2 + (y1-y0)**2)
set print $Density
do for [i=1:|$Data|] {
x0 = real(word($Data[i],1))
y0 = real(word($Data[i],2))
c = 0
stats $Data u (Dist(x0,y0,$1,$2)<=R ? c=c+1 : 0) nooutput
d = c / (pi * R**2) # density: points per unit area
print sprintf("%g %g %d", x0, y0, d)
}
set print
print sprintf("Time density check: %.3f sec",(t0=t1,t1=time(0.0),t1-t0))
set size ratio -1 # same screen units for x and y
set palette rgb 33,13,10
plot $Density u 1:2:3 w p pt 7 lc palette z notitle
### end of script
Result:
Would it be an option to postprocess the image with ImageMagick?
# convert into a gray scale image
convert source.png -colorspace gray -sigmoidal-contrast 10,50% gray.png
# build the gradient, the heights have to sum up to 256
convert -size 10x1 gradient:white-white white.png
convert -size 10x85 gradient:red-yellow \
gradient:yellow-lightgreen \
gradient:lightgreen-blue \
-append gradient.png
convert gradient.png white.png -append full-gradient.png
# finally convert the picture
convert gray.png full-gradient.png -clut target.png
I have not tried but I am quite sure that gnuplot can plot the gray scale image directly.
Here is the (rotated) gradient image:
This is the result:
Although this question is rather "old" and the problem might have been solved differently...
It's probably more for curiosity and fun than for practical purposes.
The following code implements a coloring according to the density of points using gnuplot only. On my older computer it takes a few minutes to plot 1000 points. I would be interested if this code can be improved especially in terms of speed (without using external tools).
It's a pity that gnuplot does not offer basic functionality like sorting, look-up tables, merging, transposing or other basic functions (I know... it's gnuPLOT... and not an analysis tool).
The code:
### density color plot 2D
reset session
# create some dummy datablock with some distribution
N = 1000
set table $Data
set samples N
plot '+' u (invnorm(rand(0))):(invnorm(rand(0))) w table
unset table
# end creating dummy data
stats $Data u 1:2 nooutput
XMin = STATS_min_x
XMax = STATS_max_x
YMin = STATS_min_y
YMax = STATS_max_y
XRange = XMax-XMin
YRange = YMax-YMin
XBinCount = 20
YBinCount = 20
BinNo(x,y) = floor((y-YMin)/YRange*YBinCount)*XBinCount + floor((x-XMin)/XRange*XBinCount)
# do the binning
set table $Bins
plot $Data u (BinNo($1,$2)):(1) smooth freq # with table
unset table
# prepare final data: XPos, YPos, BinNo, Sum
set print $FinalData
do for [i=0:N-1] {
set table $Data3
plot $Data u (BinNumber = BinNo($1,$2),$1):(XPos = $1,$1):(YPos = $2,$2) every ::i::i with table
plot [BinNumber:BinNumber+0.1] $Bins u (BinNumber == $1 ? (PointsInBin = $2,$2) : NaN) with table
print sprintf("%g\t%g\t%g\t%g", XPos, YPos, BinNumber, PointsInBin)
unset table
}
set print
# plot data
set multiplot layout 2,1
set rmargin at screen 0.85
plot $Data u 1:2 w p pt 7 lc rgb "#BBFF0000" t "Data"
set xrange restore # use same xrange as previous plot
set yrange restore
set palette rgbformulae 33,13,10
set colorbox
# draw the bin borders
do for [i=0:XBinCount] {
XBinPos = i/real(XBinCount)*XRange+XMin
set arrow from XBinPos,YMin to XBinPos,YMax nohead lc rgb "grey" dt 1
}
do for [i=0:YBinCount] {
YBinPos = i/real(YBinCount)*YRange+YMin
set arrow from XMin,YBinPos to XMax,YBinPos nohead lc rgb "grey" dt 1
}
plot $FinalData u 1:2:4 w p pt 7 ps 0.5 lc palette z t "Density plot"
unset multiplot
### end of code
The result:

Sample linear interpolation of data file

I have a data file example.dat with xy values, for example
0 10
1 40
5 20
How can I sample the linear interpolation of these points in gnuplot? I want to store that sampling in another file output.dat using set table. With cubic spline smoothing I can do
set table "output.dat"
set samples 10
plot "example.dat" smooth csplines
which yields an equidistant sampling of the cubic spline interpolation with 10 points. But I found no way to have such an equidistant sampling with linear interpolation: The sampling rate is just ignored (gnuplot 5.0).
I tried without any options and with linear interpolation "smoothing", like smooth unique, hoping that this would make gnuplot think of the dataset as a function which can be sampled, but to no avail.
My application is sampling different data files at a common grid for later comparison. I am aware that this is pushing the boundaries of what gnuplot is intended for, but since there is already a sampling mechanism I wonder if I am simply missing something.
In case this might still be of interest, the following is a "gnuplot only" solution. Not very elegant, but it seems to work.
### "gnuplot only" linear interpolation of data
reset session
$Data <<EOD
0 10
1 40
5 20
EOD
stats $Data u 1 nooutput
min = STATS_min
max = STATS_max
Samples=10
Interpolate(x0,y0,x1,y1,xi) = y0 + (y1-y0)/(x1-x0)*(xi-x0)
set print $Interpol
set table $Nowhere
do for [i=1:Samples] {
xi = min + (i-1)*(max-min)/(Samples-1)
do for [j=0:STATS_records-2] {   # last segment starts at the second-to-last record
plot $Data u (a=$1,$1):(b=$2,$2) every ::j::j with table
plot $Data u (c=$1,$1):(d=$2,$2) every ::j+1::j+1 with table
if ( xi>=a && xi<=c) {
print sprintf("%g\t%g",xi,Interpolate(a,b,c,d,xi))
break
}
}
}
unset table
set print
set colorsequence classic
plot $Data u 1:2 w lp t "original data",\
$Data u 1:2 w lp smooth cspline t "smooth cspline",\
$Interpol u 1:2 w p pt 6 t "linear interpolation"
### end code
I hope I understood the question properly. You get an equidistant sampling between 0 and 5, which with 10 samples gives a step of 5/9=0.555556. To get a distance of 0.5 between your samples, assuming xrange[0:5], you should use set samples 11.
However, if you want to stick to 10 samples and all in steps of 0.5, you can tweak your xrange[0.5:5.0], which will create 9 steps of 0.5.
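For comparison, the linear interpolation on an equidistant grid that the question asks for can be sketched in Python as follows. This is only an illustration of the arithmetic, assuming the x-values are sorted in ascending order; resample_linear is a hypothetical helper, not a gnuplot feature.

```python
def resample_linear(xs, ys, samples):
    """Linearly interpolate (xs, ys) at `samples` equidistant x positions.

    Assumes xs is sorted ascending and samples >= 2.
    """
    xmin, xmax = xs[0], xs[-1]
    out = []
    for i in range(samples):
        # step is (xmax - xmin) / (samples - 1); clamp to guard against rounding
        xi = min(xmin + (xmax - xmin) * i / (samples - 1), xmax)
        for j in range(len(xs) - 1):
            if xs[j] <= xi <= xs[j + 1]:
                slope = (ys[j + 1] - ys[j]) / (xs[j + 1] - xs[j])
                out.append((xi, ys[j] + slope * (xi - xs[j])))
                break
    return out

# Data from the question.
xs = [0, 1, 5]
ys = [10, 40, 20]
for xi, yi in resample_linear(xs, ys, 10):
    print("%g\t%g" % (xi, yi))
```

With 10 samples the step is 5/9, and with 11 samples it becomes 0.5, matching the step sizes worked out above.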

The function disappears close to zero

I have a problem with plotting a fitted function.
The part of the plotted function close to zero disappears and is connected to a hyperbola or something which should not be there at all. This happens only if I set the lower limit of xrange to something smaller than 0. I have to do this because I have a lot of data points close to zero, so the plot would look very ugly if I did not change it.
I tried to use the conditional x>0?f(x):1/0 but it does not help. The hyperbola disappears, but the function does not continue down as it should.
I use this code:
set terminal postscript eps size 3.5,2.62 enhanced color
set output "a.eps"
set xrange [-1:]
f(x)=a*b*x/(1+a*x)
fit f(x) "./a" via a, b
plot "./a" w p title "", f(x) w l title "Langmuir isotherm"
That is simply a matter of sampling. The default sampling rate is 100 (show samples), which isn't enough to show fast-varying functions. Increase the sampling rate with e.g.
set samples 1000
to have your function plotted correctly.
A second point is that discontinuities are not shown properly if no sample is located exactly at that position. Consider the following plot to demonstrate this:
set xrange [-1:1]
set multiplot layout 2,1
set samples 100
plot 1/x
set samples 101
plot 1/x
unset multiplot
So, if you want to plot the function correctly on both sides of the discontinuity, you must either define a small region around the discontinuity as undefined, or you plot the parts on the left and right separately:
set xrange [-1:]
f(x)=a*b*x/(1+a*x)
fit f(x) "./a" via a, b
left(x) = (x < -1/a ? f(x) : 1/0)
right(x) = (x > -1/a ? f(x) : 1/0)
plot "./a" w p title "", left(x) w l lt 2 title "Langmuir isotherm", right(x) w l lt 2 notitle
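Why 100 versus 101 samples makes a difference for 1/x on [-1:1] can be checked with a little arithmetic. The following Python sketch assumes the sample positions are spread evenly between the range limits; sample_positions is a hypothetical helper used only for this illustration.

```python
def sample_positions(xmin, xmax, samples):
    """Evenly spaced sample positions between xmin and xmax (assumed sampling model)."""
    return [xmin + (xmax - xmin) * i / (samples - 1) for i in range(samples)]

for n in (100, 101):
    xs = sample_positions(-1.0, 1.0, n)
    closest = min(xs, key=abs)          # sample nearest to the pole at x = 0
    print(n, "samples: hits x=0 exactly:", 0.0 in xs, " closest sample:", closest)
```

With 101 samples one position falls exactly on the pole, where 1/x is undefined, so the line is interrupted there. With 100 samples the two nearest positions lie on either side of the pole and get joined by a spurious line.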
