How to remove line between "jumping" values, in gnuplot? - gnuplot

I would like to draw a line with plots that contain "jumping" values.
Here is an example: when we have plots of sin(x) for several cycles and plot it, unrealistic line will appear that go across from right to left (as shown in following figure).
One idea to avoid this might be using with linespoints (link), but I want to draw it without revising the original data file.
Do we have simple and robust solution for this problem?

Assuming that you are plotting a function, that is, for each x value there exists one and only one corresponding y value, the easiest way to achieve what you want is to use the smooth unique option. This smoothing routine will make the data monotonic in x, then plot it. When several y values exist for the same x value, the average will be used.
Example:
Data file:
0.5 0.5
1.0 1.5
1.5 0.5
0.5 0.5
Plotting without smoothing:
set xrange [0:2]
set yrange [0:2]
plot "data" w l
With smoothing:
plot "data" smooth unique

Edit: points are lost if this solution is used, so I suggest to improve my answer.
Here can be applied "conditional plotting". Suppose we have a file like this:
1 2
2 5
3 3
1 2
2 5
3 3
i.e. there is a backline between 3rd and 4th point.
plot "tmp.dat" u 1:2
Find minimum x value:
stats "tmp.dat" u 1:2
prev=STATS_min_x
Or find first x value:
prev=system("awk 'FNR == 1 {print $1}' tmp.dat")
Plot the line if current x value is greater than previous, or don't plot if it's less:
plot "tmp.dat" u ($0==0? prev:($1>prev? $1:1/0), prev=$1):2 w l

OK, it's not impossible, but the following is a ghastly hack. I really advise you add an empty line in your dataset at the breaks.
$dat << EOD
1 1
2 2
3 3
1 5
2 6
3 7
1 8
2 9
3 10
EOD
plot for [i=0:3] $dat us \
($0==0?j=0:j=j,llx=lx,lx=$1,llx>lx?j=j+1:j=j,i==j?$1:NaN):2 w lp notit
This plots your dataset three times (acually four, there is a small error in there. I guess i have to initialise all variables), counts how often the abscissa values "jump", and only plots datapoints if this counter j is equal to the plot counter i.
Check the help on the serial evaluation operator "a, b" and the ternary operator "a?b:c"

If you have data in a repetitive x-range where the corresponding y-values do not change, then #Miguel's smooth unique solution is certainly the easiest.
In a more general case, what if the x-range is repetitive but y-values are changing, e.g. like a noisy sin(x)?
Then compare two consecutive x-values x0 and x1, if x0>x1 then you have a "jump" and make the linecolor fully transparent, i.e. invisible, e.g. 0xff123456 (scheme 0xaarrggbb, check help colorspec). The same "trick" can be used when you want to interrupt a dataline which has a certain forward "jump" (see https://stackoverflow.com/a/72535613/7295599).
Minimal solution:
plot x1=NaN $Data u 1:2:(x0=x1,x1=$1,x0>x1?0xff123456:0x0000ff) w l lc rgb var
Script:
### plot "folded" data without connecting lines
reset session
# create some test data
set table $Data
plot [0:2*pi] for [i=1:4] '+' u 1:(sin(x)+rand(0)*0.5) w table
unset table
set xrange[0:2*pi]
set key noautotitle
set multiplot layout 1,2
plot $Data u 1:2 w l lc "red" ti "data as is"
plot x1=NaN $Data u 1:2:(x0=x1,x1=$1,x0>x1?0xff123456:0x0000ff) \
w l lc rgb var ti "\n\n\"Jumps\" removed\nwithout changing\ninput data"
unset multiplot
### end of script
Result:

Related

How to fill curve only up to a certain value, simultaneously with line gnuplot

I want to plot with filled curved only until when the value on the column one are less than zero. How should I do? Maybe plot two different line each time with different range?
My code is:
p "data.txt" ($1 <=0 ? $2 : 1/0) w filledcurves , '' ($1 <=0 ? $2 : 1/0) w filledcurves, \
...
But this way the value on the x-axis are wrong and the curve filled in a wrong strange way:
In this other way the x-axis are right while the curve still filled up in a werid way:
plot 'data.txt' u 1:($1<=0?$2:1/0)
Are those a bug in gnuplot?
While I would be something like this but filled up only until zero:
There is also a way to simultaneously plot with line, and select the line style?
There are several problems:
1 - You should use using or u like:
plot "data.txt" u ($1 <=0 ? $2 : 1/0)
2 - In this form you plot the 2nd column (or nothing; depending on the 1st column) aganist the 0th column (which (nearly) equals to the line number). To examine this effect, try and see:
plot "data.txt" u 2
3 - ANSWER: the correct syntax is like:
plot 'data.txt' u ($1<=0?$1:1/0):2
or
plot 'data.txt' u 1:($1<=0?$2:1/0)
which depends on what do you want to plot actually. The difference between them that the first one sets the xrange of the plot up to 0 while the second one displays a wider xrange (up to x_max of your data). However, in your case, you could use both of them. (Or you can mix them.) Eg.:
plot 'data.txt' u 1:($1<=40?$2:1/0) w filledcurves y=0, '' u 1:($1>40?$2:1/0) w l lc 1
The problem with your plot was that using a 'naked' filledcurves with a 2 column data set, the default option is closed which threats the dataset as a closed polygon.

Whether is it possible to plot normal probability distribution in gnuplot

My data file is as-
2 3 4 1 5 2 0 3 4 5 3 2 0 3 4 0 5 4 3 2 3 4 4 0 5 3 2 3 4 5 1 3 4
My requirement is to plot normal PDF in gnuplot.
I could do it by calculating f(x)
f(x) = \frac{1}{\sqrt{2\pi\sigma^2} } e^{ -\frac{(x-\mu)^2}{2\sigma^2} }
for each x using shell script.
Then I plot it in gnuplot using the command-
plot 'ifile.txt' using 1:2 with lines
But whether is it possible to plot directly in gnuplot?
gnuplot provides a number of processing options under the smooth keyword (try typing help smooth for more info). For your specific case, I would recommend a fit though.
First, note that your data points are in a row, you need to convert it to columns for gnuplot to use it. You can do it with awk:
awk '{for (i=1;i<=NF;i++) print $i}' datafile
which can be invoked from within gnuplot:
plot "< awk '{for (i=1;i<=NF;i++) print $i}' datafile" ...
Now assume that datafile has the right format for simplicity.
You can use the smooth frequency option to see how many occurrences of each value you have:
plot "datafile" u 1:(1.) smooth frequency w lp pt 7
To get the normalized distribution, you divide by the number of values. This can be done automatically within gnuplot with stats:
stats "datafile"
This will store the number of values in variable STATS_records, which in you case has value 33:
gnuplot> print STATS_records
33.0
So the normalized distribution (the probability of getting a value at x) is:
plot "datafile" u 1:(1./STATS_records) smooth frequency w lp pt 7
As you can see, your distribution doesn't really look like a normal distribution, but anyway, let's go on. Create a Gaussian for fitting and fit to your data, and plot it. You need to fit to the probability, rather than to the data itself. To do so, we plot to a table to extract the data generated by smooth frequency:
# Non-normalized Gaussian
f(x)= A * exp(-(x-x0)**2/2./sigma**2)
# Save probability data to table
set table "probability"
plot "datafile" u 1:(1./STATS_records) smooth frequency not
unset table
# Fit the Gaussian to the data, exclude points from table with grep
fit f(x) "< grep -v 'u' probability" via x0, sigma, A
# Normalize the gaussian
g(x) = 1./sqrt(2.*pi*sigma**2) * f(x) / A
# Plot
plot "datafile" u 1:(1./STATS_records) smooth frequency w lp pt 7, g(x)
set table generates some points which you should exclude, that's why I used grep to filter the file. Also, the Gaussian needs to be normalized after the fitting is done with a variable amplitude. If you want to retrieve the fitting parameters:
gnuplot> print x0, sigma
3.40584703189268 1.76237558717934
Finally note that if the spacing between data points is not homogeneous, e.g. instead of x = 0, 1, 2, 3 ... you have values at x = 0, 0.1, 0.5, 3, 3.2 ... then you'll need to use a different way to do this, for example defining bins of regular size to group data points.

Is there a way to put a label for the last entry in gnuplot?

I want to use gnuplot for real time plotting (Data gets appended to file which I use for plotting and I use replot for real time plotting). I also want to put a label for the latest entry which is plotted. So as to get a idea what is the latest value. Is there a way to do this?
If you are on a unixoid system, you can use tail to extract the last line from the file and plot it separately in whatever way you desire. To give a simple example:
plot\
"data.dat" w l,\
"< tail -n 1 data.dat" u 1:2:2 w labels notitle
This will plot the whole of data.dat with lines and the last point with labels, with the label depicting the value.
There is no need to use the Linux command tail, you can simply do it with gnuplot-only, hence platform-independently.
The principle: while plotting the data, you assign the values of column 1 and 2 to variables x0 and y0, respectively.
After the first plot command, x0 and y0 will contain the last values.
With this, you don't have to load the file a second time for extracting the last values.
For the label plotting, use these values and print the label with a sprintf() expression (check help sprintf).
The construct '+' u ... every ::0::0 is just one way of many ways to plot a single data point.
Data: SO28152083.dat
1 5.1
2 2.2
3 3.3
4 1.4
5 4.5
Script: (works with gnuplot 4.4.0, March 2010 or even with earlier versions)
### plot last value as label
reset
FILE = "SO28152083.dat"
set key noautotitle
set offsets 0.5,0.5,1,1
plot FILE u (x0=$1):(y0=$2) w lp pt 7 lc rgb "red" ti "data", \
'+' u (x0):(y0):(sprintf("%g",y0)) every ::0::0 w labels offset 0,1
### end of script
Result:

Displaying markers on specific values in Gnuplot's line plot

I have data for a CDF in a file which looks like the following:
0.033 0.0010718113612
0.034 0.0016077170418
0.038 0.0021436227224
... ...
... ...
0.847 0.999464094319
0.862 1.0
First column is the X-axis value and the second column is the CDF value on Y-axis. I set the line style as follows:
set style line 1 lc rgb 'blue' lt 1 lw 2 pt 7 ps 0.75 # --- blue
and subsequently plot the line with the following:
plot file1 using 1:2 title 'Test Line CDF' with linespoints ls 1
This all works fine, the problem seems to be that my CDF file is pretty big (about 250 rows) and Gnuplot would plot the marker/point (a circle in this case) for every data point. This results in a very "dense" line because of the over-concentration of markers such that the underlying line is almost not visible as I show in an example image below:
How can I selectively draw the markers so that instead of having them on all data points, I plot them after every 50 data points, without having to decrease the number of data points (which I believe is what "every n" in the plot command would do) in my data file or decrease the marker size?
There is no need to use two plots commands, just use the pointinterval option:
plot 'data' pointinterval 5 with linespoints
That plots every line segment, but only every fifth point symbol.
The big advantage is, that you can control the behaviour with set style line:
set style line 1 lc rgb 'blue' lt 1 lw 2 pt 7 ps 0.75 pi 5
plot 'data' w lp ls 1
You can plot the same function twice, once with lines only, and then with points every n points. This will draw less points without decreasing the amount of segments. I think this is what you want to achieve. For this example I have done set table "data" ; plot sin(x) to generate numerical sampling of the sin(x) function.
What you have at the moment is:
plot "data" with linespoints pt 7
which gives
Now you can do the following:
plot "data" with lines, "data" every 10 with points pt 7 lc 1
which gives what you want:
You can change the styling to meet your needs.
Although #Miguel beat me to it, but I'm also posting my solution below:
The idea is to once draw the line and then draw the points with the "every n" specifier. I changed my own Gnuplot script in the following manner. A kind of hack but works:
set style line 1 lc rgb 'blue' lt 1 lw 2 pt 7 ps 0 # --- blue
plot file1 using 1:2 title '' with linespoints ls 1, "" using 1:2 every 20 title 'Test Line CDF' with points ls 1 ps 0.75
This retains the nice curve, without quantizing it too coarsely while also keeping the points much better spaced.

Gnuplot read line style from data file column

I'd like to draw an impulse graph from a text file that looks like this:
II 5 0 0 288.40 1.3033e+14
II 6 0 0 289.60 1.5621e+14
II 1 4 0 302.70 3.0084e+13
II 2 4 0 303.40 4.0230e+13
II 1 5 1 304.40 3.4089e+13
The plot conceptually should be plot "datafile.dat" using 5:6 w impulses ls $2.
Basically, given a previously defined set of line styles, I'd like to input the line style number from column 2 for every couple of plotted points from column 5 and 6.
Also I'd like to create a text box, for every plotted point, taking strings from the first four columns.
Does somebody know if that's possible?
To use the data from column two as line style use set style increment user and linecolor variable:
set style increment user
plot "datafile.dat" using 5:6:2 with impulses lc var
In order to place a label, use the labels plotting style:
plot "datafile.dat" using 5:6:1 with labels offset 0,1
Putting everything together, you have:
set style increment user
set for [i=1:6] style line i lt i
set yrange [0:*]
set offsets 0,0,graph 0.1,0
plot "datafile.dat" using 5:6:2 with impulses lc var, "" using 5:6:1 with labels offset 0,1
The result with 4.6.3 is:
Thanks for the helpful answer above. It almost solved my problem
I'm actually trying to use a column from my data file to specify a linestyle (dot, squares,triangles, whatever as long as it's user-defined), and not a linecolor. Is there any way to do that?
This line works : I get points with different colors (specified in column 4), but the point style is the same.
plot "$file" u 1:2:4 w p notitle lc var, "" using 1:2:3 with labels offset 0,1 notitle
Replacing lc with ls after defining my own styles doesn't work (ls can't have variable as an option)
I can live without different linestyles, but it would be much prettier.
You only have to replace the lineset for [i=1:6] style line i lt i for set for [i=1:6] style line i lt i pt %, Where % can be any type of point you want

Resources