Smooth command not supporting variable colors? - gnuplot

I am trying to make a plot in gnuplot using the smooth csplines option. The data file can have many different sections to plot (the number is not constant), and I would like to use the lc variable option to differentiate them with different colors. Am I doing something wrong, or does smooth not support the lc variable option?

Correct, you cannot mix smooth and lc palette in a single plot command. You could write the smoothed data to an intermediate file with set table and then plot this data with lc palette.
Consider the example file test.txt:
1
3
2
5
4
6
Now plot this with:
set table 'tmp.txt'
plot 'test.txt' using 0:1 smooth cspline
unset table
And then plot the file tmp.txt with lc rgb variable or similar:
rgb(r,g,b) = 65536 * int(r) + 256 * int(g) + int(b)
plot 'test.txt' using 0:1 pt 7 t 'original', \
'tmp.txt' using 1:2:($2 < 4.2 ? rgb(255,0,0) : rgb(0,255,0)) with lines lc rgb var t 'smoothed'
Result with 4.6.4:
Note that this doesn't allow you to color by a criterion stored in an additional column of your original data (say, the third column of test.txt). That would require much more fiddling.
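As a variation on the steps above, the smoothed table can also be colored with lc palette, as mentioned at the start of this answer. The following is a minimal sketch, assuming the same test.txt and tmp.txt; the palette colors are an arbitrary choice for illustration.

```gnuplot
# Write the smoothed curve to an intermediate file
set table 'tmp.txt'
plot 'test.txt' using 0:1 smooth csplines
unset table

# Assumed palette for illustration: blue for low y values, red for high ones
set palette defined (0 'blue', 1 'red')

# The third "using" entry supplies the value that is mapped onto the palette
plot 'test.txt' using 0:1 with points pt 7 title 'original', \
     'tmp.txt' using 1:2:2 with lines lc palette title 'smoothed'
```

Here the smoothed y value itself drives the color, so the same limitation applies: the coloring can only depend on what is present in tmp.txt.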

Related

Skipping alternate data points in gnuplot

I am using the following code to get a graph:
set term jpeg size 600,600
set output "test2.jpeg"
unset key
set xtic 500
set ytic 100
set title "DD-ME2"
plot "nkBDDME2.out" us 2:1 lc -1 lw 2 with lines , "nkDDME2.out" us 2:1 lc rgb "#FF4433" pt 5 ps 0.5
In the plot, the points are very close together making the other line less visible. Is there any way to plot alternate data points to space out the points? Is there any way of doing this without directly manipulating the data file by deleting the alternate data values?
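One possible approach, sketched here under the assumption that thinning only the points series is acceptable, is the every keyword, which also appears in the marker question further below. It skips rows at plot time and leaves the data file untouched.

```gnuplot
set term jpeg size 600,600
set output "test2.jpeg"
unset key
# "every 2" draws only every second row of the second file
plot "nkBDDME2.out" using 2:1 with lines lc -1 lw 2, \
     "nkDDME2.out" using 2:1 every 2 with points lc rgb "#FF4433" pt 5 ps 0.5
```

A larger increment (every 3, every 5, ...) spaces the points out further.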

Subtract smoothed data from original

I wonder whether there is a way to subtract smoothed data from original ones when doing things of the kind:
plot ["17.12.2020 08:00:00":"18.12.2020 20:00:00"] 'data3-17-28.csv1' using 4:5 title 'Sensor 3' with lines, \
'' using 4:5 smooth acsplines
Alternatively I would need to do it externally, of course.
As @Suntory already suggested, you can plot the smoothed data into a table.
However, keep in mind that the number of smoothed datapoints is determined by set samples (the default is 100) and that the smoothed datapoints will be equidistant. So, if you set samples to the number of your datapoints and your data is equidistant as well, all should be fine.
Concatenating data line by line is not straightforward in gnuplot, since gnuplot is not intended to do such operations.
The following gnuplot-only solution assumes that you have your data in a datablock $Data without headers and empty lines. If not, you could either plot it with table from file into a table named $Data or use the following approach in the accepted answer of this question: gnuplot: load datafile 1:1 into datablock
If you don't have equidistant data, you need to interpolate data, which is also not straightforward in gnuplot, see: Resampling data with gnuplot
It's up to you: either you use external tools (which might not be platform-independent) or you apply a somewhat cumbersome platform independent gnuplot-only solution.
Code:
### plot difference of data to smoothed data
reset session
$Data <<EOD
1 0
2 13
3 16
4 17
5 11
6 8
7 0
EOD
stats $Data u 0 nooutput # get number of rows or datapoints
set samples STATS_records
set table $Smoothed
plot $Data u 1:2 smooth acsplines
unset table
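# merge both datablocks line by line
# (the +4 skips the header lines that "set table" writes at the top of $Smoothed;
#  check the table output of your gnuplot version if the rows do not line up)
set print $Difference
do for [i=1:|$Data|] {
    print sprintf('%s %s',$Data[i],$Smoothed[i+4])
}
set print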
plot $Data u 1:2 w lp pt 7, \
$Smoothed u 1:2 w lp pt 6, \
$Difference u 1:($2-$4) w lp pt 4 lc "red"
### end of code
If I understand correctly, you would like this:
First, write the smoothed data to the file out.csv:
set table "out.csv" separator comma
plot 'file' u 4:5 smooth acsplines
unset table
The plot command below then pastes out.csv onto file as appended columns. You may need to delete the first lines of out.csv beforehand using sed (sed '1,4d' out.csv).
stats 'file' matrix
Thanks to stats, we automatically get the number of columns in your original data (STATS_size_x).
plot "< paste -d' ' file out.csv" u 4:($5-column(STATS_size_x+2)) w l
Could you please try this small code on your data.

Remove duplicated outliers in gnuplot boxplot [duplicate]

I have a large set of data points. I try to plot them with a boxplot, but some of the outliers are the exact same value and they are represented on a line beside each other. I found How to set the horizontal distance between outliers in gnuplot boxplot, but it doesn't help too much, as it is apparently not possible.
Is it possible to group the outliers together, print one point and then print a number in brackets beside it to indicate how many points there are? I think this would make it more readable in a graph.
For information, I have three boxplots for one x value and that times six in one graph. I am using gnuplot 5 and already played around with the pointsize, which doesn't reduce the distance anymore.
I hope you can help!
Edit:
set terminal pdf
set output 'dat.pdf'
file0 = 'dat1.dat'
file1 = 'dat2.dat'
file2 = 'dat3.dat'
set pointsize 0.2
set notitle
set xlabel 'X'
set ylabel 'Y'
header = system('head -1 '.file0);
N = words(header)
set xtics ('' 1)
set for [i=1:N] xtics add (word(header, i) i)
set style data boxplot
plot file0 using (1-0.25):1:(0.2) with boxplot lw 2 lc rgb '#8B0000' fs pattern 16 title 'A', \
file1 using (1):1:(0.2) with boxplot lw 2 lc rgb '#00008B' fs pattern 4 title 'B', \
file2 using (1+0.25):1:(0.2) with boxplot lw 2 lc rgb '#006400' fs pattern 5 title 'C', \
for [i=2:N] file0 using (i-0.25):i:(0.2) with boxplot lw 2 lc rgb '#8B0000' fs pattern 16 notitle, \
for [i=2:N] file1 using (i):i:(0.2) with boxplot lw 2 lc rgb '#00008B' fs pattern 4 notitle, \
for [i=2:N] file2 using (i+0.25):i:(0.2) with boxplot lw 2 lc rgb '#006400' fs pattern 5 notitle
What is the best way to implement it with this code already in place?
There is no option to have this done automatically. The steps required to do this manually in gnuplot are:
(In the following I assume, that the data file data.dat has only a single column.)
Analyze your data with stats to determine the boundaries for the outliers:
stats 'data.dat' using 1
range = 1.5 # (this is the default value of the `set style boxplot range` value)
lower_limit = STATS_lo_quartile - range*(STATS_up_quartile - STATS_lo_quartile)
upper_limit = STATS_up_quartile + range*(STATS_up_quartile - STATS_lo_quartile)
Count only the outliers and write them to a temporary file
set table 'tmp.dat'
plot 'data.dat' using 1:($1 > upper_limit || $1 < lower_limit ? 1 : 0) smooth frequency
unset table
Plot the boxplot without the outliers, and the outliers with the labels plotting style:
set style boxplot nooutliers
plot 'data.dat' using (1):1 with boxplot,\
'tmp.dat' using (1):($2 > 0 ? $1 : 1/0):(sprintf('(%d)', int($2))) with labels offset 1,0 left point pt 7
And this needs to be done for every single boxplot.
Disclaimer: This procedure should work in principle, but since I had no example data I couldn't test it.
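Since the steps have to be repeated for every boxplot, they can be wrapped in a loop. The following is only a sketch of that idea: it assumes a hypothetical data.dat with N = 3 numeric columns and no header line, and the temporary file names tmp_1.dat, tmp_2.dat, ... are made up for illustration.

```gnuplot
file  = 'data.dat'   # assumed file name
N     = 3            # assumed number of data columns
range = 1.5          # default of `set style boxplot range`

# One pass per column: find the outlier limits, then count equal outlier values
do for [i=1:N] {
    stats file using i nooutput
    iqr = STATS_up_quartile - STATS_lo_quartile
    lower_limit = STATS_lo_quartile - range*iqr
    upper_limit = STATS_up_quartile + range*iqr
    set table sprintf('tmp_%d.dat', i)
    plot file using i:(column(i) > upper_limit || column(i) < lower_limit ? 1 : 0) smooth frequency
    unset table
}

# Boxes without outliers, plus one labelled point per distinct outlier value
set style boxplot nooutliers
plot for [i=1:N] file using (i):i with boxplot notitle, \
     for [i=1:N] sprintf('tmp_%d.dat', i) using (i):($2 > 0 ? $1 : 1/0):(sprintf('(%d)', int($2))) with labels offset 1,0 left point pt 7 notitle
```

If the data files start with a header row, as in the question, that row would need to be skipped before stats and plot read the columns.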

Displaying markers on specific values in Gnuplot's line plot

I have data for a CDF in a file which looks like the following:
0.033 0.0010718113612
0.034 0.0016077170418
0.038 0.0021436227224
... ...
... ...
0.847 0.999464094319
0.862 1.0
First column is the X-axis value and the second column is the CDF value on Y-axis. I set the line style as follows:
set style line 1 lc rgb 'blue' lt 1 lw 2 pt 7 ps 0.75 # --- blue
and subsequently plot the line with the following:
plot file1 using 1:2 title 'Test Line CDF' with linespoints ls 1
This all works fine, the problem seems to be that my CDF file is pretty big (about 250 rows) and Gnuplot would plot the marker/point (a circle in this case) for every data point. This results in a very "dense" line because of the over-concentration of markers such that the underlying line is almost not visible as I show in an example image below:
How can I selectively draw the markers so that instead of having them on all data points, I plot them after every 50 data points, without having to decrease the number of data points (which I believe is what "every n" in the plot command would do) in my data file or decrease the marker size?
There is no need to use two plot commands, just use the pointinterval option:
plot 'data' pointinterval 5 with linespoints
That plots every line segment, but only every fifth point symbol.
The big advantage is that you can control the behaviour with set style line:
set style line 1 lc rgb 'blue' lt 1 lw 2 pt 7 ps 0.75 pi 5
plot 'data' w lp ls 1
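Applied to the question's own style definition, this could look like the following sketch. The file name cdf.dat is an assumption; the question only shows the variable file1.

```gnuplot
# Assumed file name; the question's script stores it in the variable file1
file1 = 'cdf.dat'
# "pi 50" (pointinterval) draws a marker on every 50th data point only
set style line 1 lc rgb 'blue' lt 1 lw 2 pt 7 ps 0.75 pi 50
plot file1 using 1:2 title 'Test Line CDF' with linespoints ls 1
```

All data points still contribute to the line, so the curve keeps its shape while the markers thin out.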
You can plot the same data twice, once with lines only and once with points every n points. This draws fewer points without decreasing the number of segments, which I think is what you want to achieve. For this example I ran set table "data" ; plot sin(x) to generate a numerical sampling of the sin(x) function.
What you have at the moment is:
plot "data" with linespoints pt 7
which draws a point at every sample.
Now you can do the following:
plot "data" with lines, "data" every 10 with points pt 7 lc 1
which draws the full line but only every tenth point.
You can change the styling to meet your needs.
Although @Miguel beat me to it, I'm also posting my solution below:
The idea is to once draw the line and then draw the points with the "every n" specifier. I changed my own Gnuplot script in the following manner. A kind of hack but works:
set style line 1 lc rgb 'blue' lt 1 lw 2 pt 7 ps 0 # --- blue
plot file1 using 1:2 title '' with linespoints ls 1, "" using 1:2 every 20 title 'Test Line CDF' with points ls 1 ps 0.75
This retains the nice curve, without quantizing it too coarsely while also keeping the points much better spaced.

Gnuplot read line style from data file column

I'd like to draw an impulse graph from a text file that looks like this:
II 5 0 0 288.40 1.3033e+14
II 6 0 0 289.60 1.5621e+14
II 1 4 0 302.70 3.0084e+13
II 2 4 0 303.40 4.0230e+13
II 1 5 1 304.40 3.4089e+13
The plot conceptually should be plot "datafile.dat" using 5:6 w impulses ls $2.
Basically, given a previously defined set of line styles, I'd like to input the line style number from column 2 for every couple of plotted points from column 5 and 6.
Also I'd like to create a text box, for every plotted point, taking strings from the first four columns.
Does somebody know if that's possible?
To use the data from column two as line style use set style increment user and linecolor variable:
set style increment user
plot "datafile.dat" using 5:6:2 with impulses lc var
In order to place a label, use the labels plotting style:
plot "datafile.dat" using 5:6:1 with labels offset 0,1
Putting everything together, you have:
set style increment user
set for [i=1:6] style line i lt i
set yrange [0:*]
set offsets 0,0,graph 0.1,0
plot "datafile.dat" using 5:6:2 with impulses lc var, "" using 5:6:1 with labels offset 0,1
The result with 4.6.3 is:
Thanks for the helpful answer above. It almost solved my problem.
I'm actually trying to use a column from my data file to specify a linestyle (dots, squares, triangles, whatever, as long as it's user-defined), and not a linecolor. Is there any way to do that?
This line works : I get points with different colors (specified in column 4), but the point style is the same.
plot "$file" u 1:2:4 w p notitle lc var, "" using 1:2:3 with labels offset 0,1 notitle
Replacing lc with ls after defining my own styles doesn't work (ls can't have variable as an option)
I can live without different linestyles, but it would be much prettier.
You only have to replace the line set for [i=1:6] style line i lt i with set for [i=1:6] style line i lt i pt %, where % can be any point type you want.
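A line style carries its own point type, so another way to get a per-row point style is to issue one sub-plot per style and filter the rows. The sketch below is an alternative, not the method from the answer above. It assumes a hypothetical data.dat in which column 4 holds the style number (1 to 6) and column 3 holds the label text; the choice pt i is only an example.

```gnuplot
# Hypothetical styles: style i gets color i and point type i
set for [i=1:6] style line i lc i pt i ps 1.5

# One sub-plot per style; rows whose column 4 is not i become undefined (1/0)
# and are skipped, so each row is drawn once with its own style
plot for [i=1:6] 'data.dat' using 1:($4 == i ? $2 : 1/0) with points ls i notitle, \
     'data.dat' using 1:2:3 with labels offset 0,1 notitle
```

Because every style is a separate sub-plot, the file is read once per style, which is fine for small files but slower for large ones.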
