Stepwise function in gnuplot

I want to know how to plot a stepwise function in gnuplot. The function gives the operating cost for several distance ranges and multiple products. For instance, if the distance is 0-300 km, the cost is 1.05 $/km for product 1 and 0.86 $/km for product 2. As the distance increases, the cost for each product decreases.
I have defined one function for each product and plotted the two functions together:
gnuplot> f(x)=x<=300 ? 1.05 : x<=650 ? 0.65 : x<=1300 ? 0.46 : x<=1950 ? 0.4 : x<=3250 ? 0.31 : 0.22
gnuplot> g(x)=x<=300 ? 0.86 : x<=650 ? 0.53 : x<=1300 ? 0.38 : x<=1950 ? 0.32 : x<=3250 ? 0.24 : 0.19
gnuplot> plot [0:5000][0:3] f(x), g(x)
There is one problem: I can not remove the vertical lines. Any idea?
Thanks for your help

There are basically two approaches you can take. The best approach is to use a datafile, but you can use functions, although it will be more difficult.
Datafile Approach
You are probably going to have trouble doing this as a function, because you are going to get those vertical lines. A datafile gives you a little better control, and even allows you to mark the end points of the pieces of the piecewise function with the typical open/closed dots. Set up your data file with this format:
x y # left point of piece 1
x y # right point of piece 1
# one single blank line
x y # left point of piece 2
x y # right point of piece 2
# one single blank line
...
With your function f, we can do this like
0 1.05
300 1.05

300 0.65
650 0.65

650 0.46
1300 0.46

1300 0.4
1950 0.4

1950 0.31
3250 0.31

3250 0.22
6000 0.22
then plot datafile with lines gives
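For anyone who wants to try this without creating a separate file, here is a minimal sketch that puts the same points into an inline datablock. It assumes gnuplot 5.0 or newer, and the name $F is my own choice. The single blank line between the two-point blocks is what stops with lines from joining one piece to the next.

```gnuplot
# Inline copy of the data for f; $F is an illustrative name.
# Each block holds the left and right end point of one piece,
# and a single blank line separates the pieces.
$F << EOD
0    1.05
300  1.05

300  0.65
650  0.65

650  0.46
1300 0.46

1300 0.4
1950 0.4

1950 0.31
3250 0.31

3250 0.22
6000 0.22
EOD

set xrange [0:6000]
set yrange [0:2]
# Lines are drawn only within a block, so no vertical risers appear.
plot $F using 1:2 with lines notitle
```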
We can get even fancier with†
plot datafile w lines,\
last=0,\
"" u 1:(oldlast=last,last=$1,$1==oldlast?$2:1/0) w points pt 6 lt 1,\
last=0,\
"" u 1:(oldlast=last,last=$1,$1==oldlast?1/0:$2) w points pt 7 lt 1
to produce
Here we first plot the same curve as before. Then we initialize the variable last to be 0 (the value of the first x coordinate)‡, and plot the open dots.
To do this we evaluate (oldlast=last,last=$1,$1==oldlast?$2:1/0), which first stores the value of last as oldlast and then stores the value of the first column (the x coordinate) as last to use on the next point. Finally we check whether the x-coordinate is the same as the value of oldlast (the x-coordinate of the previous point). If it is, we use the 2nd column value; otherwise we use the unplottable 1/0. This causes points to be plotted only if they are the first point in one of the two-point blocks. We plot these with points using pointtype 6 (an open point) and linetype 1 (the same as used for the lines).
We do the same thing again, but this time plot the second points with filled dots (pointtype 7).
We can either add the points for the function g to the same file, separating it from the others by two blank lines and then use indexes to refer to them, or create a separate datafile for g. We can then add similar plot commands to the current command. For example, if we use the same file with function f followed by function g, we can do:
plot datafile i 0 w lines,\
last=0,\
"" i 0 u 1:(oldlast=last,last=$1,$1==oldlast?$2:1/0) w points pt 6 lt 1,\
last=0,\
"" i 0 u 1:(oldlast=last,last=$1,$1==oldlast?1/0:$2) w points pt 7 lt 1,\
datafile i 1 w lines,\
last=0,\
"" i 1 u 1:(oldlast=last,last=$1,$1==oldlast?$2:1/0) w points pt 6 lt 1,\
last=0,\
"" i 1 u 1:(oldlast=last,last=$1,$1==oldlast?1/0:$2) w points pt 7 lt 1
Function Approach
As far as getting only one jump, your functions have a lot of redundant conditions. Redefine f (and similarly for g) as
f(x)=x<=300 ? 1.05 : x<=650 ? 0.65 : x<=1300 ? 0.46 : x<=1950 ? 0.4 : x<=3250 ? 0.31 : 0.22
and then plot it. Make sure that the samples are set high enough, otherwise you may end up collecting multiple jumps together or get undesirable slanted lines. With
set xrange[0:6000]
set yrange[0:2]
set samples 1000
plot f(x)
we get
However, this will still draw the vertical connecting lines. This is going to be very hard to avoid with a function. The best way that I can think of to avoid it is to inject a very small region of non-plottable values just before the breaks. For f(x), we can do this with
f(x)=x<=290 ? 1.05 : x<=300? (1/0) : x<=640? 0.65 : x<=650 ? (1/0) : x<=1290 ? 0.46 : x<=1300 ? (1/0) : x<=1940? 0.4 : x<=1950 ? (1/0) : x<=3240 ? 0.31: x<=3250? (1/0) : 0.22
Here, we have injected a non-plottable value of 1/0 for a region of length 10 just before each break. Smaller lengths can be used as well. If we set the samples high enough to be sure that the sampling hits each of these regions (in this case 1000 samples, as before, is enough), the points on either side will not be connected.
With samples set too small (for example 100), we might still get the connecting lines
Thus if we use a gap with a size smaller than 10, we may need to use higher sampling to avoid the connecting lines. Larger gaps may work with smaller sampling.
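As a rough sketch of the relationship between gap size and sampling: gnuplot spreads the samples evenly over the x range, so a sample is guaranteed to land inside every gap once the spacing between samples is smaller than the gap. The variable names below (xmin, xmax, gap, nmin) are my own and not anything built into gnuplot.

```gnuplot
# Hypothetical helper values; adjust to your own plot range and gap width.
xmin = 0.0
xmax = 6000.0
gap  = 10.0

# Neighbouring samples are (xmax - xmin)/(samples - 1) apart. The spacing
# must be smaller than the gap for a sample to be guaranteed to fall in it.
nmin = floor((xmax - xmin)/gap) + 2
print "samples needed: ", nmin
print "spacing at 1000 samples: ", (xmax - xmin)/999
print "spacing at 100 samples:  ", (xmax - xmin)/99

set xrange [xmin:xmax]
set samples nmin
```

With these numbers the spacing at 1000 samples is about 6, which is below the gap of 10, while at 100 samples it is about 60, which is why the connecting lines can reappear there.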
If the sampling is too low, the gaps may also come out larger than specified. For example, setting the gaps to a size of 100 with
f(x)=x<=200 ? 1.05 : x<=300? (1/0) : x<=550? 0.65 : x<=650 ? (1/0) : x<=1200 ? 0.46 : x<=1300 ? (1/0) : x<=1850? 0.4 : x<=1950 ? (1/0) : x<=3150 ? 0.31: x<=3250? (1/0) : 0.22
and a sampling of 10, we get
where the gaps have a size of 222.22 (I have added labels to make it easy to compute the gap sizeΔ), but with a sampling of 1000, we get
where the gaps have size 101.1, very close to the value of 100 specified in the function.
To use functions to do this, therefore, use this model and set the gap size to a value small enough that it will appear non-existent on the final graph (notice that on the graph from 0 to 6000, we can barely see the gap size of 10), and then set the samples reasonably high.
With the function approach, I don't know of any way to add the filled and open dots if those are desired.
† Gnuplot version 5.1 (the current development version) supports a pointtype variable option which can simplify this to
plot last=0,\
datafile u 1:2:(oldlast=last,last=$1,$1==oldlast?6:7) w linespoints pt var lt 1
Here we just plot all points, but use the same test as before to select between pointtype 7 or 6. As we can do both point types at once, we can just use the linespoints style instead of doing two separate plots.
‡ Initializing last to a value less than the first x-coordinate will cause that first point to be filled.
Δ To draw these labels, in the first case (with set xrange[0:1000] and set samples 10), I used
plot f(x),\
"+" u 1:(f($1)+0.1):(abs($1-250)<150||abs($1-600)<160?sprintf("%0.2f",$1):"") w labels
and in the case of set samples 1000
plot f(x),\
"+" u 1:(f($1)+0.1):(abs($1-250)<51||abs($1-600)<51?sprintf("%0.2f",$1):"") w labels
It takes a little playing around with the bounds on the abs functions here to get the desired labels to appear. Examining the output using set table can be helpful for getting them right.

Your ternary statements are trying to do too much. When you are writing
f0(x)=(x_low < x <= x_high) ? y : 0
you should be writing
f0(x)=((x_low < x) && ( x <= x_high)) ? y : 0
So your function f(x) should look like
f(x)=(x<=300) ? 1.05 : (x<=650) ? 0.65 : (x<=1300) ? 0.46 : (x<=1950) ? 0.4 : (x<=3250) ? 0.31 : (x>3250) ? 0.22 : 0
As for the plotting style, if you want it to be discontinuous, use a separate function for each step and plot them individually. Your first function would be split up like:
f1(x)= (x<=300) ? 1.05 : 1/0
f2(x)=(x>300) && (x<=650) ? 0.65 : 1/0
...
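Written out in full for f, the split might look like the sketch below. The names f3 to f6, the shared linetype, and the single key entry are my own choices for illustration. Because each piece is sampled separately, a piece can stop up to one sample spacing short of its break, so keep the sample count reasonably high.

```gnuplot
# One function per step of f; outside its interval each function is
# undefined (1/0), so gnuplot draws nothing there.
f1(x) = (x <= 300)              ? 1.05 : 1/0
f2(x) = (x > 300  && x <= 650)  ? 0.65 : 1/0
f3(x) = (x > 650  && x <= 1300) ? 0.46 : 1/0
f4(x) = (x > 1300 && x <= 1950) ? 0.4  : 1/0
f5(x) = (x > 1950 && x <= 3250) ? 0.31 : 1/0
f6(x) = (x > 3250)              ? 0.22 : 1/0

set samples 1000
# Same linetype and a single key entry so the pieces read as one curve.
plot [0:6000][0:2] f1(x) lt 1 title "f(x)", \
     f2(x) lt 1 notitle, f3(x) lt 1 notitle, \
     f4(x) lt 1 notitle, f5(x) lt 1 notitle, f6(x) lt 1 notitle
```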
If you just want steps without interpolation, use steps
plot [0:6000][0:3] f(x) w steps, g(x) w steps

Related

Understanding the use of the keyword every

My question regards the keyword every that is used to sample an input data file (i.e., .csv, .dat etc.). I am reading the documentation of the keyword that says the following:
plot 'file' every {<point_incr>}
{:{<block_incr>}
{:{<start_point>}
{:{<start_block>}
{:{<end_point>}
{:<end_block>}}}}}
The thing is I cannot completely comprehend how to adapt this to a data set. For instance, if I have some dummy data that I wish to use to create a bar chart for example and the data are the following
# first bars group
#x axis #y axis
0 2
0.2 3
0.4 4
0.6 5
0.8 6

#second bars group
1 1
1.2 2
1.4 3
1.6 4
1.8 5

#etc.
3 10
3.2 20
3.4 30
3.6 40
3.8 50

4 20
4.2 30
4.4 40
4.6 50
4.8 60
And let's say that I want to create four bar clusters from the data, one for every block. How can I use the syntax of the keyword? Could someone give me some examples to better understand how to use it? Thank you in advance.
As you've found, the every keyword allows you to cherry-pick a subset of single-newline-separated points and double-newline-separated blocks from your datafile. Your example datafile shows 20 points divided into 4 blocks.
So to plot the first block (indexed 0 in gnuplot), you only need to specify the end block, and use the default values for the other every parameters. Try:
plot 'data.txt' every :::::0 with boxes
It seems your goal is to plot each block with separate styling. Here's how you could do that with a few extra styling commands. (Note my use of gnuplot's shorthand for some keywords.)
set key left top
set boxwidth 0.2
p 'data.txt' ev :::0::0 w boxes t 'first',\
'data.txt' ev :::1::1 w boxes t 'second',\
'data.txt' ev :::2::2 w boxes t 'third',\
'data.txt' ev :::3::3 w boxes t 'fourth'
From help every:
The data points to be plotted are selected according to a loop from
<start_point> to <end_point> with increment <point_incr> and the
blocks according to a loop from <start_block> to <end_block> with
increment <block_incr>.
This should be pretty clear. However, you have to know that if blocks are separated by two (or more) empty lines, you have to address them differently; check help index. In my opinion, the documentation is a bit confusing about datablock, (sub-)block, dataset, etc.
Check the following example. I assume this is not your final graph, but still needs some tuning. Depending on your detailed requirements you also might want to check help histograms.
For example, every :::i::i will plot all datapoints in block i, i.e. from block i to block i.
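To get a feel for the individual fields, here is a small sketch on a stand-in dataset. The datablock $D is an assumption for illustration (two blocks of three points), not the data from the question, and it needs gnuplot 5.0 or newer.

```gnuplot
# Small stand-in dataset (an assumption, not the data from the question):
# two blocks of three points, separated by one blank line.
$D << EOD
0 1
1 2
2 3

10 4
11 5
12 6
EOD

set style data linespoints
plot $D every 2       title "every 2: points 0 and 2 of each block", \
     $D every ::1::2  title "every ::1::2: points 1 to 2 of each block", \
     $D every :::1::1 title "every :::1::1: all points of block 1"
```

Point and block numbers both start at 0, and the point number restarts in every block.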
Code:
### plotting using "every"
reset session
$Data <<EOD
# first bars group
#x axis #y axis
0 2
0.2 3
0.4 4
0.6 5
0.8 6

#second bars group
1 1
1.2 2
1.4 3
1.6 4
1.8 5

#etc.
3 10
3.2 20
3.4 30
3.6 40
3.8 50

4 20
4.2 30
4.4 40
4.6 50
4.8 60
EOD
set key top left
set boxwidth 0.2
set key out noautotitles
set style fill solid 0.3
set yrange [:70]
plot for [i=0:3] $Data u 1:2 every :::i::i w boxes
### end of code
Result:

How to interpolate data with Gnuplot for further calculations

I am (somewhat) familiar with the smooth/interpolation techniques in Gnuplot. It seems to me that these interpolations work only for plotting the interpolated values. However, I need the interpolated values for further calculations.
A simple example may illustrate this:
Let’s say we are selling a specific item on four days and have the number of sales stored in input_numbers.dat:
# days | number_of_sold_items
1 4
2 70
3 80
4 1
Now, I want to plot my income for each day. But the relation between the price per item and the number of sold items is not a simple linear relation, but something complicated which is only known for a few examples, stored in input_price.dat:
# number_of_sold_items | price_per_item
1 5.00
3 4.10
10 3.80
100 3.00
How can I do something like this (pseudocode):
make INTERPOLATED_PRICE(x) using "input_price.dat"
plot "input_numbers.dat" using 1:($2*INTERPOLATED_PRICE($2))
I can do it by fitting but it is not what I want. The relation of the data is too complicated.
P.S.: I know that the price per item vs the number of items in such an example is more like a step-like function and not smooth. This is just an example for some interpolation in general.
It’s hard to prove the non-existence of something but I am pretty confident that this cannot be done with Gnuplot alone, as:
I am under the illusion to be sufficiently familiar with Gnuplot that I would know about it if it existed.
I cannot find anything about such a feature.
It would completely go against Gnuplot’s paradigm to be a one-purpose tool for plotting (fitting is already borderline) and not to feature data processing.
Gnuplot can do something like this:
text = "%f*x + %f"
a = 2
b = 10
eval("f(x) = ".sprintf(text,a,b))
set grid x y
plot f(x)
which basically means that complicated functions can be defined dynamically: The sprintf command converts the text "%f*x + %f" into "2.000000*x + 10.000000", the dot operator . concatenates the strings "f(x) = " and "2.000000*x + 10.000000", and the eval command defines the function f(x) = 2.000000*x + 10.000000. The result can be plotted and gives the expected diagram:
This behavior can be used for creating a piecewise interpolation function as follows:
ip_file = "input_price.dat"
stats ip_file nooutput
n = STATS_records - 1
xmin = STATS_min_x
xmax = STATS_max_x
ip_f = sprintf("x < %f ? NaN : ", xmin)
f(x) = a*x + b # Make a linear interpolation from point to point.
do for [i=0:n-1] {
set xrange [xmin:xmax]
stats ip_file every ::i::(i+1) nooutput
xmintemp = STATS_min_x
xmaxtemp = STATS_max_x
set xrange [xmintemp:xmaxtemp]
a = 1
b = 1
fit f(x) ip_file every ::i::(i+1) via a, b
ip_f = ip_f.sprintf("x < %f ? %f * x + %f : ", xmaxtemp, a, b)
}
ip_f = ip_f."NaN"
print ip_f # The analytical form of the interpolation function.
eval("ip(x) = ".ip_f)
set samples 1000
#set xrange [xmin:xmax]
#plot ip(x) # Plot the interpolation function.
unset xrange
plot "input_numbers.dat" using 1:($2*ip($2)) w lp
The every in combination with stats and fit limits the range to two successive datapoints, see help stats and help every. The ternary operator ?: defines the interpolation function section by section, see help ternary.
This is the resulting analytical form of the interpolation function (after some formatting):
x < 1.000000 ? NaN
: x < 3.000000 ? -0.450000 * x + 5.450000
: x < 10.000000 ? -0.042857 * x + 4.228571
: x < 100.000000 ? -0.008889 * x + 3.888889
: NaN
This is the resulting interpolation function (plotted by plot ip(x)):
This is the resulting plot using the interpolation function in another calculation (plot "input_numbers.dat" using 1:($2*ip($2))):
I don't know the limits on how many ternary operators you can nest and on how long a string or a function definition can be, ...
Tested with Gnuplot 5.0 on Debian Jessie.
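A possible variation that avoids fit and eval: collect the table into strings while stats reads it, then build the interpolation from a sum over the segments. This is only a sketch under a few assumptions: gnuplot 5.0 or newer for the inline datablock, x values sorted in ascending order, and names (X, Y, xk, yk, lin, ip) that are purely illustrative.

```gnuplot
# Price table from the question as an inline datablock (gnuplot 5.0+).
$PRICE << EOD
1   5.00
3   4.10
10  3.80
100 3.00
EOD

# Collect both columns into space-separated strings while stats reads the data.
# X, Y, xk, yk, lin and ip are illustrative names, not gnuplot built-ins.
X = ''
Y = ''
stats $PRICE using (X = X.sprintf(" %g", $1), Y = Y.sprintf(" %g", $2), $1) nooutput
n = words(X)
xk(i) = real(word(X, i))
yk(i) = real(word(Y, i))

# Straight line through knots i and i+1, evaluated at v.
lin(v, i) = yk(i) + (yk(i+1) - yk(i))*(v - xk(i))/(xk(i+1) - xk(i))

# Exactly one term of the sum is non-zero for xk(1) <= v < xk(n);
# outside the table the result is NaN.
ip(v) = (v < xk(1) || v > xk(n)) ? NaN : \
        (v == xk(n)) ? yk(n) : \
        sum [i=1:n-1] ((v >= xk(i) && v < xk(i+1)) ? lin(v, i) : 0)

print ip(4), ip(70)
```

The value of ip(4) should come out near 4.057, in line with -0.042857 * x + 4.228571 from the fitted form above. The function can then be used in the same way, e.g. plot "input_numbers.dat" using 1:($2*ip($2)) w lp.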
Linear interpolation is not available, but how about this:
set xr [0:10]
set sample 21
# define an inline example dataset
$dat << EOD
0 1
2 2
4 4
6 5
8 4
10 3
EOD
# plot interpolated data to another inline dataset
set table $interp
plot $dat us 1:2 with table smooth cspline
unset table
plot $dat w lp, $interp w lp
As I understand your question, you are not looking for interpolation but for a lookup-table, i.e. depending on the number of sold items you have a different price.
What you can do with gnuplot is:
(mis)using stats to create a lookup-string (check help stats)
(mis)using sum to create a lookup-function (check help sum)
Comment: I assume it makes a difference whether you sell, for example, 1 item three times or 3 items once on a single day, because of the graduated prices.
So I would suggest a different input data format, i.e. one with a date.
(This is not yet implemented in the example below, but it can be done; you can then make use of the smooth frequency option.) The data format could look like this:
# date sold_items
2022-09-01 1
2022-09-01 1
2022-09-01 1
2022-09-02 3
Script: (works with gnuplot 5.0.0, Jan. 2015)
### implement lookup table
reset session
$SALES <<EOD
# days | number_of_sold_items
1 4
2 70
3 80
4 1
EOD
$PRICE <<EOD
# number_of_sold_items | price_per_item
1 5.00
3 4.10
10 3.80
100 3.00
EOD
LookupStr = ''
stats $PRICE u (LookupStr=LookupStr.sprintf(" %g %g",$1,$2)) nooutput
Lookup(v) = (p0=NaN, sum [i=1:words(LookupStr)/2] (v>=real(word(LookupStr,i*2-1)) ? \
p0=real(word(LookupStr,i*2)) : 0), p0)
set grid x,y
set key noautotitle
set multiplot
plot $SALES u 1:2 w lp pt 6 lc "dark-grey" ti "sold items", \
'' u 1:($2*Lookup($2)) w lp pt 7 lc "red" ti "total income"
# price table as graph inset
set origin x0=0.41, y0=0.42
set size sx=0.30, sy=0.28
set obj 1 rect from screen x0,y0 to screen x0+sx,y0+sy fs solid noborder lc "white" behind
set margins 0,0,0,0
set xrange [:150]
set yrange [2.5:5.5]
set xlabel "pieces" offset 0,0.5
set ylabel "price / piece"
set logscale x
plot $PRICE u 1:2 w steps lc "blue", \
'' u 1:2 w p pt 7 lc "blue"
unset multiplot
### end of script
Result:

nested do and if statements miss plots

I am using
G N U P L O T
Version 4.6 patchlevel 4 last modified 2013-10-02
Build System: Linux x86_64
which I want to use to plot data from a file that is roughly set up like this
0.0 a1 b1
0.0 a2 b2
...
0.1 a1 b1*
0.1 a2 b2*
...
For each unique value in the first column I want to plot b over a. To do this I have created a do loop which contains conditional plotting:
do for [t=0:34] {
print 0.2000*t
plot 'twopi5101/profile.dat' u ($1==0.2000*t ? ($3-7.5) : 1/0):8 notitle w l lt 1 lc 1, \
'twopi5101/profile.dat' u ($1==0.2000*t ? ($3-7.5) : 1/0):9 notitle w l lt 1 lc 2
}
Unfortunately this loop (and similar loops for other files) will consistently miss some plots:
0.0
0.2
0.4
0.6
warning: Skipping data file with no valid points
warning: Skipping data file with no valid points
more> ;print 0.2000*t;plot 'twopi5101/profile.dat' u ($1==0.2000*t ? ($3-7.5) : 1/0):8 notitle w l lt 1 lc 1, 'twopi5101/profile.dat' u ($1==0.2000*t ? ($3-7.5) : 1/0):9 notitle w l lt 1 lc 2;
^
x range is invalid
however, if I manually input the 0.6 there is no problem at all
gnuplot> plot 'twopi5101/profile.dat' u ($1==0.6 ? ($3-7.5) : 1/0):8 notitle w l lt 1 lc 1, \
'twopi5101/profile.dat' u ($1==0.6 ? ($3-7.5) : 1/0):9 notitle w l lt 1 lc 2
gnuplot>
There seems to be no logical explanation for why this should happen, or even a pattern for points missed.
Of the interval [0.0:6.0], gnuplot skipped:
0.6,1.2,1.4,2.4,2.8,3.4,3.8,4.6,4.8,5.6,5.8
and it does so consistently every time I run the loop, even if I run it over just part of that interval (e.g. running from 0.6 to 2.0 will again skip 0.6, 1.2 and 1.4).
I've run into the same behavior in a number of other cases for larger intervals/more plots. I have no idea what would even cause something like this or if there is some error in my formatting of the loop to explain it.
(terminals I use are either 'wxt' or postscript enhanced)
That's because testing floating point values on equality is generally not a good idea. Let's consider for example:
gnuplot> print 0.6==0.6
1
gnuplot> print 0.6==3*0.2
0
This is a consequence of the fact that numbers like 0.2 are not represented exactly.
I would suggest first converting the first column in your data to an integer value by, e.g.,
floor(($1 + 0.05)*10)
Here it is assumed that the column in question contains only multiples of 0.1. The offset 0.05 is there to ensure that slightly inaccurate input such as 0.1000001 or 0.0999999 still gets converted to 1.
This converted value can be then used in the filtering within the plot command, e.g.,
plot 'twopi5101/profile.dat' u (floor(($1+0.05)*10)==2*t?($3-7.5):1/0):8
Alternatively, one could replace the condition $1==0.2000*t with something like abs($1 - 0.2000*t)<5E-2
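The tolerance test can be wrapped in a small helper function. This is a minimal sketch; eps and near() are names I made up, and the tolerance should be chosen well below the spacing of the values in the first column.

```gnuplot
# eps and near() are illustrative names; pick eps well below the spacing
# of the values in column 1 (0.1 in the question).
eps = 1e-6
near(a, b) = abs(a - b) < eps

print 0.6 == 3*0.2        # exact test, prints 0
print near(0.6, 3*0.2)    # tolerance test, prints 1
```

Inside the loop, the condition $1==0.2000*t would then become near($1, 0.2000*t).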

gnuplot: connecting last to first point in polar plot

I have a data file like
0 8.4
60 7.5
120 8.9
180 9.2
240 8.3
300 6.9
My gnuplot script looks this way:
unset xtics
unset ytics
set polar
set angle degrees
set rrange [0:10]
set rtics 2
set grid polar
set size square
p 'data.txt' u 1:2 w lp
My problem is that I want the first and last data point to be connected by the line. I get the expected result if I repeat the first point in my data file again at the end of the file like:
0 8.4
60 7.5
120 8.9
180 9.2
240 8.3
300 6.9
0 8.4
Is this the only way to get the expected result? I'm asking because my real file has a lot of data sets which I reference with the gnuplot index keyword, like p 'data.txt' index 1276 u 1:2 w lp, and duplicating the first data point at the end of each block is quite annoying.
A solution is to connect the first and last point by doing a second plot with only those two points.
Using the every syntax (check help plot every), you can plot only the first point (number 0) and the last point (number N) with:
every N::0::N
In the example you gave, line 9 of the script should be modified as follows:
p 'data.txt' u 1:2 w lp ls 1, 'data.txt' u 1:2 ev 5::0::5 w lp ls 1 noti
A first flaw in this solution is that you have to specify the style of the second line to ensure it looks like the first one and does not appear in the key (hence the ls 1 and noti).
The second flaw is that you need to know the number of points in your block. It can be obtained in gnuplot using the stats command. For your example, I would use it on column 0 (the point numbers) as follows:
stats 'data.txt' u 0
N = STATS_max
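Put together, the whole thing could look like the sketch below. It uses an inline datablock so that it is self-contained, which assumes gnuplot 5.0 or newer; with a real file you would use 'data.txt' instead of $data and add the same index to both plot elements and to stats. As a precaution, stats is called before polar mode is switched on.

```gnuplot
# Data from the question as an inline datablock (gnuplot 5.0+).
$data << EOD
0   8.4
60  7.5
120 8.9
180 9.2
240 8.3
300 6.9
EOD

# Column 0 is the point number, so its maximum is the index of the last point.
# stats is run before polar mode is switched on.
stats $data using 0 nooutput
N = int(STATS_max)

unset xtics
unset ytics
set polar
set angle degrees
set rrange [0:10]
set rtics 2
set grid polar
set size square

# Second plot element: only points 0 and N, joined by a line.
plot $data using 1:2 with linespoints ls 1 notitle, \
     $data using 1:2 every N::0::N with linespoints ls 1 notitle
```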

How can I use different point types in the same line in a 2D gnuplot graph?

I need to plot one line in which, for instance, some points may be red circles and some points may be blue circles. Another case is to have, in the same line, some points represented as filled circles and some points as empty circles. I'd like to know if there's any way to explicitly define which point type should be used for each point or group (interval) of points on the same line.
Please consider a simple dataset such as
1 1.59
2 0.39
3 0.88
4 1.23
5 1.00
In this case I need to use filled circles for points (3,0.88) and (4,1.23) and use empty circles for the remaining ones.
Here comes an example of what I'd like to do: http://i.stack.imgur.com/VMwfV.jpg
This is very easy to do with a conditional plot. You need to plot the same file twice: once keeping only the points with x between 3 and 4, and once keeping only the rest:
plot "data" using 1:($1 >= 3 && $1 <= 4 ? $2 : 1/0) pt 1, \
"data" using 1:($1 >= 3 && $1 <= 4 ? 1/0 : $2) pt 2
The first plot plots column 2 if the value in column 1 is between 3 and 4 (inclusive) and second plot does the opposite, each plot uses a different point type, as requested:
The number following pt changes the point style.
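If a connecting line through all points is also wanted, newer gnuplot versions accept a per-point point type via pt variable, as far as I know from version 5.2 on. A minimal sketch under that assumption, with ptsel() as a made-up helper name:

```gnuplot
# Data from the question as an inline datablock; ptsel() is an illustrative name.
$data << EOD
1 1.59
2 0.39
3 0.88
4 1.23
5 1.00
EOD

# Filled circle (7) for 3 <= x <= 4, open circle (6) elsewhere.
ptsel(x) = (x >= 3 && x <= 4) ? 7 : 6

# The third using column supplies the point type for each point.
plot $data using 1:2:(ptsel($1)) with linespoints pt variable lt 1 notitle
```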
