Gnuplot using stats in for loop - statistics

I want to plot the maximum and average values of several files in one plot.
I have several NTP statistics files, so I tried:
input = "./peerstats/s_peerstats.201407"
set terminal svg size 600 400
set xlabel "Day in July (s)"
set ylabel "Jitter (ms)"
set yrange[0:0.65]
set output "ntpq_month_07.svg"
do for [k=10:31]{
stats input.k."_pps" using ($8*1000.0) nooutput name "PPS",\
stats input.k."_rz1" using ($8*1000.0) nooutput name "RZ1",\
stats input.k."_rz2" using ($8*1000.0) nooutput name "RZ2",\
set "ntpq_month_07.svg"
print ($k):PPS_max
print ($k):RZ1_max
print ($k):RZ2_max
print ($k):PPS_mean
print ($k):RZ1_mean
print ($k):RZ2_mean
}
This is the error by gnuplot:
;
stats input.k."_pps" using ($8*1000.0) nooutput name "PPS", stats input.k."_rz1" using ($8*1000.0) nooutput name "RZ1", stats input.k."_rz2" using ($8*1000.0) nooutput name "RZ2", set "ntpq_month_07.svg" ;
print ($k):PPS_max;
print ($k):RZ1_max;
print ($k):RZ2_max;
print ($k):PPS_mean;
print ($k):RZ1_mean ;
print ($k):RZ2_mean;
^
line 20: Expecting [no]output or prefix
Where is the syntax wrong?
Thanks a lot :)!

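First, each stats command must be on its own line; unlike plot, stats does not take a comma-separated list of files. With that fixed, the stats calls work. What remains broken is your print syntax: print expects a comma-separated list of expressions, and ($k):PPS_max is not one.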
Consider the file test.dat
1
2
and the call
do for [i=10:11] {
stats 'test.dat' using 1 nooutput name "PPS"
stats 'test.dat' using ($1*i) nooutput name "RZ1"
print sprintf("max(PPS_%d): %f", i, PPS_max)
print sprintf("max(RZ1_%d): %f", i, RZ1_max)
}
gives the output
max(PPS_10): 2.000000
max(RZ1_10): 20.000000
max(PPS_11): 2.000000
max(RZ1_11): 22.000000
So your script should look as follows (the stray set "ntpq_month_07.svg" line inside the loop is removed, since it is not a valid set command and the output file is already set before the loop; the terminal size is written in the documented x,y form):
input = "./peerstats/s_peerstats.201407"
set terminal svg size 600,400
set xlabel "Day in July (s)"
set ylabel "Jitter (ms)"
set yrange[0:0.65]
set output "ntpq_month_07.svg"
do for [k=10:31]{
stats input.k."_pps" using ($8*1000.0) nooutput name "PPS"
stats input.k."_rz1" using ($8*1000.0) nooutput name "RZ1"
stats input.k."_rz2" using ($8*1000.0) nooutput name "RZ2"
print sprintf('%d: %f', k, PPS_max)
print sprintf('%d: %f', k, RZ1_max)
print sprintf('%d: %f', k, RZ2_max)
print sprintf('%d: %f', k, PPS_mean)
print sprintf('%d: %f', k, RZ1_mean)
print sprintf('%d: %f', k, RZ2_mean)
}
Of course, your SVG file won't contain any output, because you never plot anything.
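To actually get a plot, one option is to write the per-day statistics to an intermediate file and plot that file after the loop. The following is only a sketch under assumptions: the file name jitter_summary.dat is made up for illustration, and the column layout (day, then max and mean per source) is my own choice. The yrange is set after the loop, because stats on a single column uses the current yrange to filter the data.

```gnuplot
# Sketch only: "jitter_summary.dat" is a hypothetical file name.
input = "./peerstats/s_peerstats.201407"

# Redirect print output to a file: one row per day, max and mean per source
set print "jitter_summary.dat"
do for [k=10:31] {
    stats input.k."_pps" using ($8*1000.0) nooutput name "PPS"
    stats input.k."_rz1" using ($8*1000.0) nooutput name "RZ1"
    stats input.k."_rz2" using ($8*1000.0) nooutput name "RZ2"
    print sprintf("%d %f %f %f %f %f %f", k, \
        PPS_max, PPS_mean, RZ1_max, RZ1_mean, RZ2_max, RZ2_mean)
}
set print    # close the file; print goes back to the console

set terminal svg size 600,400
set output "ntpq_month_07.svg"
set xlabel "Day in July"
set ylabel "Jitter (ms)"
set yrange [0:0.65]    # set after stats so it does not filter the statistics
plot "jitter_summary.dat" using 1:2 with linespoints title "PPS max", \
     "" using 1:3 with linespoints title "PPS mean", \
     "" using 1:4 with linespoints title "RZ1 max", \
     "" using 1:5 with linespoints title "RZ1 mean", \
     "" using 1:6 with linespoints title "RZ2 max", \
     "" using 1:7 with linespoints title "RZ2 mean"
set output   # flush and close the SVG file
```

Writing to a file keeps the sketch usable on older gnuplot versions that have no datablocks or arrays.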

Related

gnuplot: histogram of events: issue with timecolumn()

I would like to see the number of events per timeperiod.
My rows look like this
"2020-11-11 09:15:50",field2,field3
This is what I have tried
binwidth = 3600 # 1h in seconds
bin(t) = (t - (int(t) % binwidth) + binwidth/2)
set datafile separator ","
#set xdata time
set timefmt '"%Y-%m-%d %H:%M:%S"'
set boxwidth binwidth
plot 'Statistics.log' using (bin(timecolumn(1, '"%Y-%m-%d %H:%M:%S"'))):(1) smooth freq with boxes
I'm getting
unknown type in magnitude()
How would I debug errors like these? (How do I dump what gnuplot "sees" for timecolumn() etc.?)
(gnuplot 4.6)
First, timecolumn() in gnuplot 4.6 takes a single argument, the column number. The plot command can therefore be rewritten as:
plot "test.dat" using (bin(timecolumn(1))):(1) smooth freq with boxes
Secondly, do not include leading and trailing double quotes in your timefmt formatting.
set timefmt '%Y-%m-%d %H:%M:%S'
For more information about this, please refer to the "help data" section.
...
However, whitespace inside a pair of double quotes is ignored when
counting columns, so the following datafile line has three columns:
1.0 "second column" 3.0
Finally, your code can be modified as follows (for gnuplot 4.6)
binwidth = 3600 # 1h in seconds
bin(t) = (t - (int(t) % binwidth) + binwidth/2)
set datafile separator ","
set xdata time
set timefmt '%Y-%m-%d %H:%M:%S'
set boxwidth binwidth
plot 'Statistics.log' using (bin(timecolumn(1))):(1) smooth freq with boxes
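On the question of how to debug such errors: one approach is to evaluate the pieces of the using expression by hand on the gnuplot console. The sketch below assumes the unquoted time format from this answer and uses strptime() and strftime() to see what gnuplot makes of a single timestamp; the variable names fmt and t0 are introduced here only for illustration.

```gnuplot
# Sketch: inspect intermediate values on the gnuplot console.
fmt = '%Y-%m-%d %H:%M:%S'
binwidth = 3600
bin(t) = (t - (int(t) % binwidth) + binwidth/2)

# What does gnuplot make of one timestamp? (seconds since gnuplot's epoch,
# which differs between gnuplot versions)
t0 = strptime(fmt, '2020-11-11 09:15:50')
print t0

# Round trip: the bin centre for 09:15:50 should read back as 09:30:00
print strftime(fmt, bin(t0))

# Plain-number check of the bin function: 4000 - 400 + 1800
print bin(4000)
```

If the strptime() call already returns something unexpected, the problem is in the format string rather than in the plot command.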
A few minutes too late while testing; @binzo has basically already answered.
The only difference: if your data uses double quotes around the date
"2020-11-11 09:15:50",field2,field3
and you don't want to change your existing data, you have to include the quotes in set timefmt. For some reason that I cannot explain right now, set datafile separator "," messes up the graph, but it seems to work without it.
Code: (tested with gnuplot 4.6.0)
### timedata in histogram (gnuplot 4.6)
reset
FILE = 'Statistics.log'
myTimeFmt = '"%Y-%m-%d %H:%M:%S"'
# create some test data
myDate = strptime(myTimeFmt, '"2020-11-11 11:11:11"')
myRandomDate(n) = myDate + 3*3600*invnorm(rand(0))
set print FILE
do for [i=1:500] {
print sprintf("%s,%g,%g",strftime(myTimeFmt,myRandomDate(0)),rand(0),rand(0))
}
set print
# set datafile separator "," # if uncommented this will messup the plot, don't know why
set xdata time
set format x "%Y-%m-%d\n%H:%M"
set timefmt '"%Y-%m-%d %H:%M:%S"'
binwidth = 3600 # 1 h in seconds
bin(t) = (t - (int(t) % binwidth) + binwidth/2)
set boxwidth binwidth
set style fill solid 0.5
set xtics 4*3600 # 4 h in seconds
plot FILE u (bin(timecolumn(1))):(1) smooth freq w boxes notitle
### end of code
Result:

How to apply tail command in gnuplot

I define my two-column data file as below in a gnuplot file, plot.gnu.
FILE2='case.out'
I want to store the last value of the second column of case.out as Max. I tried
Max =`(tail -n 2 FILE2 | awk '{print $2}')`
But it gives me this error
Max =
^
"plot.gnu", line 37: constant expression required
But if I write the exact file name, case.out, instead of FILE2 in the Max command, it works.
My case.out looks something like
3.2853 243.4008
3.2936 243.6239
3.3019 243.8089
3.3103 243.9544
3.3186 244.0590
3.3269 244.1221
3.3353 244.1432
and I want Max to store the value 244.1432,
i.e.
print Max
should give 244.1432
Have a look at the manual, or type help stats in the gnuplot console. No need for awk here.
Code:
stats "case.out" u 2 nooutput
print STATS_max
Result:
244.1432
Addition:
Please check the manual about how stats works.
Code:
stats "case.out" u 1:2 nooutput
print STATS_min_x, STATS_max_x
print STATS_min_y, STATS_max_y
Result:
3.2853 3.3353
243.4008 244.1432
Or you can even "rename" the stats results.
Code:
stats "case1.out" u 1:2 nooutput name "First"
print First_min_x, First_max_x
print First_min_y, First_max_y
stats "case2.out" u 1:2 nooutput name "Second"
print Second_min_x, Second_max_x
print Second_min_y, Second_max_y
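On the original error: gnuplot does not expand its own variables inside backquotes, so the shell is handed the literal name FILE2, finds no such file, and returns nothing. Note also that STATS_max is the largest value in the column, which equals the last value here only because the second column of the sample keeps increasing. A minimal sketch of two ways to get the last value itself, assuming case.out has no trailing blank line (so tail -n 1 is enough) and, for the first variant, a Unix-like shell; the variable name last is introduced here for illustration.

```gnuplot
FILE2 = 'case.out'

# Variant 1: build the shell command as a string, so gnuplot expands FILE2
Max = real(system(sprintf("tail -n 1 %s | awk '{print $2}'", FILE2)))
print Max

# Variant 2: gnuplot only. The assignment runs for every row, so once
# stats has finished, "last" holds the value from the final row.
stats FILE2 using (last = $2) nooutput
print last
```

The second variant avoids external tools and therefore also works on Windows.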

store commented value from data file in gnuplot

I have multiple data files output_k, where k is a number. The files look like
#a=1.00 b = 0.01
# mass mean std
0.2 0.0163 0.0000125
0.4 0.0275 0.0001256
Now I need to retrieve the values of a and b and store them in variables, so I can use them for the title, as function input, etc. Looping over the files in the folder works, but I need some help with reading out the parameters a and b. This is what I have so far.
# specify the number of plots
plot_number = 100
# loop over all data files
do for [i=0:plot_number] {
a = TODO
b = TODO
#set terminal
set terminal postscript eps size 6.4,4.8 enhanced color font 'Helvetica,20' linewidth 2
set title "Measurement \n{/*0.8 A = a, B = b}"
outFile=sprintf("plot_%d.eps", i)
dataFile=sprintf("output_%d.data", i)
set output outFile
plot dataFile using 1:2:3 with errorbars lt 1 linecolor "red", f(a,b)
unset output
}
EDIT:
I am working with gnuplot for windows.
If you are on a Unix-like system, you can use system to get the output of standard command line tools, namely head and sed, which allow you to extract these values from the files:
a = system(sprintf("head -n 1 output_%i.data | sed \"s/#a=//;s/ b .*//\"", i))
b = system(sprintf("head -n 1 output_%i.data | sed \"s/.*b = //\"", i))
This assumes that the leading spaces on all lines in your question are actually a formatting mistake.
A late answer, but since you are working under Windows, you can either install comparable utilities or use a gnuplot-only (hence platform-independent) solution.
You can use stats to extract information from a datablock (or file) into variables. Check help stats.
The extraction of your a and b depends on the exact structure of that line. You can split a line at spaces via word() (check help word) and get substrings via substr() or indexing (check help substr).
Script: (works with gnuplot>=5.0.0)
### extract information from commented header without external tools
reset session
$Data <<EOD
#a=1.00 b = 0.01
# mass mean std
0.2 0.0163 0.0000125
0.4 0.0275 0.0001256
EOD
set datafile commentschar ''
set datafile separator "\t"
stats $Data u (myHeader=strcol(1)[2:]) every ::0::0 nooutput
set datafile commentschar # reset to default
set datafile separator # reset to default
a = real(word(myHeader,1)[3:])
b = real(word(myHeader,4))
set label 1 at graph 0.1,0.9 sprintf("a=%g\nb=%g",a,b)
plot $Data u 1:2 w lp pt 7 lc "red"
### end of script
Result:
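Applied to the loop from the question, the same extraction could look roughly like the sketch below. It assumes the file names output_%d.data and the header layout shown in the question, and it leaves out f(a,b), which is not defined there. It relies on the same stats trick as the script above, so the same gnuplot version requirement applies.

```gnuplot
# Sketch: f(a,b) from the question is not defined there and is left out here.
plot_number = 100
set terminal postscript eps size 6.4,4.8 enhanced color font 'Helvetica,20' linewidth 2

do for [i=0:plot_number] {
    dataFile = sprintf("output_%d.data", i)

    # Read the first line as one string, comment character included
    set datafile commentschar ''
    set datafile separator "\t"
    stats dataFile u (myHeader=strcol(1)[2:]) every ::0::0 nooutput
    set datafile commentschar    # reset to default
    set datafile separator       # reset to default

    # "a=1.00 b = 0.01" -> word 1 is "a=1.00", word 4 is "0.01"
    a = real(word(myHeader,1)[3:])
    b = real(word(myHeader,4))

    set title sprintf("Measurement \n{/*0.8 A = %g, B = %g}", a, b)
    set output sprintf("plot_%d.eps", i)
    plot dataFile using 1:2:3 with errorbars lt 1 linecolor rgb "red"
    unset output
}
```

Using sprintf() for the title puts the numeric values of a and b into the string, rather than the literal letters as in the question's set title line.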

gnuplot "stats" command unexpected min & "out of range" results

I’m trying to develop a histogram script. The plot itself seems correct, but I have some problems or questions:
I don’t understand why the “stats” output says my data file has “out of range” points. What does that mean?
The “stats” minimum value doesn’t look correct, either. From the data file, minimum = -0.0312, but stats reports 0.0.
The script:
# Gnuplot histogram from "Gnuplot In Action", 13.2.1 Jitter plots and histograms (p. 256)
# these functions put data points (x) into bins of specified width
bin(x,width) = width*floor(x/width)
binwidth = 0.01
set boxwidth binwidth
# data file
data_file = "sorted.csv"
png_file = "sorted.png"
datapoint_count = 14
# taking explanations from the data file
set style data linesp
set key autotitle columnheader
set datafile separator "," # CSV format
# histogram
myTitle = "Histogram from \n" . data_file
set title myTitle
set style fill solid 1.0
set xlabel "Slack"
set mxtics
set ylabel "Count"
set yrange [0:*] # min count is always 0
set terminal png # plot file format
set output png_file # plot to file
print "xrange="
show xrange
print "yrange="
show yrange
stats data_file using ($1)
print "STATS_records=", STATS_records
print "STATS_invalid=", STATS_invalid
print "STATS_blank=", STATS_blank
print "STATS_min=", STATS_min
print "STATS_max=", STATS_max
plot data_file using (bin($1,binwidth)):(1) smooth frequency with boxes
The data file:
slack
-0.0312219
-0.000245109
-4.16338e-05
-2.08616e-05
-1.82986e-05
8.31485e-06
1.00136e-05
1.23084e-05
0
0.000102907
0.000123322
0.000138402
0.19044
0.190441
The output:
gnuplot sorted.gp
Could not find/open font when opening font "arial", using internal non-scalable font
xrange=
set xrange [ * : * ] noreverse nowriteback # (currently [-10.0000:10.0000] )
yrange=
set yrange [ 0.00000 : * ] noreverse nowriteback # (currently [:10.0000] )
* FILE:
Records: 9
Out of range: 5
Invalid: 0
Blank: 0
Data Blocks: 1
* COLUMN:
Mean: 0.0424
Std Dev: 0.0792
Sum: 0.3813
Sum Sq.: 0.0725
Minimum: 0.0000 [3]
Maximum: 0.1904 [8]
Quartile: 0.0000
Median: 0.0001
Quartile: 0.0001
STATS_records=9.0
STATS_invalid=0.0
STATS_blank=0.0
STATS_min=0.0
STATS_max=0.190441
If you give a single column to the stats command, the yrange is used to select the range from this column. Here, set yrange [0:*] excludes the five negative values, which is why stats reports 5 out-of-range points, 9 records, and a minimum of 0.
At first sight this doesn't make sense, but it behaves like a plot command with only a single column, in which case this single column is the y-value and the row number is chosen as the x-value.
So, just move the set yrange part after the stats command.
data_file = 'sorted.csv'
stats data_file using 1
show variables all
set yrange [0:*]
plot data_file ...

How to print input file next to graph in gnuplot?

Is it possible with gnuplot to print the data that I plotted next to the graph?
If I have a text file input.txt:
#x y
1 2
2 5
3 6
4 7
And I do plot 'input.txt' I'd like to have it plotted as usual and next to the plot I'd like to have the table printed. Is this possible?
Note: I'm on Windows and I'd like to format the output.
A bit late, but the OP asked for Windows... so, in short:
data = system('type yourfile.dat') # Windows
On Windows, if you give a path, you need to pay attention to backslashes (\), spaces, and double quotes (").
Data: SO22225051.dat
#x y
1 2
2 5
3 6
4 7
Script:
Solution working for both Linux and Windows. Version 1 for gnuplot>=5.2.0, Version 2 for gnuplot>=4.6.0.
### place data as table/text in graph
reset
FILE = 'SO22225051.dat'
set rmargin 15
set label 1 at screen 0.9,0.7 font "Courier New,12"
# Version 1: Windows & Linux using system() command;
# GPVAL_SYSNAME only available for gnuplot>=5.2.0
getData(f) = GPVAL_SYSNAME[1:7] eq "Windows" ? \
system(sprintf('type "%s"',f)) : \
system(sprintf('cat "%s"',f)) # Linux/MacOS
Data = getData(FILE)
set label 1 Data
plot FILE u 1:2 w lp pt 7 lc rgb "red"
pause -1
# Version 2: gnuplot-only, platform-independent, working at least with gnuplot>=4.6.0
Data = ''
set datafile commentschar ''
set datafile separator "\t"
stats FILE u (Data=Data.strcol(1)."\n") nooutput
set datafile commentschar # restore default
set datafile separator # restore default
set label 1 Data
plot FILE u 1:2 w lp pt 7 lc rgb "red"
### end of script
Result:
The only difference between version 1 and 2 is that in version 2 gnuplot will remove leading spaces for each data line.
Sure you can. The simplest way to do this in gnuplot is read in the file by calling an external command (cat on *nix, not sure on Windows) and storing the output as a variable, then setting a label on the graph. Here is how I do it:
set rmargin 8
datas = system('cat data.dat')
print datas
set label datas at graph 1.1,0.7
plot 'data.dat' notitle
This puts the data file off to the side, in place of a key.

Resources