Gnuplot missing points and xtics in plot with Year-month xrange

I'm having a problem plotting a particular chart on Gnuplot 5.4. Here is the data (assumed to be in file test.csv)
report_date,contract_area,emerg_tally,mean_response_time,median_response_time,percentile_80_response_time
2022-03,10,1,1.133333,1.133333,1.133333
2022-04,10,12,33.013888,4.6166665,27.0133328
2022-05,10,20,3.608333,3.175,4.473333
2022-06,10,21,6.703174,2.533333,4.7
2022-07,10,2,2.766666,2.7666665,3.1966664
2022-04,11,14,19.255951,3.6749995,15.8433328
2022-05,11,8,5.789583,3.05,3.993333
2022-06,11,17,75.061764,2.083333,3.3199996
2022-07,11,11,3.15606,2.583333,3.8
2022-04,12,9,35.253703,4.816666,8.373333
2022-05,12,14,3.140475,2.458333,3.3233332
2022-06,12,14,9.305952,2.8999995,7.8299998
2022-07,12,4,5.508333,2.708333,7.5399998
2022-03,13,1,0.9,0.9,0.9
2022-04,13,4,2.583333,2.7249995,3.1533328
2022-05,13,22,4.797726,2.6499995,6.1233328
2022-06,13,21,7.394444,2.5,4.966666
2022-07,13,1,2.85,2.85,2.85
The first column is a timestamp (Year-month).
The plot script is as follows:
set datafile separator comma
set datafile columnheaders
set timefmt '%Y-%m'
set xdata time
set format x '%Y-%m'
set xlabel 'Job completion Date'
set ylabel 'Median completion time'
set xrange ['2022-04':*]
set xtics '2022-04', '2022-05' '2022-06', '2022-07'
plot "./test.csv" index 0 using 1:5 title "Area 10" with lines lc 1, \
'' index 1 using 1:5 title "Area 11" with lines lc 2, \
'' index 2 using 1:5 title "Area 12" with lines lc 3, \
'' index 3 using 1:5 title "Area 13" with lines lc 4
This is not what I am expecting:
The 2nd and 3rd data series are missing an initial point (for 2022-04). It could be that all series are missing their initial data point, but that this is masked by the range starting at 2022-04 (series 1 and 4 have a 2022-03 data point).
The x-axis is only showing a tic mark for 2022-06. I would expect a 2022-05 tic mark (as 2022-04 and 2022-07 will be at the left and right boundaries respectively).
I have a set of similar scripts which differ only in the date format ('%Y-%m-%d' rather than '%Y-%m').
Does anyone have any idea how to correct these issues?

Missing points - gnuplot interprets the command set datafile columnheaders to mean that the first line of each data set in the file consists of column headers, so it discards the first data row of every later data set. That's not what you want, since in your file only the first data set has a separate header line. Instead, you can use the skip keyword in the plot command to ignore lines at the start of the file (see the modified script below).
Unexpected lack of x-axis tic marks - gnuplot has always been bad at auto-selecting tic intervals along a time axis with dates. There are some improvements in the development version, but I'd say it has only reached the level of 'not quite as bad'. You can fix this by giving an explicit tic interval of roughly one month (30 days = 2592000 seconds). That is not exact, because months have between 28 and 31 days, but for sparse time points like yours it is close enough.
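To see how far the 30-day approximation drifts, here is a small Python sketch. Python is not part of the original answer, and the month list simply mirrors the dates in the question. It compares tics placed every 2592000 seconds, starting at 2022-04-01, with the true first day of each month.

```python
from datetime import datetime, timezone

SECONDS_PER_DAY = 24 * 60 * 60
approx_month = 30 * SECONDS_PER_DAY  # the fixed tic interval from the answer


def month_start(year, month):
    """Midnight UTC on the first day of the given month."""
    return datetime(year, month, 1, tzinfo=timezone.utc)


# 2022-04 .. 2022-07, the range shown on the x axis in the question
months = [(2022, m) for m in range(4, 8)]
starts = [month_start(y, m).timestamp() for y, m in months]

# Offset, in days, of each fixed-interval tic from the true start of its month
drift_days = [
    (starts[0] + i * approx_month - true_start) / SECONDS_PER_DAY
    for i, true_start in enumerate(starts)
]

print(approx_month)
print(drift_days)
```

For these four months the fixed interval lands at most one day early, which is not visible at this scale.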
Modified script
set datafile separator comma
set datafile nocolumnheaders # only the 1st data set has headers
set timefmt '%Y-%m'
set xdata time
set format x '%Y-%m'
set xlabel 'Job completion Date'
set ylabel 'Median completion time'
set xrange ['2022-04':*] noextend
set xtics 2592000 # 3600 * 24 * 30 ~= seconds per month
plot "./test.csv" skip 1 index 0 using 1:5 title "Area 10" with lines lc 1, \
'' index 1 using 1:5 title "Area 11" with lines lc 2, \
'' index 2 using 1:5 title "Area 12" with lines lc 3, \
'' index 3 using 1:5 title "Area 13" with lines lc 4
Or you could specify individual tic marks along x. Your script attempted that but the syntax was missing parentheses and a comma. You could use:
set xtics ('2022-04', '2022-05', '2022-06', '2022-07')

Related

Xtics too close together with GnuPlot and lots of datapoints

I am trying to graph roughly 15k different data points. I have tried all of the different variations that I can think of to reduce the number of xtics, but I can't seem to change the frequency that they are drawn.
Here's my gnuplot file:
reset
# need to call with two variables -e "filename='...'" -e "machine='...'"
set title machine." Activity ".filename
set datafile separator ","
set autoscale x
set autoscale y
set autoscale y2
set yrange [0:*]
set y2range [0:*]
set y2tics
set style data lines
set ylabel "% CPU"
set xlabel "Time"
set y2label "Memory (MB)"
set bmargin 7 # room for the xtic label
set term pngcairo size 960,720
set output filename.".png"
# I've also tried autofreq and explicit labels
set xtics axis out rotate 90 scale 0.5 (20101939, 1000000, 25102219)
plot filename \
using 2:xtic(1) title "CPU" with points pt 1 axes x1y1, \
"" using ($3 / 1024 / 1024):xtic(1) title "Memory" with points pt 1 axes x1y2
The format of my data is:
datetime(ddHHMMss), %cpu, mem-in-bytes, pid, process-alias, process-name
My data looks like the following (roughly sampled every 30 seconds for 15k records):
20101939,0,137932800,6172,process-alias,process-name
20102009,0.15623667978147077,139509760,6172,process-alias,process-name
20102039,0.15623669540380838,139866112,6172,process-alias,process-name
20102109,0.41663098777764329,141488128,6172,process-alias,process-name
20102139,0.052078915131769939,141455360,6172,process-alias,process-name
Despite my xtics command with explicit start, interval, and end, the xtic labels on my graph always overlap.
To filter the tic labels, replace xtic(1) with a criterion for printing a non-blank label. For example, this will print every 25th label.
plot filename \
using 2:xtic( int($0)%25 ? "" : strcol(1) ) title "CPU" with points pt 1
Here int($0) is the zero-based record number of the current data line, and strcol(1) is the content of column 1 read as a string.
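The selection logic of that ternary expression can be sketched outside gnuplot. This is only an illustration in Python; the function name tic_label and the sample labels are made up for the example.

```python
def tic_label(record_index, text, step=25):
    """Mirror of int($0)%25 ? "" : strcol(1): keep every step-th label."""
    return "" if record_index % step else text


# 60 made-up records standing in for the first column of the data file
labels = [tic_label(i, f"t{i}") for i in range(60)]
shown = [label for label in labels if label]

print(shown)
```

Raising step thins the labels further; with about 15k records a much larger step than 25 is probably needed before the labels stop colliding.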

Gnuplot plotting wrong lines and some strange values as well

I am using gnuplot to postprocess some calculations that I have done, and I am having a hard time getting gnuplot to select the right lines: it is outputting some strange values, and I do not know where they come from.
The first 200 points of the results start at line 3 and end at line 202, but that is not what I get when I use every ::3::202.
Does anyone have any suggestions of what I am doing wrong?
set terminal pngcairo transparent nocrop enhanced size 3200,2400 font "arial,40"
set output "Mast41_voltage_muffe.png"
set key right
set samples 500, 500
set xzeroaxis ls 1 lt 8 lw 3
set style line 12 lc rgb '#808080' lt 0 lw 1
set style line 13 lt 0 lw 3
set grid back ls 12
set decimalsign '.'
set datafile separator whitespace
set ylabel "Spenna [pu]"
set xlabel "Timi [s]"
plot "mrunout_01.out" every ::3::202 using 2:3 title '5 ohm' with lines lw 3 linecolor rgb '#D0006E',\
"mrunout_01.out" every ::203::402 using 2:3 title '10 ohm' with lines lw 3 linecolor rgb '#015DD4',\
"mrunout_01.out" every ::403::602 using 2:3 title '15 ohm' with lines lw 3 linecolor rgb '#F80419',\
"mrunout_01.out" every ::603::802 using 2:3 title '20 ohm' with lines lw 3 linecolor rgb '#07826A'
unset output
unset zeroaxis
unset terminal
every refers to the actual plottable points. In your case, you have to skip 2 lines and the bunch of data at the end of your datafile.
Since you know the actual lines you need to plot, I would pre-parse the file with an external tool such as sed.
You can then omit every, and your plot line becomes:
plot "< sed -n '3,202p' mrunout_01.out" using 2:3 title '5 ohm' with lp lw 3 linecolor rgb '#D0006E'
With your datafile as it is, gnuplot has problems reading it. It can't even run stats on it:
stats 'mrunout_01.out'
bad data on line 1 of file mrunout_01.out
There is no need for using external tools, you can simply do it with gnuplot.
It's advantageous with your data that it is regular, every 200 points plotted in a different color.
And the data you want to plot is separated by one empty line from some additional data at the end of the file which you don't want to plot.
So, you simply address the 4th set of 200 lines in the 0th block via every ::600:0:799:0.
From help every:
Syntax:
plot 'file' every {<point_incr>}
{:{<block_incr>}
{:{<start_point>}
{:{<start_block>}
{:{<end_point>}
{:<end_block>}}}}}
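The point numbers used in every ::600:0:799:0 are zero-based and count only the lines left after skip 2. A short Python sketch of that arithmetic follows; the names point_range and file_lines are hypothetical helpers, and it assumes four curves of 200 points each, as described in the question.

```python
BLOCK_SIZE = 200   # points per curve, as stated in the question
HEADER_LINES = 2   # lines dropped by "skip 2"


def point_range(curve):
    """Zero-based (start_point, end_point) for curve number 1..4."""
    start = (curve - 1) * BLOCK_SIZE
    return start, start + BLOCK_SIZE - 1


def file_lines(curve):
    """One-based line numbers in the file covered by that point range."""
    start, end = point_range(curve)
    return start + HEADER_LINES + 1, end + HEADER_LINES + 1


for curve in range(1, 5):
    start, end = point_range(curve)
    print(f"curve {curve}: every ::{start}:0:{end}:0 -> file lines {file_lines(curve)}")
```

Curve 1 maps to file lines 3 to 202 and curve 4 to lines 603 to 802, which are the ranges the question was trying to select.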
Comments:
you can skip two lines at the beginning of the files with skip 2
you can plot your curves in a loop plot for [i=1:4] ...
you can define your color myColor(n) via index n from a string "#D0006E #015DD4 #F80419 #07826A"
you can define the legend myTitle(n) also from a list "5 10 15 20"
Script: (tested with gnuplot 5.0.0, version at the time of OP's question)
### plot parts of a file in a loop
reset session
FILE = "SO36103041.dat"
myColor(n) = word("#D0006E #015DD4 #F80419 #07826A",n)
myTitle(n) = word("5 10 15 20",n)
set xlabel "Timi [s]"
set ylabel "Spenna [pu]"
set yrange[0:30]
plot for [i=1:4] FILE u 2:3 skip 2 every ::((i-1)*200):0:(200*i-1):0 \
w l lw 3 lc rgb myColor(i) ti myTitle(i)
### end of script

GNUPLOT Illegal day of month

I am trying to make a graph out of my stat.dat file containing:
----system---- ----total-cpu-usage---- ------memory-usage----- -net/total-
time |usr sys idl wai hiq siq| used buff cach free| recv send
22-04 16:44:48| 0 0 100 0 0 0| 162M 57.1M 360M 3376M| 0 0
22-04 16:44:58| 0 0 100 0 0 0| 161M 57.1M 360M 3377M| 180B 317B
And I have a gnu.sh containing:
#!/usr/bin/gnuplot
set terminal png
set output "top.png"
set title "CPU usage"
set xlabel "time"
set ylabel "percentage"
set xdata time
set timefmt "%d-%m %H:%M:%S"
set format x "%H:%M"
plot "stat.dat" using 1:3 title "system" with lines, \
"stat.dat" using 1:2 title "user" with lines, \
"stat.dat" using 1:4 title "idle" with lines
When I run the gnu file I receive this error:
Could not find/open font when opening font "/usr/share/fonts/truetype/ttf-liberation/LiberationSans-Regular.ttf", using internal non-scalable font
----system---- ----total-cpu-usage---- ------memory-usage----- -net/total...
stat.dat:1:"./gnu.sh", line 12: illegal day of month
Is anyone familiar with this error and any solution which would help?
When parsing the first line of your data file, gnuplot encounters an illegal day of month.
You must somehow skip the first two lines of your data file before they are parsed. Using version 5.0 or 4.6.6 you can use the new skip option to achieve this. This skips a number of lines at the beginning of the data file without parsing them (as opposed to every, which can skip a number of lines after parsing, which would still give an error):
set xdata time
set timefmt "%d-%m %H:%M:%S"
set format x "%H:%M"
set style data lines
set border back
plot "stat.dat" using 1:4 skip 2 lw 3 title "system", \
"" using 1:3 skip 2 lw 3 title "user", \
"" using 1:5 skip 2 lw 3 title "idle"
Note also that your column counting is wrong. The timestamp contains a space, so it occupies two columns, and the values that follow it start at column 3 (usr is column 3, sys is column 4, idl is column 5).
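The column numbers can be checked by splitting one data line on whitespace. The Python sketch below only approximates how gnuplot counts whitespace-separated columns; the dictionary value_columns is an illustrative name, not something gnuplot provides.

```python
# One data line copied from stat.dat in the question
line = "22-04 16:44:48| 0 0 100 0 0 0| 162M 57.1M 360M 3376M| 0 0"

fields = line.split()

# One-based column numbers, as they would be written in a using clause
value_columns = {"usr": 3, "sys": 4, "idl": 5}

for name, column in value_columns.items():
    print(name, "-> column", column, "=", fields[column - 1])
```

The date and the time are fields 1 and 2, which is why the first numeric value is found in column 3 rather than column 2.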
Alternatively, if you don't have version 5.0 or 4.6.6, you can use an external tool like tail to skip the first two lines:
set xdata time
set timefmt "%d-%m %H:%M:%S"
set format x "%H:%M"
set style data lines
set border back
plot "< tail -n +3 stat.dat" using 1:4 lw 3 title "system", \
"" using 1:3 lw 3 title "user", \
"" using 1:5 lw 3 title "idle"

Gnuplot percentage difference between 2 months

I have a CSV file of malware data collected over 5 years. It has 2 columns, the dates and the IPs, and every date has 1 or more IPs. Example:
1/5/2013 12.234.123
1/5/2013 45.123.566
1/5/2013 100.546.12
1/6/2013 42.153.756
3/4/2014 75.356.258 etc... (every day for 5 years)
Now I am trying to get the percentage difference between every month, for example:
November 2014 - 10%
December 2014 - 15%
I tried to put the percentage on the y axis and on the x2 axis, but I'm getting some crazy results. I am new to gnuplot and still learning it. Here is the code I have right now:
set title 'Results Per Month'
set xlabel 'Date'
set ylabel 'Percentage'
set terminal png size 2800,900
set datafile sep ','
set xdata time
set timefmt '%Y/%m/%d'
set xrange['2009/3/22':'2014/12/02']
set xtics 30*24*60*60
set format x '%Y/%m'
set autoscale x2fix
set x2tics
set x2range[0:*]
set format x2 "%g %%"
set xtics nomirror rotate by -90
set grid ytics xtics
set ytics 10
set yrange [0:*]
set term png
set output 'file.png'
plot 'export.csv' using (timecolumn(1) - (tm_mday(timecolumn(1))-1)*24*60*60):(1) smooth frequency w lp pt 7 ps 2 notitle, \
'' using (($1-$2)/$1*100):x2ticlabels(2) axes x2y1 with points ps 2 lw 2
I would suggest using an external script for this kind of preprocessing (you can also do this on the fly). Yes, you can do this in gnuplot in two steps, but it can become quite complicated and requires a more profound knowledge of gnuplot.
Here is a working script, but I won't go into detail about the many different aspects of the actual implementation:
set xdata time
set timefmt '%Y/%m/%d'
set datafile separator ','
set table 'temporaryfile.dat'
set format x '%Y/%m/%d'
plot 'export.csv' using (timecolumn(1) - (tm_mday(timecolumn(1))-1)*24*60*60):(1) smooth frequency
unset table
set y2tics
set ytics nomirror
set timefmt '"%Y/%m/%d"'
set format x '%b %Y'
set xtics rotate by 90 right
set datafile separator white
set yrange[0:*]
x0=x1=0
plot 'temporaryfile.dat' using 1:(strcol(3) eq "i" ? $2 : 1/0) w lp pt 7 ps 2 title 'IP count', \
'' using 1:(x1=x0, x0=$2, strcol(3) eq "i" ? ($0 == 0 || x0 == 0 ? 0 : (x0-x1)/x0 * 100.0) : 1/0) axes x1y2 w lp title 'percentual change'
Basically, you first write the result of smooth frequency into a second data file. Then you can plot that file and do the calculations for the percentages.
Please note, that I used a timeformat which corresponds to your test data (and the data of your previous question), which doesn't correspond with what you have in your script! Please pay attention to this.
Also note that the timefmt before the actual plot must be extended by the quote signs which are written around the dates in temporaryfile.dat.
Finally, the strcol(3) eq 'i' is necessary to circumvent a gnuplot bug, which causes a last line to be written with invalid data.
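For comparison, the external-script route suggested at the start of this answer could look roughly like the following Python sketch. It is not the answerer's code: the function names and sample rows are invented, and it assumes comma-separated rows with dates in '%Y/%m/%d' format, matching the timefmt used in the script above. The percentage mirrors the gnuplot expression, which divides by the current month's count.

```python
import csv
import io
from collections import Counter
from datetime import datetime


def monthly_counts(rows, date_format="%Y/%m/%d"):
    """Count data rows per (year, month); the date is the first field."""
    counts = Counter()
    for row in rows:
        if not row:
            continue  # ignore blank lines
        day = datetime.strptime(row[0].strip(), date_format)
        counts[(day.year, day.month)] += 1
    return dict(sorted(counts.items()))


def percent_change(counts):
    """Change relative to the current month, like (x0-x1)/x0 * 100 above."""
    changes = {}
    previous = None
    for month, current in counts.items():
        if previous is None or current == 0:
            changes[month] = 0.0
        else:
            changes[month] = (current - previous) / current * 100.0
        previous = current
    return changes


# Made-up sample rows, only to exercise the functions
sample = io.StringIO(
    "2013/01/05,10.0.0.1\n"
    "2013/01/05,10.0.0.2\n"
    "2013/02/06,10.0.0.3\n"
    "2013/02/07,10.0.0.4\n"
    "2013/02/07,10.0.0.5\n"
    "2013/02/09,10.0.0.6\n"
)
counts = monthly_counts(csv.reader(sample))
changes = percent_change(counts)
for month in counts:
    print(month, counts[month], changes[month])
```

Writing the three values per month to a file gives gnuplot a small table that can be plotted directly, without smooth frequency or the temporary file.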

In the gnuplot how do I plot data from two different files into a single plot?

I have two different files to plot in the gnuplot. they use a) different separator b) different time on x-axis
hence for each of them to plot separately I need to pass
set datafile separator
set timefmt
I would like to impose/overlay both data in a single graph such, that they are aligned with time
how could I do this?
The problem with the different separators can be addressed by using the format after the using modifier to specify a different separator for each file, e.g.:
plot 'file1.dat' u 1:2 '%lf,%lf'
plots a two-column file with a comma separator. See help using for more detail.
I am no expert on time formats, so I don't know how to deal with the timestamp format problem. But maybe you can use a function like strftime(). I have never tried it, but it seems to me that it does what you need.
You're right, you will need to pass set datafile separator and set timefmt once per file. You can do it like this:
set terminal <whatever>
set output <whatever.wht>
set xdata time # tell gnuplot to parse x data as time
set format x '%F' # time format to display on plot x axis
set datafile separator ' ' # separator 1
set timefmt '%F' # time format 1
plot 'file1'
set datafile separator ',' # separator 2
set timefmt '%s' # time format 2
replot 'file2'
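The replot command by itself repeats the previous plot command; if you give it another data source, that is drawn on top of what was plotted before, as I did here. Be aware that replot re-reads every file using the settings in effect at that time, so verify that file1 is still parsed correctly after the separator and timefmt change.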
It seems to me that you have 2 options. The first is to pick a datafile format and beat both datafiles into that format, maybe using awk:
plot "<awk -F';' '{print $1,$2}' data1" using 1:2 w lines,\
'data2' using 1:2 w lines
*Note, your awk command will almost certainly be different, this just shows how to use awk in an inline pipe.
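If awk is not at hand, the same normalization can be sketched in Python. Everything specific here is an assumption for illustration: the separators, the '%Y-%m-%d' time format of the first file, epoch seconds in the second file, and the sample lines. The idea is to rewrite both files as "epoch-seconds value" so that a single set timefmt '%s' with the default whitespace separator covers both.

```python
from datetime import datetime, timezone


def to_epoch(text, time_format):
    """Parse a timestamp as UTC and return whole seconds since 1970."""
    parsed = datetime.strptime(text, time_format).replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


def normalize(lines, separator, time_format=None):
    """Return 'epoch value' lines; time_format=None means already epoch."""
    result = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        stamp, value = line.split(separator)[:2]
        if time_format is None:
            seconds = int(stamp)
        else:
            seconds = to_epoch(stamp.strip(), time_format)
        result.append(f"{seconds} {value.strip()}")
    return result


# Hypothetical sample lines for the two input formats
first = normalize(["2022-04-01;1.5", "2022-04-02;2.5"], ";", "%Y-%m-%d")
second = normalize(["1648771200,3.0"], ",")

print("\n".join(first + second))
```

Both outputs now share one layout, so the two files can go into a single plot command without changing the separator or timefmt in between.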
Your second option is to use multiplot with explicit axes alignment:
set multiplot
set xdata time
set datafile sep ';' #separator for first file
set timefmt "..." #time format for first file
set lmargin at screen 0.1
set rmargin at screen 0.9
set tmargin at screen 0.9
set bmargin at screen 0.1
unset key
plot 'data1' u 1:2 w lines ls 1 notitle
set key #The second plot command needs to add both "titles" to the legend/key.
set datafile sep ',' #separator for second file
set timefmt "..." #time format for second file
unset border
unset xtics
unset ytics
#unset other stuff that you set to prevent it from being plotted twice.
plot NaN w lines ls 1 title "title-for-plot-1", \
'data2' u 1:2 w lines ls 2 title "title-for-plot-2"
The plot NaN trick is only necessary if you want both curves to show up correctly in the legend. If you're not using a legend, you can leave it out.
This works for me:
reset
set term pngcairo
set output 'wall.png'
set xlabel "Length (meter)"
set ylabel "error (meter)"
set style line 1 lt 1 linecolor rgb "yellow" lw 10 pt 1
set style line 2 lt 1 linecolor rgb "green" lw 10 pt 1
set style line 3 lt 1 linecolor rgb "blue" lw 10 pt 1
set datafile separator ","
set key
set auto x
set xtics 1, 2, 9
set yrange [2:7]
set grid
set label "(Disabled)" at -.8, 1.8
plot "file1.csv" using 1:2 ls 1 title "one" with lines ,\
"file2.csv" using 1:2 ls 2 title "two" with lines ,\
"file3.csv" using 1:2 ls 3 title "three" with lines
set output

Resources