Plotting lines with missing datapoints for multidimensional data - gnuplot

I'm trying to plot multiple lines representing GPU usage over time from a dataset which records data of multiple GPUs. Each row contains the timestamp, a GPU index and the usage in percent.
My dataset looks like this:
$ cat gpu.txt
#time #index # usage (%)
1,1,10
1,2,5
2,1,20
2,2,10
3,1,40
3,2,30
and this is my gnuplot script:
$ cat plot.gplot
set datafile separator ","
set autoscale # scale axes automatically
unset log # remove any log-scaling
unset label # remove any previous labels
set xtic auto # set xtics automatically
set ytic auto # set ytics automatically
set title
set term png
set title "GPU usage"
set xlabel "Time"
set ylabel "Usage"
set output "gpu.png"
plot "gpu.txt" using ($2 == 1 ? $1 : NaN):($2 == 1 ? $3 : NaN) title 'GPU 1' with linespoints ls 10 linecolor rgb "blue", \
     "gpu.txt" using ($2 == 2 ? $1 : NaN):($2 == 2 ? $3 : NaN) title 'GPU 2' with linespoints ls 10 linecolor rgb "red"
Unfortunately, this draws only the individual data points, with no lines between them. I think gnuplot sees the filtered-out rows as "missing" data points, even though nothing is actually missing: the custom filters are there on purpose, to plot the usage data per GPU index. I tried to indicate this to gnuplot via the NaN value, but it doesn't seem to work.

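This is a recurring question about filtering data.
You can define line styles and then use them in the plotting loop via ls i.
If you want connecting lines, the essential line is set datafile missing NaN. With it, gnuplot treats every point that evaluates to NaN as missing and skips it, instead of treating it as an undefined point that interrupts the line.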
My minimal suggestion would be:
Code:
### filtering data
reset session
$Data <<EOD
#time #index # usage (%)
1,1,10
1,2,5
2,1,20
2,2,10
3,1,40
3,2,30
EOD
set datafile separator ","
set title "GPU usage"
set xlabel "Time"
set ylabel "Usage"
set key top left
set datafile missing NaN
myFilter(datacol,filtercol,value) = value==column(filtercol) ? column(datacol) : NaN
plot for [i=1:2] $Data u (myFilter(1,2,i)):3 w lp pt 7 title sprintf('GPU%d',i)
### end of code
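The answer mentions line styles, but its script does not define any. A minimal sketch of that variant follows, assuming gnuplot 5 or later (for reset session, inline $Data blocks and set datafile missing NaN). The style numbers, colors and line widths are illustrative choices, not part of the original answer.

```gnuplot
### filtering data, with one predefined line style per GPU
reset session

$Data <<EOD
#time #index # usage (%)
1,1,10
1,2,5
2,1,20
2,2,10
3,1,40
3,2,30
EOD

set datafile separator ","
set datafile missing NaN     # NaN points are skipped, so the line stays connected
set title "GPU usage"
set xlabel "Time"
set ylabel "Usage"
set key top left

# illustrative styles (assumption): line style i is used for GPU i
set style line 1 lc rgb "blue" lw 2 pt 7
set style line 2 lc rgb "red"  lw 2 pt 7

myFilter(datacol,filtercol,value) = value==column(filtercol) ? column(datacol) : NaN

plot for [i=1:2] $Data u (myFilter(1,2,i)):3 w lp ls i title sprintf('GPU %d',i)
### end of code
```

Adding a third GPU would then only need a third style line and a loop range of [i=1:3].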

Related

GnuPlot: stacked histogram causes hovering bars

I have been trying to solve this problem for two days. The bars of this stacked histogram are not drawn on top of each other; they float freely around.
Secondly, I only want to print every 5th xtic label. I am using gnuplot 4.6 patchlevel 6.
Here are the first data rows (generated with libreoffice):
05.06,-,-,1
06.06,3,-,0
07.06,12,-,3
08.06,0,5,4
09.06,7,2,0
10.06,86,2,1
11.06,31,4,1
12.06,17,1,0
01.07,1,7,1
Here comes the command set:
gnuplot> set datafile separator ','
gnuplot> set style data histogram
gnuplot> set style histogram rowstacked
gnuplot> set style fill solid border -1
gnuplot> set xlabel "Zeit"
gnuplot> set ylabel "Anzahl"
gnuplot> set yrange [0:250]
gnuplot> plot 'test.csv' using 2:xtic(1) title "Menge A", \
              '' using 3:xtic(1) title "Menge B", \
              '' using 4:xtic(1) title "Menge C"
Gnuplot seems to get confused when a field contains nothing but -. Even set datafile missing '-' doesn't help. You need a data file with truly empty fields, like
05.06,,,1
06.06,3,,0
07.06,12,,3
If you cannot get LibreOffice to save the data file properly you can use e.g. sed to process the file on-the-fly:
plot "< sed 's/-//g' test.csv" using 2:xtic(1), '' ...
(This works properly if you don't have negative values, which I suppose is the case).
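To the second part: instead of xtic(1) you can put any expression that evaluates to a string inside xtic, like
xtic(int($0)%5 == 0 ? strcol(1) : '')
This uses the string in the first column as the xtic label if the zero-based row number $0 is a multiple of 5, and an empty string otherwise. Comparing the remainder with a different value shifts which rows get a label; the complete script uses == 1, which labels rows 1, 6, 11, and so on: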
set datafile separator ','
set style data histogram
set style histogram rowstacked
set style fill solid border -1
set xlabel "Zeit"
set ylabel "Anzahl"
set yrange [0:*]
plot '< sed "s/-//g" test.csv' using 2:xtic(int($0)%5 == 1 ? strcol(1) : '') title "Menge A",\
'' using 3 title "Menge B",\
'' using 4 title "Menge C"
As Christoph has already explained, the problem is caused by the - in your input data.
Therefore, the best and cleanest solution is to make LibreOffice write missing data differently.
However, everything worked fine for me when I replaced the bare column number in the using part with an expression, i.e. ($COLUMNNUMBER) instead of COLUMNNUMBER. Hence, I changed the last line of your code to
plot 'test.csv' u ($2):xtic(1) t "Menge A", '' u ($3) t "Menge B", \
'' u ($4) t "Menge C"
As you can see, you can shorten using to u and title to t. Moreover, you should use :xtic(1) only for the first data set.

Gnuplot - How do not plot a piece of line for non-contiguous date/time

I'm trying to plot the last 24 hours from a data file. Each row of the file holds a date/time and a value.
Below are the contents of datafile.dat:
2015-12-17-21:07:41,74.30
2015-12-17-21:08:41,74.10
2015-12-17-21:08:41,74.10
2015-12-30-21:08:41,79.10
2015-12-30-21:09:41,79.10
....
Below is the gnuplot script:
set datafile separator ","
set terminal png font arial 12 size 1000,600
set xdata time
set timefmt "%Y-%m-%d-%H:%M:%S"
set format x "%d/%m\n%H:%Mh"
set xrange [ time(0) - 86400 : time(0) ] # 86400 sec = 1 day
set grid
set output "/data/weather/humidity.png"
plot "datafile.dat" using 1:2 with lines smooth bezier title ""
As I don't have data in the file for day 29, why does gnuplot draw a line from day 29 to day 30?
There are no rows in the data file for day 29, so I'd like nothing to be drawn for it.
If the file holds less than 24 hours of data, I would like to draw just what I have.
How can I do that?
Gnuplot has an option set clip which controls how lines connecting to points outside the given range are drawn. set clip one, which is the default, draws a line between two points if one of them is in range; the line is clipped at the plot border. set clip two also draws the visible part of a line whose two end points are both out of range but which passes through the plot area.
Use unset clip to plot only lines between points which are both in range.
set datafile separator ","
set xdata time
set timefmt "%Y-%m-%d-%H:%M:%S"
set format x "%d/%m\n%H:%Mh"
set xrange [ time(0) - 86400 : time(0) ]
unset clip
plot "datafile.dat" using 1:2 with lines title ""
Unfortunately, that doesn't work properly with smoothing, because gnuplot first smooths over all points (in range and out of range) and only then applies the clipping rules. To have the smoothing handled properly, you must filter the points before handing them to gnuplot, e.g. with awk:
set datafile separator ","
set xdata time
set timefmt "%Y-%m-%d-%H:%M:%S"
set format x "%d/%m\n%H:%Mh"
set xrange [time(0) - 86400 : time(0)]
filter = '< awk -F, -v d="$(date -d''24 hours ago'' +''%F-%T'')" ''$1>=d'' datafile.dat'
plot filter using 1:2 with lines smooth bezier title ""
Note that the comparison $1 >= d in awk is a string comparison, but that is fine for the time format you are using, because its fields run from most to least significant and are zero-padded.
As Christoph already mentioned, smooth apparently also takes data outside the current plotting range into account. However, there is no need for awk or other external tools; you can do it with gnuplot only (and hence platform-independently).
Simply filter out the data outside the range with gnuplot via:
myFilter(col,t) = t<t0 ? NaN : column(col)
The script creates some random test data with a gap of missing data. When plotting only part of the range, you get the unwanted (and unexpected) smoothing with data outside the plotting range, even with unset clip.
Script:
### smooth filtered data
reset session
myTimeFmt = "%Y-%m-%d:%H:%M:%S"
set datafile separator ","
# create some random test data
set table $Data
set samples 50
SecPerDay = 3600*24
t0 = time(0)
y0 = 100
plot '+' u (sprintf("%s,%g", strftime(myTimeFmt,t0=t0-3600),y0=y0+rand(0)-0.5)) w table, \
t0 = t0 - 3*SecPerDay, \
'+' u (sprintf("%s,%g", strftime(myTimeFmt,t0=t0-3600),y0=y0+rand(0)-0.5)) w table
unset table
set format x "%b %d" timedate
set key out noautotitle
set grid x,y
N = 3 # N last days
t1 = time(0)
t0 = t1 - N*SecPerDay
myFilter(col,t) = t<t0 ? NaN : column(col)
set multiplot layout 3,1
set title "full data range"
plot $Data u (timecolumn(1,myTimeFmt)):2 w l ti "data", \
'' u (timecolumn(1,myTimeFmt)):2 smooth bezier w l ti "smooth"
set title "limited range with unwanted smoothing outside range"
unset clip
set xrange[t0:t1]
plot $Data u (timecolumn(1,myTimeFmt)):2 w l ti "data", \
'' u (timecolumn(1,myTimeFmt)):2 smooth bezier w l ti "smooth"
set title "limited range with filter"
set xrange[t0:t1]
plot $Data u (t=timecolumn(1,myTimeFmt)):(myFilter(2,t)) w l ti "data", \
'' u (t=timecolumn(1,myTimeFmt)):(myFilter(2,t)) smooth bezier w l ti "smooth"
unset multiplot
### end of script
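Applied to the file from the question rather than to random test data, the same filter might look as follows. This is a sketch under assumptions: gnuplot 5 or later (for timecolumn() with a format argument and the timedate tic format), and the time format string of datafile.dat as given in the question.

```gnuplot
### last 24 hours only, filtered inside gnuplot before smoothing
reset session
set datafile separator ","
myTimeFmt = "%Y-%m-%d-%H:%M:%S"   # format used in datafile.dat in the question

t1 = time(0)          # now, in seconds
t0 = t1 - 86400       # 24 hours ago (86400 s = 1 day)

# rows older than t0 yield NaN, as in the answer's script
myFilter(col,t) = t<t0 ? NaN : column(col)

set format x "%d/%m\n%H:%Mh" timedate
set xrange [t0:t1]
set grid
plot "datafile.dat" u (t=timecolumn(1,myTimeFmt)):(myFilter(2,t)) smooth bezier w l notitle
### end of script
```

The terminal and output settings from the question (set terminal png ..., set output ...) can be added before the plot command unchanged.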

How do I plot multiple Y axis for a single X axis in a single Gnuplot window?

I am using the 32-bit version of gnuplot on Windows 7 Professional (sadly!) and I want to draw a stack of plots that all share ONE x-axis: time, given as a series of dates.
All of the gnuplot code works, but each plot uses its own x-axis, which consumes a lot of graphing real estate.
I also need a different y-axis scale for each of the stacked plots.
Here is the "labeled" (CSV) data file:
Date,Time,Weight(kg),Height(cm),BMI,BP Max.(mmHg),BP Min.(mmHg),P/min,% Fat
09/09/2015,13:16:00,77.4,171,26.5,121,73,75,22.5
16/07/2015,09:14:34,76.9,170,26.6,111,70,76,23.5
26/06/2015,18:14:48,76.9,170,26.6,123,72,78,23.2
19/06/2015,08:45:42,77,172,26,96,60,89,22.1
15/06/2015,12:29:48,77.7,170,26.9,117,73,87,23.6
15/06/2015,12:15:58,77.8,170,26.9,127,76,77,23.7
15/06/2015,12:11:05,77.7,171,26.6,118,74,83,22.8
23/03/2015,16:39:55,78.6,170,27.2,119,72,78,24
20/03/2015,09:07:30,77.6,169,27.2,138,74,77,24.1
09/01/2015,14:30:00,79.2,170,27.4,114,71,75,24.1
07/10/2014,16:06:00,78.4,171,26.8,119,73,108,24.8
07/10/2014,16:08:00,78.4,170,27.1,109,72,75,25.1
15/09/2014,08:18:23,76.9,171,26.3,116,69,102,24.8
15/09/2014,09:20:27,76.7,172,25.9,132,76,91,21
04/09/2014,12:05:00,75.6,169,26.5,115,71,96,25.4
01/04/2014,11:18:00,76.2,171,26,115,69,70,22.9
19/03/2014,09:48:23,75.3,171,25.8,113,69,55,22.1
14/03/2014,10:39:29,75.6,170,26.2,108,69,78,22.5
05/03/2014,16:45:00,75.9,170,26.3,129,73,84,23.3
09/05/2013,17:31:00,74.5,171,25.5,135,75,92,21
And here is the "current" GNUPlot Code that I am using to generate the 5 stacked plots:
reset
set terminal windows size 1325, 625
set multiplot layout 5, 1 title "Individual Employee Biometric Data vs. Time"
set xlabel "DATE"
set timestamp
set key outside
set key center right
set pointsize 1.0
set grid lw 1
set timefmt "%d/%m/%Y"
set xdata time
set format x "%d/%m/%Y"
set xrange [ "09/05/2013\t0000" : "09/09/2015\t0000" ] noreverse nowriteback
set datafile sep ','
set arrow from 10.0,0 to 10.0, 0.5 lw 3
set label ' ' at 10.2,0.03
set label '(C) 2015' at 2050.0,-0.85
set border lw 2
set yrange [73.0:80.0]
set ylabel "(kg)"
plot 'K8.dat' using 1:3 title "BODY\nWEIGHT" with linespoints lw 2 lt rgb 'red'
set yrange [25.0:30.0]
set ylabel "kg/m^2"
plot 'K8.dat' using 1:5 title "BODY\nMASS\nINDEX" with linespoints lw 2 lt rgb 'green'
set yrange [50.0:150.0]
set ylabel "(mmHg)"
plot 'K8.dat' using 1:6 title "SYS" with linespoints lw 2 lt rgb 'blue', \
     'K8.dat' using 1:7 title "DIAS" with linespoints lw 2 lt rgb 'coral'
set yrange [40.0:120.0]
set ylabel "(bpm)"
plot 'K8.dat' using 1:8 title "HEART\nRATE" with linespoints lw 2 lt rgb 'purple'
set xlabel "DATE"
set yrange [15.0:30.0]
set ylabel "(%)"
plot 'K8.dat' using 1:9 title "BODY\nFAT" with linespoints lw 2 lt rgb 'orange'
PS - This code is from a previous gnuplot routine, so please excuse the "#" commenting-out...
You can use multiplot to stack several plots on top of each other. You just have to switch off the plot borders appropriately for each, see help set border, and unset the abscissa xtics for all but the lowermost plot.
set multiplot
set origin 0.1, 0.1
set size 0.9,0.3
set xrange [a:b]
plot "first"
set origin 0.1,0.4
unset xtics
set border 2 # only plot left border
plot "second"
set origin 0.1,0.7
plot "third"
unset multi
It is crucial to fix the xrange for all plots: once the xtics are switched off for the upper plots, you can no longer see whether their x-ranges are actually identical.
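Filled in with the file and columns from the question, the skeleton above could look like the following sketch for three of the five panels. The margins, panel heights and the choice to drop the key in favour of y labels are assumptions made for illustration; the time settings and plot columns are taken from the question's script.

```gnuplot
### three stacked panels sharing one x-axis (sketch)
reset
set datafile separator ","
set timefmt "%d/%m/%Y"
set xdata time
set format x "%d/%m/%Y"
set xrange ["09/05/2013":"09/09/2015"]   # fixed, so every panel covers the same dates
set lmargin 12                            # same left margin, so the panels line up
set rmargin 4
unset key                                 # the y labels name each quantity instead
set ytics nomirror

set multiplot

# bottom panel: the only one that keeps x tics and the x label
set origin 0.0, 0.0
set size 1.0, 0.4
set border 3                              # bottom and left border only
set xtics nomirror
set xlabel "DATE"
set ylabel "Fat (%)"
plot 'K8.dat' using 1:9 with linespoints lw 2 lc rgb 'orange'

# middle panel: no x tics, left border only
set origin 0.0, 0.4
set size 1.0, 0.3
unset xtics
unset xlabel
set border 2
set ylabel "BMI"
plot 'K8.dat' using 1:5 with linespoints lw 2 lc rgb 'green'

# top panel: inherits the settings of the middle panel
set origin 0.0, 0.7
set size 1.0, 0.3
set ylabel "Weight (kg)"
plot 'K8.dat' using 1:3 with linespoints lw 2 lc rgb 'red'

unset multiplot
```

Each panel keeps its own autoscaled y-axis, which is what the question asks for; an explicit set yrange before each plot works as well.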
Ok, I get what you mean by stacked plots now. To my knowledge, having several y-axes (more than 2) above a single x axis is not possible.
What you COULD do, however, is fake more than 2 axes by plotting all data in the roughly 30...150 range on the y(1)-axis and all data in the 15...30 range on the y2-axis. However, the lines would overlap and not be as cleanly separated.
Another alternative would be to first normalize all data into, e.g., a 0...10 range by subtracting the min value, dividing by max-min and multiplying by 10, then stacking the series on top of each other by adding 0 for the first line, 10 for the second, and so on. However, you would then have to add hand-made y-axis tics (which is possible but somewhat bothersome).
Actually, here is a working template for the fancier solution I outlined above (implemented for three data sets, but it can be extended to basically arbitrarily many):
reset
set datafile separator ","
inputfile = 'data0.txt'
stats inputfile using 3 name 'STATS_WEIGHT'
STATS_WEIGHT_range = STATS_WEIGHT_max - STATS_WEIGHT_min
stats inputfile using 4 name 'STATS_HEIGHT'
STATS_HEIGHT_range = STATS_HEIGHT_max - STATS_HEIGHT_min
stats inputfile using 9 name 'STATS_FAT'
STATS_FAT_range = STATS_FAT_max - STATS_FAT_min
# more stats for further data -- apparently needs to be BEFORE the date/time stuff
set timefmt "%d/%m/%Y"
set xdata time
set format x "%d/%m/%Y"
set xrange [ "09/05/2013\t0000" : "09/09/2015\t0000" ] noreverse nowriteback
# define the offset at which the fake y-axes start; decrease or increase offsetIncrease for spacing (effectively: blank labels) between 'graphs'
startYTicsOffset = 0
numberOfFakeYTicsPerData = 6
scalingFactor = 1.0/(numberOfFakeYTicsPerData - 1.0)
offsetIncrease = numberOfFakeYTicsPerData + 0.5
#to get rid of actual yrange numbering, set a dummy label that will be overwritten
set ytics ("dummy" 0)
#increase total actual yrange factor as needed for additional series
set yrange [0: 3 * offsetIncrease]
#add tics for weight, note that %.Xf prints the number with X decimals
do for[i=0:numberOfFakeYTicsPerData-1]{
set ytics add (sprintf("%.0f kg", STATS_WEIGHT_min + i * scalingFactor * STATS_WEIGHT_range) startYTicsOffset+i)
}
#add tics for height
startYTicsOffset = startYTicsOffset + offsetIncrease
do for[i=0:numberOfFakeYTicsPerData-1]{
set ytics add (sprintf("%.1f cm", STATS_HEIGHT_min + i * scalingFactor * STATS_HEIGHT_range) startYTicsOffset+i)
}
#add tics for fat - I couldn't figure out how to get gnuplot to print actual '%' character in sprintf directive (should be '%%' but doesn't appear to work)
startYTicsOffset = startYTicsOffset + offsetIncrease
do for[i=0:numberOfFakeYTicsPerData-1]{
set ytics add (sprintf("%.1f percent", STATS_FAT_min + i * scalingFactor * STATS_FAT_range) startYTicsOffset+i)
}
###### ... add further tics ...
plot inputfile using 1:( 0 * offsetIncrease + ($3 - STATS_WEIGHT_min)/ (STATS_WEIGHT_range * scalingFactor) ) w lp title "weight",\
inputfile using 1:( 1 * offsetIncrease + ($4 - STATS_HEIGHT_min)/ (STATS_HEIGHT_range * scalingFactor) ) w lp title "height",\
inputfile using 1:( 2 * offsetIncrease + ($9 - STATS_FAT_min) / (STATS_FAT_range * scalingFactor) ) w lp title "fat %"
### ... add further data ...
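The mapping from a data value to its position on the fake axis is the same expression in the tic loops and in the plot command. A small sketch to check it in isolation, with the weight minimum and maximum read off the sample data in the question; the function name fakeY and the variables vmin and vmax are hypothetical helpers, not part of the template, and the real template uses each series' own min and max:

```gnuplot
### check of the value -> fake y position mapping used in the template
numberOfFakeYTicsPerData = 6
scalingFactor  = 1.0/(numberOfFakeYTicsPerData - 1.0)
offsetIncrease = numberOfFakeYTicsPerData + 0.5

# weight minimum and maximum taken from the sample rows in the question
vmin = 74.5
vmax = 79.2

# band k=0 is the lowest series, k=1 the next one up, and so on
fakeY(v,k) = k*offsetIncrease + (v - vmin)/((vmax - vmin)*scalingFactor)

print fakeY(vmin,0)   # position of the lowest tic of band 0
print fakeY(vmax,0)   # position of the highest tic of band 0
print fakeY(vmin,1)   # position of the lowest tic of band 1
```

Up to floating-point rounding, the minimum of a series lands on the first tic of its band and the maximum on the last one, numberOfFakeYTicsPerData-1 units higher.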

Unexpected output GNUPlot

I am drawing a simple graph using gnuplot, but the points are not joined in the order I expected.
Here is my script :
set title 'cost function vs clusters'
set xlabel '#clusters'
set ylabel 'cost function'
set terminal postscript
set output '| ps2pdf - output.pdf'
plot filename using 1:2 title "x" with linesp
The data I am plotting is:
13 0.004945370902817711
8 0.06739505462909719
2 0.28378378378378377
17 0.004657849338700402
5 0.015181138585393904
20 0.0018401380103507763
I want points to be joined in sequential order of x.
How can I achieve this?
For the data you showed, you can use smooth unique. This sorts the data by x and replaces points that share the same x-value with a single point having the averaged y-value. If you can be sure that you'll never have two equal x-values, then you can use this:
set title 'cost function vs clusters'
set xlabel '#clusters'
set ylabel 'cost function'
set terminal pdfcairo
set output 'output.pdf'
plot filename using 1:2 smooth unique title "x" with lp
And call it with gnuplot -e 'filename="aboveFile"' plot.gpi.
The other variant using sort also works fine:
plot '< sort -n '.filename using 1:2 title "x" with lp
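To see both effects of smooth unique at once (sorting and averaging), the rows from the question can be plotted from an inline data block, assuming gnuplot 5 or later. The last row of the block is hypothetical: it is added only to create a second point at x = 8 and is not part of the question's data.

```gnuplot
### smooth unique on unsorted data with one repeated x value
reset session

# first six rows: data from the question; last row: hypothetical duplicate of x = 8
$Data <<EOD
13 0.004945370902817711
8 0.06739505462909719
2 0.28378378378378377
17 0.004657849338700402
5 0.015181138585393904
20 0.0018401380103507763
8 0.03260494537090281
EOD

set xlabel '#clusters'
set ylabel 'cost function'

# points show the rows as given, the line shows what smooth unique makes of them
plot $Data using 1:2 with points pt 7 title "raw rows", \
     $Data using 1:2 smooth unique with lines title "smooth unique"
```

The line runs through the points in increasing x order, and at x = 8 it passes between the two raw points, at their mean of about 0.05.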

Plotting 2 series from one data set in gnuplot

I have the following data file:
Time;Server;Hits
2011.05.05 12:00:01;Server1;12
2011.05.05 12:00:01;Server2;10
2011.05.05 12:00:02;Server1;2
2011.05.05 12:00:02;Server2;4
So far, I have come up with the following gnuplot script:
set datafile separator ";"
set autoscale
set xdata time
set timefmt "%Y.%m.%d %H:%M:%S"
set xtics rotate
set term png
set output "hits.png"
set style fill solid 0.5
plot "hits.log" using 1:3 title 'Hits'
But that one plots data from both servers on the same graph as one data series. How do I make gnuplot display 2 data series, one for each server?
I found a solution myself:
plot "hits.log" using 1:(stringcolumn(2) eq "Server1" ? $3 : 1/0) title 'Server1' with lines,\
"hits.log" using 1:(stringcolumn(2) eq "Server2" ? $3 : 1/0) title 'Server2' with lines

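Because the rows of the two servers alternate in hits.log, every filtered-out row sits between two kept rows, and such undefined points can interrupt a line (the behaviour discussed in the first question above). A sketch that combines the string filter with set datafile missing NaN and a loop over the server names, assuming a recent gnuplot 5; the variable name servers is a hypothetical helper:

```gnuplot
### one series per server, with connecting lines
set datafile separator ";"
set datafile missing NaN          # skip filtered-out rows instead of breaking the line
set xdata time
set timefmt "%Y.%m.%d %H:%M:%S"
set xtics rotate

servers = "Server1 Server2"       # names as they appear in column 2

plot for [s in servers] "hits.log" \
     using 1:(stringcolumn(2) eq s ? $3 : NaN) title s with lines
```

A further server only needs its name appended to the servers string.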