How to plot month numbers in gnuplot 5.4? - gnuplot

I am trying to get the month of a date in gnuplot 5.4.
Consider the following data:
2019/01/01
2019/02/01
2019/03/01
2019/04/01
2019/05/01
2019/06/01
2019/07/01
2019/08/01
2019/09/01
2019/10/01
2019/11/01
2019/12/01
2020/01/01
2020/02/01
2020/03/01
2020/04/01
2020/05/01
2020/06/01
2020/07/01
2020/08/01
2020/09/01
2020/10/01
2020/11/01
2020/12/01
For each data point, I want to show the full date on the x axis and the month number (0-11) on the y axis.
The gnuplot documentation recommends using the tm_mon function for such a task, which should return the month number for a given date.
As far as I understand, the following gnuplot script should do what I want:
#!/bin/gnuplot
set timefmt '%Y/%m/%d'
set xdata time
set format x '%m/%y'
set datafile separator ','
plot "data.csv" using 1:(tm_mon($1)) title 'data'
But that is not the case. This gnuplot script correctly shows dates on the x axis but has a constant 0 on the y axis.
What am I doing wrong? Why is tm_mon($1) constantly returning 0?

I'm not sure whether I fully got your intention.
I understand you want column 1 as xtic labels.
However, there is a difference between simply using column 1 as xtic labels and interpreting column 1 as date/time, in which case gnuplot scales the x-axis accordingly and generates the xtic labels automatically.
For the first case, the graph might not be wide enough to show all labels without overlap, so I reformatted the date. Check help strptime and help strftime. By the way, help tm_mon says it needs input in seconds, not a date string as it is in $1; therefore use strptime().
I understand you want the yrange ranging from 1 to 12 for the months of a year.
But where is the data you want to plot?
Maybe the following example is a starting point to find out what you really want.
Code:
### plotting dates...
reset session
$Data <<EOD
2019/01/01
2019/02/01
2019/03/01
2019/04/01
2019/05/01
2019/06/01
2019/07/01
2019/08/01
2019/09/01
2019/10/01
2019/11/01
2019/12/01
2020/01/01
2020/02/01
2020/03/01
2020/04/01
2020/05/01
2020/06/01
2020/07/01
2020/08/01
2020/09/01
2020/10/01
2020/11/01
2020/12/01
EOD
myTimeInputFmt = "%Y/%m/%d"
myTimeOutputXFmt = "%Y\n%m\n%d"
set xlabel "Date"
set format x myTimeOutputXFmt # to get enough space for the 3 lines of the label
set ylabel "Month"
set yrange [0.5:12.5]
set ytics 1
plot $Data u 0:(tm_mon(strptime(myTimeInputFmt,strcol(1)))+1): \
xtic(strftime(myTimeOutputXFmt,strptime(myTimeInputFmt,strcol(1)))) \
with points pt 7 lc "red" title 'data'
### end of code

Another way to read data properly
There is also the function timecolumn(N, "timeformat") in gnuplot, although it can only be used inside using. It reads the Nth column of the input data as a date/time. The second argument is optional if the format given by set timefmt applies. With it, your script works as follows.
#!/bin/gnuplot
set timefmt '%Y/%m/%d'
set xdata time
set format x '%m/%y'
set datafile separator ','
plot "data.csv" using 1:(tm_mon(timecolumn(1))) title 'data'
Why is tm_mon($1) constantly returning 0?
In your script, tm_mon($1) is interpreted as tm_mon("2019/01/01") when reading the first line from 'data.csv'. tm_mon expects a real value that represents a UNIX time as its argument. If a string is given, tm_mon tries to convert it into a real value and interprets the result as a UNIX time. Here only the leading 2019 is used, i.e. 2019 seconds after the epoch, which falls in January 1970, so the result is 0 for every line. So tm_mon is not inherently returning a constant 0. You can see this behavior by trying the following commands.
gnuplot> print tm_mon(1607698800) ### 2020-12-11 15:00 UTC (2020-12-12 00:00 in UTC+9) in UNIX time
11.0
gnuplot> print tm_mon("1607698800")
11.0
gnuplot> print tm_mon("1607698800/05/06")
11.0
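To make the explanation above concrete, here is a minimal sketch you can paste into an interactive session. It only uses strftime, strptime and tm_mon; the sample date 2019/03/01 is just an illustration, and the exact number formatting of the printed month may differ between gnuplot versions.

```gnuplot
# Sketch: why a raw date string ends up as month 0
# The leading number of "2019/01/01" is 2019, taken as 2019 seconds after the epoch
print strftime("%Y-%m-%d %H:%M:%S", 2019)   # 1970-01-01 00:33:39
print tm_mon(2019)                           # January, so month index 0
# Converting the string with strptime() first yields the intended month
t = strptime("%Y/%m/%d", "2019/03/01")
print tm_mon(t) + 1                          # March, counted from 1
```

Inside a plot command the same conversion is done per row with strptime(fmt, strcol(1)) or timecolumn(1).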

Related

Labeling Points with a date/time x axis in GNUPLOT

I have a simple dataset consisting of a date/time field (YYYY-MM-DD HH:MM) and a temperature field, taken at 1 minute intervals. I am trying to learn how to label points on a plot such that I can label the max and min temperatures.
The data is coming from an SQL server in a node.js app and the commands / data are being piped to gnuplot via STDIN with the output being a PNG. I can successfully plot the data, but now I am just trying to label the max and min temperature points on the plot, with the "coordinates" coming from an SQL query such that a min or max point would look like (x,y) = (2021-02-11 18:34, 72.57). Every command I try that should label points on the plot has no effect. And the examples that I find usually don't involve a date/time x-axis.
What is the magic ju-ju needed to be able to take an arbitrary data point and label it on the plot with a time/date x-axis? FWIW, since the data is being retrieved from an SQL server, I can format the data in whatever way makes it the easiest method needed to plot the data and more importantly, label the points that I want.
Thanks in advance for any advice!!
Justin
EDIT (per the first comment):
Here is what I am using in terms of gnuplot commands:
set term png size 1280,600
unset output
set datafile separator ","
set xdata time
set timefmt "%Y-%m-%d %H:%M"
set format x "%H:%M"
set ylabel "Temperature ˚F"
set xlabel "Time"
set style line 100 lt 1 lc rgb "grey" lw 0.5
set grid ls 100
set xtics border
set xtics rotate
set key off
$mydata << EOD
// this is where my code "prints" the SQL data to STDIN of gnuplot
// data is formatted: 2021-02-11 22:48,74.42
EOD
plot $mydata using 1:2 with lines
Note, I can format the data in whatever way that makes it easiest for gnuplot to process. I have found that sending YYYY-MM-DD HH:MM,TEMP has worked without issue.
I am trying to use the 'set label' command using a time/date as the x value and the temperature as the y value:
set label 1 'Maximum' at 2021-02-11 19:10,72.50
which does not produce any labels.
Here is an example plot where I want to label the max and min points:
When the x axis is time, the units are seconds. You are looking for the functions that convert between seconds (a floating point value) and a formatted time/date string.
strptime("timeformat",s) reads the time from the string s using the timeformat specifiers and converts it into seconds since the year 1970.
strftime("timeformat",t) produces a string by applying the timeformat specifiers to the time t given in seconds since the year 1970.
In the case you show, this would correspond to:
myformat = "%Y-%m-%d %H:%M"
set label 1 'Maximum' at strptime(myformat, "2021-02-11 19:10"), 72.50
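If you would rather let gnuplot find the extremes than fetch them with a second SQL query, the following is a rough sketch of one way to do it. It assumes stats is run before set xdata time and that timecolumn() with an explicit format is accepted there; the three data rows are just the values mentioned in the question, and I have not run it against the real data stream.

```gnuplot
# Sketch: label the highest and lowest sample of a datablock
$mydata << EOD
2021-02-11 18:34,72.57
2021-02-11 19:10,72.50
2021-02-11 22:48,74.42
EOD
myformat = "%Y-%m-%d %H:%M"
set datafile separator ","
# x = time in seconds, y = temperature; run before switching to time mode
stats $mydata using (timecolumn(1, myformat)):2 nooutput
# STATS_pos_max_y / STATS_pos_min_y hold the x value where y is largest / smallest
set label 1 sprintf("Max %.2f", STATS_max_y) at STATS_pos_max_y, STATS_max_y offset 1,1
set label 2 sprintf("Min %.2f", STATS_min_y) at STATS_pos_min_y, STATS_min_y offset 1,-1
set xdata time
set timefmt myformat
set format x "%H:%M"
plot $mydata using 1:2 with lines
```

The label positions are plain numbers of seconds, which is the same thing strptime() returns in the answer above.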

How to plot week number with string and control xtics increment using Gnuplot?

How do I plot these time samples with a string (W) inside (column 3)?
And how do I control the xtics increment with a time format?
Few lines of Data :
France,FR,2020-W09,118,3318,67012883,4.95128675481698,3.55635925256178,TESSy
France,FR,2020-W10,996,11101,67012883,16.5654714482288,8.97216466984956,TESSy
France,FR,2020-W11,4297,29623,67012883,44.2049329529667,14.5056206326166,TESSy
France,FR,2020-W12,10595,73235,67012883,109.28495644636,14.4671263740015,TESSy
France,FR,2020-W13,24156,122870,67012883,183.35280396756,19.6598030438675,TESSy
France,FR,2020-W14,30304,127029,67012883,189.55907329043,23.8559698966378,TESSy
France,FR,2020-W15,24925,140316,67012883,209.386604065371,17.7634767239659,TESSy
My script :
#https://www.ecdc.europa.eu/en/publications-data/covid-19-testing
#Data (105,77K) here :
system("wget https://opendata.ecdc.europa.eu/covid19/testing/csv -P $PWD -O testing.csv")
reset
set term wxt font ',11' size 1200,800
set datafile separator ","
set grid
#set key at screen 0.9, 0.9
timefmt = "%Y-%s%W"
set xdata time
set xtics format timefmt timedate rotate by -45
SECPERWEEK = 3600.*24.*7.
Y_W(col) = timecolumn(col,timefmt) + SECPERWEEK * (strcol(col)[2:3] - 1)
plot '< grep France testing.csv' u (Y_W(3)):4 notitle w l
Thank you
Here is a suggestion how I would do it. It's maybe not obvious and looks maybe a bit complicated, but it is a gnuplot-only solution.
Since I do not run Linux, I do not have grep, that's why I define myFilter() in gnuplot itself which is platform independent.
Every time this filter gives a hit, the counter t will be increased by one, which has the advantage that the data can contain an interlaced mix of countries. I assume that's what grep would allow as well. The only assumption here is that the week numbers are in (ascending) order; they would not be sorted.
I guess here it is not necessary to have the x-axis as timeformat.
The situation would be different if there are missing calendar week(s) and you want to keep an according gap for them.
With myOffset=0 and myEvery=2 you set how many x-tic labels you want to have displayed.
There is certainly room for improvement and I'm sure there are other solutions... so, just as a starting point...
Code:
### plot filtered data with custom xtics
reset session
$Data <<EOD
France,FR,2020-W09,118,3318,67012883,4.95128675481698,3.55635925256178,TESSy
France,FR,2020-W10,996,11101,67012883,16.5654714482288,8.97216466984956,TESSy
France,FR,2020-W11,4297,29623,67012883,44.2049329529667,14.5056206326166,TESSy
Luxembourg,LU,2020-W11,11,222,33333333,44.4444444444444,55.5555555555555,fghij
Luxembourg,LU,2020-W12,11,222,33333333,44.4444444444444,55.5555555555555,fghij
France,FR,2020-W12,10595,73235,67012883,109.28495644636,14.4671263740015,TESSy
France,FR,2020-W13,24156,122870,67012883,183.35280396756,19.6598030438675,TESSy
Belgium,BE,2020-W13,1111,222222,33333333,444.44444444444,55.5555555555555,abcde
Belgium,BE,2020-W14,1111,222222,33333333,444.44444444444,55.5555555555555,abcde
France,FR,2020-W14,30304,127029,67012883,189.55907329043,23.8559698966378,TESSy
France,FR,2020-W15,24925,140316,67012883,209.386604065371,17.7634767239659,TESSy
EOD
set datafile separator comma
set datafile missing NaN
set xtics rotate by -45
myFilter(dcol,fcol,key) = strcol(fcol) eq key ? (t=t+1, column(dcol)) : NaN
myXtic(col) = sprintf("%s",(t+myOffset)% myEvery ? "" : strcol(col))
myKey = 'France'
myOffset = 0
myEvery = 2
plot t=1 $Data u (t):(myFilter(4,1,myKey)):xtic(myXtic(3)) w lp pt 7 title myKey
### end of code
The basic error is that the Y_W function is looking at the wrong characters for the week number. In a string such as 2020-W09 the week is substring 7 to 8, not 2 to 3.
Y_W(col) = timecolumn(col,"%Y") + SECPERWEEK * (strcol(col)[7:8])
As explained by theozh in this answer, gnuplot uses American week numbers by default, not ISO 8601, so I have not addressed that here.

Gnuplot: multiple labels on x axis

I'm trying to plot the following data
29.07.2012 18:45:04;23.6;54
29.07.2012 18:50:04;22.7;56
29.07.2012 18:55:04;22.2;56
29.07.2012 19:00:04;22.0;56
29.07.2012 19:05:04;21.9;57
29.07.2012 19:10:04;21.8;56
29.07.2012 19:15:04;21.8;54
29.07.2012 19:20:04;21.7;53
29.07.2012 19:25:04;21.7;53
(Date, time, temperature, humidity) in the following style (cropped at the top):
The labels on the x axis are the hour from the time of day and below are the weekdays and the date. I don't have the weekdays in my data file, but I'd like to have the date below the hours.
My plotfile:
set datafile separator ";"
set terminal png size 5280,1024
set output '~/tfd.png'
set xdata time
set timefmt "%d.%m.%Y %H:%M:%S"
set format x "%H"
plot "data.csv" using 1:2 title 'temperatur'
I can think of three methods to do this. If you don't have to have those dates on the same axis, the second method is probably the most stable. Both the first and third methods have their advantages and disadvantages. Between those two, the third is possibly the better approach, but it requires more work.
For these examples, in order to make sure that the data would span more than 1 day, I used your same data but added one extra line
31.07.2012 19:30:04;22.7;53
All three methods work with version 5.0.
Method 1 does not line up correctly in version 4.6, but can be made to with one extra command.
Method 2 should work in any reasonably recent version.
Method 3 will not place all date labels in version 4.6 due to an overflow bug in iteration (see here for some explanation), but can be made to work by changing the iteration to place the labels.
Method 1 - Multiplot
We can do this by superimposing the same plot over itself using multiplot and formatting the x-axis differently each time.
set datafile separator ";"
set xdata time
set timefmt "%d.%m.%Y %H:%M:%S"
# Increase bottom margin to allow room for dates
set bmargin at screen 0.1
set multiplot layout 1,1
# tics starting at 0 every 6 hours showing hour
set xtics 0,60*60*6 format "%H"
plot "data.csv" using 1:2 with lines t "Temperature"
# Tics starting at 0 every 24 hours showing day.month
# moved down by 1 character to be under hours
set xtics 0,60*60*24 format "%d.%m" offset 0,-1
set origin 0,0 # This is not needed in version 5.0
replot
unset multiplot
Other than the difference in the axis labels, the plots must be identical to avoid them not lining up, and it does cause the y-axis labels to be slightly bolded as they are written over themselves.
In version 5.0 the set origin command is not needed, but is needed in version 4.6.
Method 2 - Secondary Axis
If using the secondary axis is acceptable, you could also approach it that way. For example, if the day is shown on the x2 axis and the hour on the x1 axis, we could do
set datafile separator ";"
set xdata time
set x2data time
set timefmt "%d.%m.%Y %H:%M:%S"
set xtics 0,60*60*6 format "%H"
set x2tics 0,60*60*24 format "%d.%m"
plot "data.csv" using 1:2 with lines t "Temperature"
This eliminates some of the problems of the multiplot method, but results in the two date labels being on different axes.
Method 3 - Setting Manual Labels
Finally we can manually set the labels. Fortunately, we can use a loop in the most recent version of gnuplot, so we don't have to issue a separate command for it, but we do have to compute the labels ourselves.
We can use the stats command to compute the labels. The stats command will complain if we give it time data, so we must use it before setting the time mode on, and we must do a little bit of work for computing the day boundaries. To make sure that we are working with the start of each day and not sometime in the middle, we parse the dates into an internal representation (seconds since the Unix Epoch), and round down to the nearest multiple of 86400 (the number of seconds in a day).
Altogether we can do
# enlarge margin for date labels
set bmargin at screen 0.1
set datafile separator ";"
# Get first and last day in data file as STATS_min and STATS_max
stats "data.csv" u (floor(strptime("%d.%m.%Y %H:%M:%S",stringcolumn(1))/86400)*86400) nooutput
set xdata time
set timefmt "%d.%m.%Y %H:%M:%S"
set for [i=STATS_min:(STATS_max+86400):86400] label strftime("%d.%m",i) at i,graph 0 center offset 0,char -2
# set xtics every 6 hours
set xtics 0,60*60*6 format "%H"
plot "data.csv" using 1:2 with lines t "Temperature"
We can improve this by numbering the labels if we need to later remove them ((i-STATS_min)/86400+1 will number them 1, 2, 3, etc). Note that like the first method we needed to increase the margin size on the bottom. I added one extra day to the labels to cover the possible rounding up that gnuplot will do on the x-axis.
There is a bug dealing with iteration and integer overflow in version 4.6. To use this solution in 4.6, change
set for [i=STATS_min:(STATS_max+86400):86400] label strftime("%d.%m",i) at i,graph 0 center offset 0,char -2
to
days = (STATS_max-STATS_min)/86400+1
set for [i=0:days] label strftime("%d.%m",i*86400+STATS_min) at (i*86400+STATS_min),graph 0 center offset 0,char -2

Gnuplot: How to default to '0' when data is missing from a date sequence

I have a bunch of data that is basically a date+time with a number I'd like to plot using gnuplot. The problem is that the data is pulled from a database, so when there are times of day with zero activity, there are no rows created, so my 'csv' file I'm feeding to gnuplot has gaps in the sequence.
Plot config:
set term jpeg medium size 800,600
set output "yesterday.jpg"
set datafile separator ":"
set title "Yesterday's Uploads"
set xlabel "Hour of day (Eastern)" offset 0,-2
set ylabel "Items per minute"
unset key
set bmargin 10
set xdata time
set timefmt "%m/%d/%Y-%H-%M"
set xtics rotate
set style fill solid 0.5
plot "yesterday.stats" u 1:2 w boxes
Example data:
08/27/2013-23-00:34
08/27/2013-22-59:20
08/27/2013-22-58:79
08/27/2013-22-53:6
08/27/2013-22-52:24
08/27/2013-22-51:15
08/27/2013-22-50:12
08/27/2013-22-42:1
08/27/2013-22-38:58
08/27/2013-22-37:36
Note the missing minutes (such as from 38 to 42, and 42 to 50) where there was no activity, thus no DB entries, thus no info in my plot input file.
When I try to plot this using the config example above, the 'gaps' show up as a horizontal filled bar that is the width of the missing data.
I'd like the missing data to simply be a 'zero' with no activity shown on the graph. I'm thinking that there must be a way to handle this in gnuplot and wanted to check with you all before I wrote some script to insert dummy entries into my data.
Any suggestions? Perhaps a different type of plot besides boxes that doesn't "connect" the datapoints, thus leading to those odd horizontal areas?
You can specify the width of a box (in x-axis units, here seconds) by the third parameter to using:
plot "yesterday.stats" u 1:2:(50) w boxes
I use this small script to explicitly add zeros to my (numerically timestamped) data: https://chkno.net/fill-in
gnuplot can execute commands instead of reading files with the < syntax:
plot "< fill-in 60 0 yesterday.stats"
(Setting a box width works for "with boxes", but adding the zeros allows you to use any style.)
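To make the box-width idea concrete, here is a minimal sketch based on the settings from the question. With a time x axis the width is in seconds, so 60 makes each box one minute wide; set boxwidth is the built-in alternative to passing a third column. I have not run it against the real file.

```gnuplot
# Sketch: one-minute-wide boxes so that missing minutes stay empty
set datafile separator ":"
set xdata time
set timefmt "%m/%d/%Y-%H-%M"
set style fill solid 0.5
# x is in seconds on a time axis, so 60 = one minute
set boxwidth 60 absolute
plot "yesterday.stats" u 1:2 w boxes
```

This only changes how wide the boxes are drawn; it does not add zero rows, so styles such as lines would still bridge the gaps.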

Plotting date/time data points and functions in the same graph

I want to plot a set of date/time data points with the x value specified by unix epoch, seconds since 1970. I also want to plot a trend function along with the data (yes I know gnuplot can do this for me, but this is an example). So I have data.txt that looks like this:
1303430400 67.5
1303862400 65.5
1304208000 62.9
1304553600 60.2
And I have a gnuplot program that looks like this:
set terminal png
set output 'plot.png'
set timefmt "%s"
set xdata time
plot "data.txt" using 1:2 title "Data points", \
-7.0/(1123200)*x + 8190.43 title "Trend"
Now, the function is simply a linear approximation of the data. I have checked the formula and it should be ok. For instance, plugging in 1304553600 (the last value of the range), you get 60.2.
I would expect to see my data points plotted along with the function roughly following the data points. Instead, the function plot is way off, about 6000 too high. Apparently I do not understand something about date/time plots. What should I do to get my expected result?
I found this in the man page: "[In date plots] gnuplot will also accept an integer expression, which will be interpreted as seconds from 1 January 2000."
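There is a difference of 946684800 seconds between the Unix epoch and gnuplot time. Note that this applies to gnuplot 4.x; from version 5.0 on, gnuplot counts time from the Unix epoch itself, so the offset is not needed there. This gnuplot script gives the expected plot (note the 946684800 addition to x):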
set terminal png
set output 'plot.png'
set timefmt "%s"
set xdata time
plot "data.txt" using 1:2 title "Data points", \
-7.0/(1123200)*(x+946684800) + 8190.43 title "Trend"
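If the script has to run on different gnuplot versions, one possible sketch is to derive the shift from strptime() instead of hard-coding it. The names epochShift and trend are mine, and the assumption is that strptime() returns seconds relative to whatever epoch the running gnuplot uses internally (2000 on 4.x, 1970 on 5.x).

```gnuplot
# Sketch: adapt the trend function to the epoch of the running gnuplot
set timefmt "%s"
set xdata time
# strptime() gives 0 for this date when time is counted from 2000,
# and 946684800 when it is counted from 1970
epochShift = 946684800 - strptime("%Y-%m-%d", "2000-01-01")
trend(x) = -7.0/1123200*(x + epochShift) + 8190.43
plot "data.txt" using 1:2 title "Data points", trend(x) title "Trend"
```

On a version that already counts from 1970 the shift evaluates to 0 and the original formula is used unchanged.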
