gnuplot: time series always at midnight marks; read bar width from file

I want gnuplot to plot an irregular time series as bars, but my bars are always placed at day boundaries (the midnight marks), as if the time information were ignored (e.g. the first two entries show up on top of each other at midnight). The days are spaced widely enough, so it's not an issue of the bars being too scrunched up. Sample data:
07/09/2012-00:00 1 741 0.50
07/09/2012-12:00 2 3087 0.50
07/12/2012-00:00 1 2011 0.33
07/12/2012-08:00 2 814 0.33
07/12/2012-16:00 2 99 0.33
The relevant gnuplot code is below. The xtics settings are just for aesthetics; they have no bearing on the issue.
set xdata time
set timefmt "%m/%d/%y-%H:%M"
set xtics format "%m/%d"
set xtics "07/08/2012-00:00", 2*172800 ,"08/28/2012-00:00"
plot FILE using 1:3:2 with boxes lc variable
Two separate but related questions: 1) Can I remove the leading zeros from the x-axis labels (i.e. 7/8, not 07/08) to save space ("%m" and "%d" always give me leading zeros)? 2) Can I vary the width of a bar based on data from the file (in this case, I'd like the 4th column to be a fractional multiplier for the standard bar width)? Thanks.

Your time format is wrong: your data has 4-digit years, so you want %Y (4-digit year) instead of %y (2-digit year).
In order to specify the width, you'll need 4 columns of data:
plot FILE using x_column:y_column:x_width:linestyle w boxes lc variable
where x_width is the width of the box in seconds (since that is the unit on a time-axis).
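As a rough check of the units, here is a small Python sketch (not gnuplot) of the width calculation. It assumes the "standard" bar width is one day (86400 seconds), which the question does not state; the helper name box_width_seconds is hypothetical. In gnuplot the same arithmetic would presumably go into the third using entry, e.g. ($4*86400).

```python
SECONDS_PER_DAY = 24 * 60 * 60  # the time axis is measured in seconds

# Sample rows from the question: timestamp, color index, value, width fraction
rows = [
    ("07/09/2012-00:00", 1, 741, 0.50),
    ("07/09/2012-12:00", 2, 3087, 0.50),
    ("07/12/2012-00:00", 1, 2011, 0.33),
    ("07/12/2012-08:00", 2, 814, 0.33),
    ("07/12/2012-16:00", 2, 99, 0.33),
]


def box_width_seconds(fraction, base=SECONDS_PER_DAY):
    """Scale an assumed base width (one day) by the fraction in column 4."""
    return fraction * base


widths = [box_width_seconds(row[3]) for row in rows]
for row, width in zip(rows, widths):
    print(row[0], width)
```

With a one-day base, a fraction of 0.50 gives 43200 seconds, which matches the 12-hour spacing of the first two rows.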

Related

Gnuplot: sum occurrences in time interval

I have a CSV file like this:
2021-10-31;20:30:26
2021-10-31;20:32:15
2021-10-31;20:39:17
2021-10-31;20:40:15
2021-10-31;20:42:13
2021-11-01;08:37:15
...
I would like to count the entries within each 10-minute interval and display the counts in a bar graph. In the example above there are 3 hits from 20:30 to 20:40, 2 hits from 20:40 to 20:50, and so on.
Is there any way to get this done with gnuplot, or do I have to prepare the data first?
Thank you, Martin
You can try the smooth frequency option like this:
reset
# formatting of output data (graph)
set format x "%Y-%m-%d\n%H:%M" timedate
# y-axis, bar graph should start at 0
set yrange [0:*]
set ylabel "Occurrences"
set ytics 1
# make some space for large x axis labels
set rmargin at screen 0.95
# put input values into bins/time intervals
binwidth=10*60 # 10 minutes in seconds
bin(val) = binwidth * floor(val/binwidth)
# configure bar graph
set boxwidth binwidth
# final plot command
plot "a.dat" using (bin(timecolumn(1, "%Y-%m-%d;%H:%M:%S"))):(1) smooth freq with boxes fs solid 0.25 notitle
Documentation from help smooth freq:
The `frequency` option makes the data monotonic in x; points with the same
x-value are replaced by a single point having the summed y-values.
To plot a histogram of the number of data values in equal size bins,
set the y-value to 1.0 so that the sum is a count of occurances in that bin:
Example:
binwidth = <something> # set width of x values in each bin
bin(val) = binwidth * floor(val/binwidth)
plot "datafile" using (bin(column(1))):(1.0) smooth frequency
You have time data, so column must be replaced by timecolumn, see
help timecolumn for details.
The command set boxwidth is used by the boxes plotting style, see help plotting styles boxes for details.
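To see what the bin function and smooth freq compute together, here is a minimal Python sketch of the same counting, applied to the sample timestamps from the question. It is only an illustration of the arithmetic, not how gnuplot implements it; the names to_epoch and bin_start are made up for this example, and times are treated as UTC.

```python
from collections import Counter
from datetime import datetime, timezone

BINWIDTH = 10 * 60  # 10 minutes in seconds


def to_epoch(stamp):
    """Parse 'YYYY-mm-dd;HH:MM:SS' as UTC seconds since the Unix epoch."""
    dt = datetime.strptime(stamp, "%Y-%m-%d;%H:%M:%S")
    return int(dt.replace(tzinfo=timezone.utc).timestamp())


def bin_start(seconds):
    """Same idea as bin(val) = binwidth * floor(val/binwidth)."""
    return BINWIDTH * (seconds // BINWIDTH)


stamps = [
    "2021-10-31;20:30:26",
    "2021-10-31;20:32:15",
    "2021-10-31;20:39:17",
    "2021-10-31;20:40:15",
    "2021-10-31;20:42:13",
    "2021-11-01;08:37:15",
]

# Summing 1 per row for equal bin starts is what the (1) column does
counts = Counter(bin_start(to_epoch(s)) for s in stamps)
for start in sorted(counts):
    label = datetime.fromtimestamp(start, tz=timezone.utc).strftime("%Y-%m-%d %H:%M")
    print(label, counts[start])
```

Each row contributes 1 to the bin whose start time it rounds down to, so the sums per bin are the hit counts.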

Is there a way to change the y axis on Gnuplot so that my image graphs from hour 16 to hour 15 instead of 0 to 24?

Sorry if this has already been asked; I couldn't find it anywhere. I have an image plot in gnuplot of a three-column data file with y range [0:24], and I can't figure out how to rearrange the image so that the y axis runs from 16 to 24 and then from 0 to 16 (in that order, on the same axis). The command I've been using is "plot [] [0:24] '/Users/eleanor/PycharmProjects/attempt2.gray' u 1:2:3 w image", but I don't know what to change so that hour 16 is at the very bottom instead of 0, y wraps to 0 after 23:59, and then keeps increasing up to 15:59 at the very top of the axis. I've already tried changing the y range to [16:15], and that only gave me an error. Any tips would be very much appreciated! :)
A piece of the file I'm using is below (the first column is the day of year, the second is the time in decimal hours, and the third is the data):
20 0.0 7.327484247409568
20 0.002777777777777778 8.304658863945411
20 0.005555555555555556 11.641408500506405
20 0.008333333333333333 6.543382279013497
20 0.011111111111111112 13.922090817182697
20 0.013888888888888888 10.696406455987988
20 0.016666666666666666 12.537636516165243
20 0.019444444444444445 11.816216763447612
20 0.022222222222222223 8.914413125514413
20 0.025 5.8225423124691496
20 0.027777777777777776 10.896730484548698
20 0.030555555555555555 9.097140108173859
As currently implemented, with image treats the entire block of data as a single entity. You can't chop it up into pieces within a single plot command. However if your data is dense enough, it may be that you can approximate the same effect by plotting each pixel as a colored square:
set xrange [*:*] noextend
set yrange [0:24]
plot 'datafile' using 1:(($2>16.)? ($2-16.) : ($2+8.)):3 with points pt 5 lc palette
I strongly recommend not making the range limits part of the plot command. Set them beforehand using set xrange and set yrange.
If necessary, you can adjust the size of the individual square "pixels" by using set pointsize P where P is a scale factor. It probably looks best if you make the points just large enough (or small enough) to touch each other. I think the default ones in the image I show are too large.
You can also use the boxxyerror plotting style instead of the image plotting style. Well, here's what the help for boxxyerror says
gnuplot> ? boxxyerror
The `boxxyerror` plot style is only relevant to 2D data plotting.
It is similar to the `xyerrorbars` style except that it draws rectangular areas
rather than crosses. It uses either 4 or 6 basic columns of input data.
Additional input columns may be used to provide information such as
variable line or fill color (see `rgbcolor variable`).
4 columns: x y xdelta ydelta
6 columns: x y xlow xhigh ylow yhigh
....
If you adopt the four-column plotting style above, you must specify xdelta and ydelta in addition to x and y to specify the rectangle. The xdelta and ydelta should be the half-width and half-height of each pixel. From your data, let's say xdelta is half of 1 and ydelta is half of 0.002777777777777778 hours.
Our final script will look like this.
In this script, the second column of "using" is the same as Ethan's answer.
dx = 1.0/2.0
dy = 0.002777777777777778/2.0
set xrange [-1:32]
set yrange [0:24]
set ytics ("16" 0, "20" 4, "0" 8, "4" 12, "8" 16, "12" 20, "16" 24)
set palette defined (0 "green", 0.5 "yellow", 1 "red")
unset key
plot "datafile" using 1:($2>16?($2-16):($2+8)):(dx):(dy):3 \
with boxxy palette
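The y mapping and the tic labels used in both scripts can be checked with a short Python sketch. This is only the arithmetic, written out separately from gnuplot; wrap_hour and tic_label are hypothetical helper names.

```python
def wrap_hour(hour):
    """Mirror of ($2>16 ? $2-16 : $2+8): shift the axis so 16h sits at 0."""
    return hour - 16.0 if hour > 16.0 else hour + 8.0


def tic_label(position):
    """Clock hour shown at a given position on the shifted axis."""
    return str((position + 16) % 24)


# One tic every 4 units on the shifted axis, labelled with the real hour
tics = [(tic_label(p), p) for p in range(0, 25, 4)]
print(tics)
print(wrap_hour(0.0), wrap_hour(20.0), wrap_hour(16.0))
```

Note that with the strict comparison, a value of exactly 16 maps to 24 (the top of the axis) rather than 0.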

GnuPlot plotting time-zero-based data starting with day-zero on the x axis

I have a file of data where the first column is time in seconds. The start of time is 0, and onward from there.
I want to plot the data with an x-axis formatted as days:hours:minutes:seconds. T=0 should map to 00:00:00:00. I can't figure out how to get days to start at 00 instead of 01. I have tried the below. I also tried setting xrange to [-86400:173000], but that maps to day 365, not 0. Shouldn't it be common to plot some time-sampled data, that may span days, starting with T=0?
It seems that GnuPlot needs a different set of time format characters for zero-based time plotting, instead of date-based. Unless it already has it and I have missed it.
data
0 0
3600 10
7200 30
21600 50
160000 100
GnuPlot script
set xdata time
set format x "%02j:%H:%M:%S"
set timefmt "%s"
set xrange [0:173000]
plot "data" using 1:2 with lines
I can get you part way there. Gnuplot has a separate set of time formats for relative time. They are zero-based and handle both positive and negative intervals. It's hard to find in the documentation, but here is a section from "help time_specifiers".
Format Explanation
%tH +/- hours relative to time=0 (does not wrap at 24)
%tM +/- minutes relative to time=0
%tS +/- seconds associated with previous tH or tM field
Examples of time format:
The date format specifiers encode a time in seconds as a clock time
on a particular day. So hours run only from 0-23, minutes from 0-59,
and negative values correspond to dates prior to the epoch
(1-Jan-1970). In order to report a time value in seconds as some
number of hours/minutes/seconds relative to a time 0, use time
formats %tH %tM %tS. To report a value of -3672.50 seconds
set format x # default date format "12/31/69 \n 22:58"
set format x "%tH:%tM:%tS" # "-01:01:12"
set format x "%.2tH hours" # "-1.02 hours"
set format x "%tM:%.2tS" # "-61:12.50"
Using these relative time formats with your sample data I can get as far as:
$data << EOD
0 0
3600 10
7200 30
21600 50
160000 100
EOD
set xtics time format "%tH:%tM:%tS"
set title 'set xtics time format "%tH:%tM:%tS"'
set xrange [0:173000]
plot $data using 1:2 with lp
Now the problem is that there is no equivalent relative day format. Call that a bug or at least a missing feature. Let's take a stab at adding days to the format by hand.
secperday = 3600*24
days(t) = gprintf("%02g:", int(t)/secperday)
hours(t) = strftime("%02tH:%tM:%tS", int(t)%secperday)
# Create ten days worth of tic labels
# Every six hours with no label; once a day with full label
set xtics 6*3600 format ""
do for [i=0:10] {
    T = i * secperday
    set xtics add ( days(T).hours(T) T )
}
plot $data using 1:2 with lp
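The label that the days() and hours() helpers are meant to build is just a split of the seconds value into whole days plus a remainder. A small Python sketch of that split, assuming non-negative times (relative_label is a made-up name):

```python
SEC_PER_DAY = 24 * 3600


def relative_label(t):
    """Format t seconds since T=0 as DD:HH:MM:SS with zero-based days."""
    days, rest = divmod(int(t), SEC_PER_DAY)
    hours, rest = divmod(rest, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{days:02d}:{hours:02d}:{minutes:02d}:{seconds:02d}"


# The x values from the sample data
for t in (0, 3600, 7200, 21600, 160000):
    print(t, relative_label(t))
```

For the last sample point, 160000 seconds is one full day plus 73600 seconds, so it is labelled as day 01.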
As mentioned in the comments above, one workaround would be using week days, which however would limit you to 7 days.
Since 0 seconds correspond to Thursday, 01.01.1970 00:00:00 you have to subtract 4 days = 24*3600*4 seconds to make it a Sunday (=0).
Another strange workaround would be to use multiplot and plot twice, just for the day labels. You have to set a bottom margin to exactly "overplot" the previous plot. There would be still room for fine tuning.
By the way: If the scale is several days then the question is if seconds in the label are actually relevant?
Code:
### timedate days starting from zero
reset session
$Data <<EOD
0 0
3600 10
7200 30
21600 50
160000 100
450000 222
500000 333
EOD
set multiplot layout 2,1
# first workaround, limited to 7 days
set format x "day %1w\n%H:%M:%S" timedate
plot $Data u ($1-24*3600*4):2 w lp pt 7 notitle
# second workaround, using multiplot
set format x "\n%H:%M:%S" timedate
set bmargin 3
plot $Data u 1:2 w lp pt 7 notitle
set multiplot previous
set format x "day %s"
set xrange[GPVAL_X_MIN/86400:GPVAL_X_MAX/86400]
plot $Data u ($1/86400):2 w p ps 0 notitle # point size zero, i.e. invisible
unset multiplot
### end of code
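The four-day shift in the first workaround can be verified independently. The sketch below uses Python's datetime rather than gnuplot and computes the weekday number with Sunday counted as 0, which is the numbering the answer relies on for %w; weekday_sunday0 is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
SHIFT = 4 * 24 * 3600  # four days in seconds


def weekday_sunday0(seconds):
    """Weekday number of epoch+seconds, counting Sunday as 0."""
    dt = EPOCH + timedelta(seconds=seconds)
    return (dt.weekday() + 1) % 7  # Python counts Monday as 0


print(weekday_sunday0(0))               # the epoch itself (a Thursday)
print(weekday_sunday0(0 - SHIFT))       # T=0 after the shift
print(weekday_sunday0(160000 - SHIFT))  # a sample point on the second day
```

Because the weekday number wraps after 6, the shifted labels are only unambiguous for data spanning at most seven days.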

Gnuplot: multiple labels on x axis

I'm trying to plot the following data
29.07.2012 18:45:04;23.6;54
29.07.2012 18:50:04;22.7;56
29.07.2012 18:55:04;22.2;56
29.07.2012 19:00:04;22.0;56
29.07.2012 19:05:04;21.9;57
29.07.2012 19:10:04;21.8;56
29.07.2012 19:15:04;21.8;54
29.07.2012 19:20:04;21.7;53
29.07.2012 19:25:04;21.7;53
(Date, time, temperature, humidity) in the following style (cropped at the top):
The labels on the x axis are the hour from the time of day and below are the weekdays and the date. I don't have the weekdays in my data file, but I'd like to have the date below the hours.
My plotfile:
set datafile separator ";"
set terminal png size 5280,1024
set output '~/tfd.png'
set xdata time
set timefmt "%d.%m.%Y %H:%M:%S"
set format x "%H"
plot "data.csv" using 1:2 title 'temperatur'
I can think of three methods to do this. If you don't have to have those dates on the same axis, the second method is probably the most stable. Both the first and third methods have their advantages and disadvantages. Between those two, the third is possibly the better approach, but it requires more work.
For these examples, in order to make sure that the data would span more than 1 day, I used your same data but added one extra line
31.07.2012 19:30:04;22.7;53
All three methods work with version 5.0.
Method 1 does not line up correctly in version 4.6, but can be made to with one extra command.
Method 2 should work in any reasonably recent version.
Method 3 will not place all date labels in version 4.6 due to an overflow bug in iteration (see here for some explanation), but can be made to work by changing the iteration to place the labels.
Method 1 - Multiplot
We can do this by superimposing the same plot over itself using multiplot and formatting the x-axis differently each time.
set datafile separator ";"
set xdata time
set timefmt "%d.%m.%Y %H:%M:%S"
# Increase bottom margin to allow room for dates
set bmargin at screen 0.1
set multiplot layout 1,1
# tics starting at 0 every 6 hours showing hour
set xtics 0,60*60*6 format "%H"
plot "data.csv" using 1:2 with lines t "Temperature"
# Tics starting at 0 every 24 hours showing day.month
# moved down by 1 character to be under hours
set xtics 0,60*60*24 format "%d.%m" offset 0,-1
set origin 0,0 # This is not needed in version 5.0
replot
unset multiplot
Other than the difference in the axis labels, the plots must be identical to avoid them not lining up, and it does cause the y-axis labels to be slightly bolded as they are written over themselves.
In version 5.0 the set origin command is not needed, but is needed in version 4.6.
Method 2 - Secondary Axis
If using the secondary axis is acceptable, you could also approach it that way. For example, if the day is shown on the x2 axis and the hour on the x1 axis, we could do
set datafile separator ";"
set xdata time
set x2data time
set timefmt "%d.%m.%Y %H:%M:%S"
set xtics 0,60*60*6 format "%H"
set x2tics 0,60*60*24 format "%d.%m"
plot "data.csv" using 1:2 with lines t "Temperature"
This eliminates some of the problems of the multiplot method, but results in the two date labels being on different axes.
Method 3 - Setting Manual Labels
Finally we can manually set the labels. Fortunately, we can use a loop in the most recent version of gnuplot, so we don't have to issue a separate command for it, but we do have to compute the labels ourselves.
We can use the stats command to compute the labels. The stats command will complain if we give it time data, so we must use it before setting the time mode on, and we must do a little bit of work for computing the day boundaries. To make sure that we are working with the start of each day and not sometime in the middle, we parse the dates into an internal representation (seconds since the Unix Epoch), and round down to the nearest multiple of 86400 (the number of seconds in a day).
Altogether we can do
# enlarge margin for date labels
set bmargin at screen 0.1
set datafile separator ";"
# Get first and last day in data file as STATS_min and STATS_max
stats "data.csv" u (floor(strptime("%d.%m.%Y %H:%M:%S",stringcolumn(1))/86400)*86400) nooutput
set xdata time
set timefmt "%d.%m.%Y %H:%M:%S"
set for [i=STATS_min:(STATS_max+86400):86400] label strftime("%d.%m",i) at i,graph 0 center offset 0,char -2
# set xtics every 6 hours
set xtics 0,60*60*6 format "%H"
plot "data.csv" using 1:2 with lines t "Temperature"
We can improve this by numbering the labels if we need to later remove them ((i-STATS_min)/86400+1 will number them 1, 2, 3, etc). Note that like the first method we needed to increase the margin size on the bottom. I added one extra day to the labels to cover the possible rounding up that gnuplot will do on the x-axis.
There is a bug dealing with iteration and integer overflow in version 4.6. To use this solution in 4.6, change
set for [i=STATS_min:(STATS_max+86400):86400] label strftime("%d.%m",i) at i,graph 0 center offset 0,char -2
to
days = (STATS_max-STATS_min)/86400+1
set for [i=0:days] label strftime("%d.%m",i*86400+STATS_min) at (i*86400+STATS_min),graph 0 center offset 0,char -2
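The day-boundary arithmetic behind the stats and set label lines can be sketched in Python as follows. It assumes timestamps are interpreted as UTC and uses only three of the sample rows (first, last of the first day, and the extra line added above); day_start is a made-up name.

```python
from datetime import datetime, timezone

DAY = 86400  # seconds per day
FMT = "%d.%m.%Y %H:%M:%S"


def day_start(stamp):
    """Seconds since the epoch, rounded down to the start of that day."""
    dt = datetime.strptime(stamp, FMT).replace(tzinfo=timezone.utc)
    return int(dt.timestamp()) // DAY * DAY


stamps = ["29.07.2012 18:45:04", "29.07.2012 19:25:04", "31.07.2012 19:30:04"]
first = min(day_start(s) for s in stamps)
last = max(day_start(s) for s in stamps)

# One label per day boundary, plus one extra day past the last data point
labels = [
    datetime.fromtimestamp(t, tz=timezone.utc).strftime("%d.%m")
    for t in range(first, last + DAY + 1, DAY)
]
print(labels)
```

Here first and last play the role of STATS_min and STATS_max, and the loop corresponds to the iteration in the set for command.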

gnuplot xrange min does not show

I have my dataset (d.asc) as follows:
0.1 0.5
0.12 0.56
...
90.4 0.34
...
100 0.78
I have my plot generation file as follows:
set xrange [0.1:100]
set grid
plot "d.asc" using 1:2 notitle with lines
That is, I want to see the first column on the x-axis and the second column on the y-axis. But the x-axis values start from 0 and increment by 10 up to 100.
[1] Why does it not start from 0.1?
[2] Also, is there a way to have only three (or four, etc.) specific value points on the x-axis? For example, I want to see only 0.1, 90.4, and 100 on the x-axis. Thanks.
[1] Why it does not start from 0.1?
Gnuplot likes to pick round numbers for its tic increments and positions. In your case the increment is 10, so tics would appear at 0, 10, ... 100. Since you manually set the x range to start at 0.1, the first tic does not appear until 10.
[2] Also is there a way to have only three (or four, etc.) specific value points on x-axis?
Yes, you can specify specific points with this syntax:
set xtics ("0.1" 0.1, "90.4" 90.4, "100" 100)
The value in quotes is the text that appears at the tic, and the number is the actual position at which it appears. (help set xtics for more format info.)
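To illustrate why no tic shows up at 0.1, here is a simplified Python model of "tics at multiples of the increment that fall inside the range". It is not gnuplot's actual tic-selection algorithm, just the arithmetic described above; tic_positions is a hypothetical helper.

```python
import math


def tic_positions(lo, hi, step):
    """Multiples of step that lie inside the closed range [lo, hi]."""
    first = math.ceil(lo / step)
    last = math.floor(hi / step)
    return [k * step for k in range(first, last + 1)]


tics = tic_positions(0.1, 100, 10)
print(tics)                        # range starting at 0.1
print(tic_positions(0, 100, 10))   # range starting at 0 for comparison
```

With the range starting at 0.1, the multiple 0 falls just outside, so the first position in this model is 10.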
