Two data points on same x coordinate overlapping - gnuplot

I started keeping a record of days that I've gone running, and the distance. I like plotting this using boxes to get an overview of how active I have been lately.
I ran into a problem today when I added yesterday's data.
As you can see, on 05/04/13 there are two runs, and the plot shows two boxes on the same day (far left box). I like this behavior. On 06/26/13 I had two runs again, but this time the plot showed only one (far right box). After a little playing around I realized that it's because on 05/04 the larger number (in column 2) comes first, so the smaller number gets plotted on top of it. The opposite is true for 06/26, so only the larger number is visible for that day.
Is there a way to fix this without altering my data file?
If it's possible to do in the plot script, I wouldn't have to watch how I enter data to my file.
Here is the data:
05/04/13 1.59
05/04/13 0.81
05/05/13 1.56
05/06/13 1.90
05/08/13 2.77
05/11/13 2.19
05/12/13 0.93
05/14/13 2.50
05/15/13 1.04
05/16/13 1.66
06/02/13 4.02
06/03/13 1.80
06/04/13 1.04
06/05/13 0.93
06/12/13 1.18
06/15/13 1.78
06/16/13 1.26
06/19/13 0.86
06/21/13 0.93
06/26/13 1.05
06/26/13 1.39
The script:
set terminal x11 nopersist size 1200,645
unset mouse
unset key
unset label
unset grid
set boxwidth 86400 absolute
set style fill solid 1.00 border lt -1
set bmargin at screen 0.08
set xdata time
set timefmt x "%m/%d/%y"
set format x "%b %d"
set xtics 86400 nomirror rotate by -90
set mxtics 0
set xrange [ "05/01/13" : "06/30/13" ] noreverse nowriteback
set ylabel "Distance"
set ylabel textcolor lt -1 rotate by -270
set yrange [ 0.00000 : 4.50000 ] noreverse nowriteback
plot "/Users/user/Dropbox/nvalt/walks.txt" using 1:2 with boxes lt rgb "#777777"
An image of the plot:

For this type of file, it doesn't really matter in what order the days are, but, as you mention, the ordering of the values within a day is important. I was able to obtain the required output by simply replacing
plot "/Users/user/Dropbox/nvalt/walks.txt" using 1:2 with boxes lt rgb "#777777"
with
plot "<sort -r /Users/user/Dropbox/nvalt/walks.txt" using 1:2 with boxes lt rgb "#777777"
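With sort -r the lines reach gnuplot in reverse sorted order, so for each date the larger value comes first and the smaller box is drawn on top of it. This should also work for more than two data points for the same date.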
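One caveat: sort -r compares whole lines as text, so a two-digit distance such as 10.2 would sort below 9.5. A minimal sketch of a variant that sorts the second column numerically follows. It assumes a Unix-like system with sort on the PATH and a data file named walks.txt in the working directory; the file name and the trimmed-down settings are assumptions, not the asker's full script.

```gnuplot
# Sketch: feed gnuplot the runs of each day largest-first, so that the
# shorter box is always drawn last and stays visible.
# Assumptions: Unix-like shell with `sort`; data file walks.txt in the cwd.
set xdata time
set timefmt "%m/%d/%y"
set format x "%b %d"
set boxwidth 86400 absolute
set style fill solid 1.00 border lt -1
set yrange [0:*]
unset key

# -k1,1   : group lines by the date field (compared as text)
# -k2,2nr : within one date, order distances numerically, largest first
plot "< sort -k1,1 -k2,2nr walks.txt" using 1:2 with boxes lc rgb "#777777"
```

The order of the days themselves is irrelevant for boxes, so comparing the dates as text is enough here.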

Related

gnuplot: string values xticlabel & adjusting fontsize

I have data I would like to plot in a histogram style with a "cumulated" curve on top. I have the following problem:
My data consists of one column with the categories ("discharge") and one column with the quantity of values ("probability") that belong to the respective category. The last value of the category column is ">100", summarizing all power plants that have a bigger discharge than the last numeric value ("100 m^3/s"). I have not found a way to plot this last category and its values with the command plot 'datafile.dat' using 1:2 with boxes ... because (as I assume) in this case only numerical values are read for the x-tic labels, so the last category is missing.
If I plot it with the command plot 'datafile.dat' using 2:xtics(1) with boxes ... I get the last category ">100" plotted just fine.
BUT: if I use the latter command, the x-axis labels appear in the normal font size, even though I have the line set format x '\footnotesize \%10.0f' in my code.
I have read that explicit labels in the plot command override a format that was set before, but I was not able to adapt this to my code.
Changing ytic font size in gnuplot epslatex (multiplot)
Do you have an idea how to do this?
Excel screenshot to visualize what I want to achieve
'datafile.dat'
discharge probability cumulated
10 20 20%
20 10 10%
30 5 5%
40 6 6%
50 4 4%
60 12 12%
70 8 8%
80 15 15%
90 20 20%
100 6 6%
>100 4 4%
[terminal=epslatex,terminaloptions={size 15cm, 8cm font ",10"}]
set xrange [*:*]
set yrange [0:20]
set y2range [0:100]
set xlabel 'Discharge$' offset 0,-1
set ylabel 'No. of power plants' offset 10.5
set y2label 'Cumulated probability' offset -10
set format xy '$\%g$'
set format x '\footnotesize \%10.0f'
set format y '\footnotesize \%10.0f'
set format y2 '\footnotesize \%10.0f'
set xtics rotate by 45 center offset 0,-1
set style fill pattern border -1
set boxwidth 0.3 relative
set style line 1 lt 1 lc rgb 'black' lw 2 pt 6 ps 1 dt 2
plot 'datafile.dat' using 1:2 with boxes axes x1y1 fs pattern 6 lc black notitle, \
'datafile.dat' using 1:3 with linespoints axes x1y2 ls 1 notitle
I am confused by your datafile; the numbers in the third column do not seem to be cumulative, and do not add up to 100%. Here is a solution that uses only the first two columns of your file:
set term epslatex standalone header "\\usepackage[T1]{fontenc}"
set output 'test.tex'
stats "datafile.dat" using 2
total = STATS_sum
set xlabel "Discharge" offset 0, 1.5
set xtics rotate
set ylabel "No. of power plants"
set ytics nomirror
set yrange [0:*]
set y2label "Cumulative probability"
set y2tics
set y2range [0:]
set boxwidth 0.3 relative
set style line 1 lt 1 lc rgb 'black' lw 2 pt 6 ps 1 dt 2
plot \
'datafile.dat' using 2:xtic("\\footnotesize " . stringcolumn(1)) with boxes axes x1y1 fs pattern 6 lc black notitle, \
'datafile.dat' using ($2/total) smooth cumulative with linespoints axes x1y2 ls 1 notitle
set output
The trick is to add the LaTeX command \footnotesize in front of each label in the using command. The script also first computes the total number of power plants with stats, so that it can compute probabilities, and it computes the cumulative values with the smooth cumulative option.
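The same idea can be tried without LaTeX. A minimal sketch, assuming gnuplot 5 or later (for the inline datablock), the first two columns of the question's data without the header line, and whatever default terminal is available; the name $Plants and the fill style are made up for this example:

```gnuplot
# Sketch: boxes with string tic labels plus a cumulative-fraction curve.
$Plants << EOD
10 20
20 10
30 5
40 6
50 4
60 12
70 8
80 15
90 20
100 6
>100 4
EOD

# Sum of column 2 = total number of power plants
stats $Plants using 2 nooutput
total = STATS_sum

set ytics nomirror
set y2tics
set yrange [0:*]
set y2range [0:1]
set boxwidth 0.3 relative
set style fill solid 0.3 border -1

# Column 0 is the row index, so both plots share the same x positions;
# xtic(1) takes the label text, including ">100", from column 1.
plot $Plants using 0:2:xtic(1) with boxes axes x1y1 notitle, \
     $Plants using 0:($2/total) smooth cumulative with linespoints axes x1y2 notitle
```

Writing the row index explicitly as column 0 makes it visible that the boxes and the cumulative curve are placed on the same x positions.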

gnuplot histogram bars empty space

I am trying a very easy gnuplot graph.
Data is
Average 0.177 0.167
Median 0.179 0.173
and graph code:
set style fill solid border 0
set boxwidth 1.5
set style histogram clustered
set style data histograms
plot "PDR.txt" using 2:xtic(1) lt rgb "#406090", \
"" using 3 lt rgb "#40FF00"
The problem is that the graph produced has far too much empty space everywhere: in the middle, but especially on the left and right. How can I reduce it?
You should set the xrange explicitly, e.g. set xrange [-0.25:1.5] looks alright on my computer.
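A minimal sketch of that suggestion, assuming gnuplot 5 or later and the two data rows from the question placed in an inline datablock. The name $PDR, the titles, the second colour, the gap value, and the slightly wider range are assumptions for illustration. In a clustered histogram the clusters are centred on x = 0, 1, 2, ..., so a range from -0.5 to 1.5 leaves half a unit on each side of two clusters, and the gap option of set style histogram controls the empty space between clusters:

```gnuplot
# Sketch: tighten a two-cluster histogram by fixing the x range.
$PDR << EOD
Average 0.177 0.167
Median 0.179 0.173
EOD

set style data histograms
set style histogram clustered gap 1     # smaller gap = less space between clusters
set style fill solid border -1
set xrange [-0.5:1.5]                   # clusters are centred on x = 0 and x = 1
set yrange [0:*]

plot $PDR using 2:xtic(1) lc rgb "#406090" title "column 2", \
     $PDR using 3 lc rgb "#60a040" title "column 3"
```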

gnuplot - autoscale y axis with filledcurves + xrange + xdata time

In gnuplot 5.0 patchlevel 1 on my old server I used:
set term pngcairo transparent truecolor size 190,40
set output "some.png"
unset bmargin
set bmargin 0
set lmargin 0
set rmargin 0
set tmargin 0
unset border
unset xtics
unset ytics
unset y2tics
unset key
unset title
unset colorbox
set timefmt '%Y-%m'
set xdata time
set style fill transparent solid 0.25 noborder
tt = "`date +%Y-%m-%d\ %H:%M`"
TIMEFMT = "%Y-%m-%d %H:%M"
now_secs = strptime(TIMEFMT,tt)
two_years_past = now_secs - 3600.0*24*365*2
eval(sprintf('set xrange ["%s":]',strftime(TIMEFMT,two_years_past)))
set autoscale yfix
plot "datafile" using 1:2 with filledcurves below x1 lw 1 lc rgb "#a7eeeeee" title ''
...it produced a graph with y range correctly auto-scaled.
But on my new server, with gnuplot 5.0 patchlevel 3 installed, it does not work anymore. It seems something was broken in the code: the yrange is computed from all the time data, not only from the selected xrange.
I have no idea how to correct the yrange in this case. It could be computed using the stats command, but "xdata time" must be switched off first, and in that case I do not know how to set the right xrange for the stats command.
Regards
Pavel
EDIT:
minimal datafile:
2014-01 2
2014-06 6
2015-01 4
2015-06 8
2016-01 6
2016-06 10
I can reproduce your y-range autoscale issue with gnuplot<=5.0.1 and all versions >5.0.1 to 5.4.0.
However, gnuplot is not scaling to the full y-data range as you assumed; rather, filledcurves x1 apparently always (auto)scales to 0 unless there is y-data <0.
To me, this looks more like a bug than a feature. I don't see a reason why filledcurves should always autoscale to 0.
In contrast to this behaviour, the plotting style with boxes will still autoscale in y to the minimum, as you want.
So, as a workaround to keep your desired behaviour you need to add two lines:
stats $Data u 2 nooutput
set yrange[STATS_min:]
The x-range is already limited when executing the stats command, hence you will get the y-minimum in STATS_min which you can use to set the y-range.
By the way: I cleaned up your script a bit. Why make a platform-dependent system call to get the current time when you have the gnuplot function time()? Check help time.
Script: (works identical for all gnuplot versions >=5.0.0)
### adjust time range via current time with proper y autoscale
reset session
$Data <<EOD
2020-01 12
2020-06 6
2021-01 4
2021-06 8
2022-01 6
2022-06 10
EOD
t0 = time(0) # now, i.e. seconds from Jan, 1st 1970 00:00:00
TwoYearsInSec = 3600*24*365*2 # two years in seconds
myTimeFmt = "%Y-%m"
set format x "%Y\n%m" timedate
set style fill transparent solid 0.25 noborder
set xrange [t0-TwoYearsInSec:t0]
set multiplot layout 2,1
set title "undesired y-autoscaling to 0 with filledcurves"
plot $Data u (timecolumn(1,myTimeFmt)):2 w filledcurves above x1 lc rgb 0xff0000 not
set title "workaround to scale to y-minimum"
stats $Data u 2 nooutput
set yrange[STATS_min:]
replot
unset multiplot
### end of script
Result: (created with gnuplot 5.4.0)
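For the asker's original obstacle (stats cannot be used while xdata time is active), one option is to skip set xdata time altogether and convert the first column with timecolumn() in the using specification, so that stats sees ordinary numbers. A minimal sketch under these assumptions: gnuplot 5 or later, the question's minimal data, and a fixed two-year window chosen so that the example does not depend on today's date. The names fmt, t_start, and t_end are illustrative.

```gnuplot
# Sketch: restrict stats to a time window without `set xdata time`.
$Data << EOD
2014-01 2
2014-06 6
2015-01 4
2015-06 8
2016-01 6
2016-06 10
EOD

fmt = "%Y-%m"
t_end   = strptime(fmt, "2016-06")      # fixed end of the window (assumption)
t_start = t_end - 3600.0*24*365*2       # two years earlier, in seconds

set xrange [t_start:t_end]

# Two-column stats: points outside the current xrange are left out,
# and the smallest remaining y value ends up in STATS_min_y.
stats $Data using (timecolumn(1, fmt)):2 nooutput
set yrange [STATS_min_y:*]

set format x "%Y-%m" timedate
set style fill transparent solid 0.25 noborder
plot $Data using (timecolumn(1, fmt)):2 with filledcurves x1 notitle
```

To follow the current date instead, t_end could be replaced by time(0), as in the script above.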

gnuplot heat map with different scales

I have trouble creating a heat map with gnuplot for data with different scales.
Consider the following sample data set:
0.100 1.000 10.0
0.010 1.000 20.0
0.001 1.000 40.0
0.100 10.00 20.0
0.010 10.00 40.0
0.001 10.00 80.0
0.100 100.0 40.0
0.010 100.0 80.0
0.001 100.0 160.0
If I plot it using a heatmap, it only seems to be correct if I scale the x-values such that they are in the same range as the y-values.
Please find below an illustrating example. Only the second plot gives me the correct values of the heat map (high values in the top left corner, low values in the bottom right corner):
set multiplot layout 2,1
set pm3d
set dgrid3d 20,20
set view map
set xlabel 'unscaled'
splot 'data.dat' u 1:2:3
set xlabel 'scaled by factor 1000'
splot 'data.dat' u ($1*1000):2:3
How can I achieve this also for the non-scaled values?
Any help is appreciated. Many thanks.
Here you go:
set dgrid3d 20,20
set pm3d explicit
set view map
set table "interpolated_data.dat"
splot 'data.dat' using ($1*1000):2:3
unset table
set terminal pngcairo
set output 'heatmap.png'
set multiplot layout 2,1
unset dgrid3d
set pm3d
unset surface
set xlabel 'scaled by factor 1000'
splot 'interpolated_data.dat' u 1:2:3
set xlabel 'unscaled'
splot 'interpolated_data.dat' u ($1/1000):2:3
unset multiplot
set output
The scaled plot looks correct, but I'm not sure whether it really is correct. At least there seems to be an artifact in the lower left corner, a local maximum which probably should not be there. You can see it better if you remove set view map:
I think the reason is the dgrid3d. It does some fancy weighting of the neighboring points which can lead to unexpected results.
My suggestion would be to use a linear interpolation by removing set dgrid3d 20,20 and using set pm3d interpolate 20,20. This gives the following picture:
Finally, your data practically asks for a logscale plot, so it is worth at least trying one:
My script for the last plot follows. Nothing special compared to yours. I had to specify xrange for the unscaled plot, and it is longer because of the 4 plots.
reset
set terminal png size 1200,800
set output "data_log.png"
set logscale x
set logscale y
set multiplot layout 2,2 title "With \"interpolate\" and logscale"
set pm3d at s interpolate 20,20
set hidden3d
set xlabel 'unscaled'
set origin 0.5,0.5
set xrange [0.001:0.1]
splot 'data.dat' u 1:2:3 notitle
set autoscale x
set xlabel 'scaled by factor 1000'
set origin 0.5,0.0
splot 'data.dat' u ($1*1000):2:3 notitle
set view map
set xlabel 'unscaled'
set origin 0.0,0.5
set xrange [0.001:0.1]
splot 'data.dat' u 1:2:3 notitle
set autoscale x
set xlabel 'scaled by factor 1000'
set origin 0.0,0.0
splot 'data.dat' u ($1*1000):2:3 notitle
unset multiplot
set output

gnuplot additional parameter to X axis

I wonder how I can add a parameter to every X parameter. Like on the picture, where every X parameter has an additional parameter.
I run gnuplot with the following command
gnuplot -p -e "reset; set yrange [0:1]; set term png truecolor size 1024,1024; set grid ytics; set grid xtics; set key bottom right; set output 'Recall.png'; set key autotitle columnhead; plot for [i=2:3] 'Recall' using 1:i with linespoints linecolor i pt 0 ps 3"
Recall file has the following content
train approach1 approach2
2 0.6 0.07
7 0.64 0.076
9 0.65 0.078
I wonder if I can add additional parameter as follows
train approach1 approach2
2(10) 0.6 0.07
7(15) 0.64 0.076
9(20) 0.65 0.078
The actual plotting should use the real X values (2, 7, 9); the additional parameter is only for visualization and should be printed together with X.
Many of gnuplot's terminals provide an enhanced option that mimics the functionality provided by the postscript terminal, functionality described here.
What you want can be done using an enhanced terminal in conjunction with the set xtics command (see help set xtics for the correct syntax):
gnuplot> set term qt enhanced
gnuplot> set xrange [2:10]
gnuplot> set xtics ('{/=8 3} {/=20 (a)}' 3, '6 (c)' 6)
gnuplot> plot sin(x)
Please refer to the link for a complete description of the available commands.
Update
To produce the x-axis labels automatically, one can use backtick substitution, either directly in a gnuplot command file or on the command line, as in the OP's approach.
The command line is longish...
gnuplot -p -e "reset; set yrange [0:1]; set term png truecolor size 1024,1024; set grid ; set key bottom right; set output 'Recall.png'; set key autotitle columnhead; `awk -f Recall.awk Recall` ; plot for [i=2:3] 'Recall' using 1:i with linespoints linecolor i pt 0 ps 3"
The key point is to use an awk script that outputs the appropriate gnuplot command. Here is the awk script:
% cat Recall.awk
BEGIN { printf "set xtics (" }
NR>1 {
    printf (NR==2?"":",")
    printf ("'{/=8 %d} {/=16 (%d)}' %d", $1, $4, $1)
}
END { print ")" }
Oooops!
I forgot to show the modified format of data file...
% cat Recall
train approach1 approach2
2 0.6 0.07 10
7 0.64 0.076 15
9 0.65 0.078 20
and here is the result of the previous command line:
If you want to take an xtic label from your data file, you can use using ...:xtic(1) which would take the value of the first column as xtic label.
The disadvantage might be that you get an xtic for every value in your data file, and no others. So, using the data file
train approach1 approach2
2(10) 0.6 0.07
7(15) 0.64 0.076
9(20) 0.65 0.078
you could plot with
reset
set term png truecolor size 1024,1024
set grid ytics
set grid xtics
set key bottom right
set output 'Recall.png'
set key autotitle columnhead
plot for [i=2:3] 'Recall' using 1:i:xtic(1) with linespoints linecolor i pt 7 ps 3
and get
Note that this uses the correct x-values only because gnuplot itself drops the content inside the parentheses, which is not part of a valid number.
If you want to use different font sizes for the label parts, you could add an additional column which contains the parameter.
Data file Recall2
train add approach1 approach2
2 (10) 0.6 0.07
7 (15) 0.64 0.076
9 (20) 0.65 0.078
Now, instead of using xtic(1), you can also construct the string to be used as xticlabel:
reset
set term pngcairo truecolor enhance size 1024,1024
set grid ytics
set grid xtics
set key bottom right
set output 'Recall2.png'
set key autotitle columnhead
myxtic(a, b) = sprintf("{%s}{/*1.5 %s}", a, b)
plot for [i=3:4] 'Recall2' using 1:i:xtic(myxtic(strcol(1), strcol(2))) with linespoints linecolor i pt 7 ps 3
