x range for non-numerical data in Gnuplot


When running the following script, I get an error message:
set terminal postscript enhanced color
set output '| ps2pdf - histogram_categorie.pdf'
set auto x
set key off
set yrange [0:20]
set style fill solid border -1
set boxwidth 5
unset border
unset ytic
set xtics nomirror
plot "categorie.dat" using 1:2 ti col with boxes
The error message that I get is
smeik:plots nvcleemp$ gnuplot categorie.gnuplot
plot "categorie.dat" using 1:2 ti col with boxes
^
"categorie.gnuplot", line 13: x range is invalid
The content of the file categorie.dat is
categorie aantal
poussin 13
pupil 9
miniem 15
cadet 15
junior 6
senior 5
veteraan 8
I understand that the problem is that I haven't defined an x range. How can I make gnuplot use the first column as values for the x range? Or do I need to take the row numbers as the x range and use the first column as labels? I'm using Gnuplot 4.4.
I'm ultimately trying to get a plot that looks the same as the plot I made before this one. That one worked fine, but had numerical data on the x axis.
set terminal postscript enhanced color
set output '| ps2pdf - histogram_geboorte.pdf'
set auto x
set key off
set yrange [0:40]
set xrange [1935:2005]
set style fill solid border -1
set boxwidth 5
unset border
unset ytic
set xtics nomirror
plot "geboorte.dat" using 1:2 ti col with boxes,\
"geboorte.dat" using 1:($2+2):2 with labels
and the content of the file geboorte.dat is
decennium aantal
1940 2
1950 1
1960 3
1970 2
1980 3
1990 29
2000 30

The boxes style expects the x-values to be numeric. That's an easy one: we can give it the pseudo-column 0, which is essentially the record (line) number within the data file, counting from zero:
plot "categorie.dat" using (column(0)):2 ti col with boxes
Now you probably want the information in the first column on the plot somehow. I'll assume you want those strings to become the x-tics:
plot "categorie.dat" using (column(0)):2:xtic(1) ti col with boxes
*Careful here, this might not work with your current boxwidth setting. You might want to consider set boxwidth 1 or plot ... using (5*column(0)):2:xtic(1) ....
EDIT -- Taking your datafiles posted above, I've tested both of the above changes to get the boxwidth correct, and both seemed to work.

Related

Horizontal bar chart in gnuplot

When Googling "horizontal gnuplot bar chart", the first result I could find http://www.phyast.pitt.edu/~zov1/gnuplot/html/histogram.html suggests rotating (!) the final bar chart which seems rather baroque. Nonetheless I tried the approach but the labels are cut off.
reset
$heights << EOD
dad 181
mom 170
son 100
daughter 60
EOD
set yrange [0:*] # start at zero, find max from the data
set boxwidth 0.5 # use a fixed width for boxes
unset key # turn off all titles
set style fill solid # solid color boxes
set colors podo
set xtic rotate by 90 scale 0
unset ytics
set y2tics rotate by 90
plot '$heights' using 0:2:($0+1):xtic(1) with boxes lc variable
Is there a better approach?

The link you are referring to is from approx. 2009. gnuplot has developed since then. As @Christoph suggested, check help boxxyerror.
Script: (edit: shortened by using 4-columns syntax for boxxyerror, i.e. x:y:+/-dx:+/-dy)
### horizontal bar graph
reset session
$Data << EOD
dad 181
mom 170
son 100
daughter 60
EOD
set yrange [0:*] # start at zero, find max from the data
set style fill solid # solid color boxes
unset key # turn off all titles
myBoxWidth = 0.8
set offsets 0,0,0.5-myBoxWidth/2.,0.5
plot $Data using (0.5*$2):0:(0.5*$2):(myBoxWidth/2.):($0+1):ytic(1) with boxxy lc var
### end of script
Result:
Addition:
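What does
2:0:(0):2:($0-myBoxWidth/2.):($0+myBoxWidth/2.):($0+1):ytic(1) mean? (This is the longer 6-column form of the using specification, before it was shortened to the 4-column form in the script above.)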
Well, it looks more complicated than it is. Check help boxxyerror. From the manual:
6 columns: x y xlow xhigh ylow yhigh
So, altogether:
x: value from column 2; not so relevant here, since the box is defined by the xlow/xhigh limits
y: pseudocolumn 0, i.e. the line number starting from zero (check help pseudocolumns); not so relevant here either
xlow: (0), a fixed value of zero
xhigh: value from column 2
ylow: ($0-myBoxWidth/2.), line number minus half of the box width
yhigh: ($0+myBoxWidth/2.), line number plus half of the box width
($0+1): together with ... lc var, sets the color depending on the line number, starting from 1
ytic(1): column 1 as ytic label
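As a quick check of the arithmetic, here is a small Python sketch (not part of the original answer; the function name box_corners is made up for illustration). It computes the box corners implied by the 4-column form x:y:dx:dy used in the script, assuming the same box width of 0.8, and shows that they agree with the 6-column values described in the list.

```python
def box_corners(values, box_width=0.8):
    """Return (xlow, xhigh, ylow, yhigh) per record for horizontal bars.

    Mirrors the 4-column boxxyerror form x:y:dx:dy with
    x = 0.5*value, y = record index, dx = 0.5*value, dy = box_width/2.
    """
    corners = []
    for index, value in enumerate(values):  # index plays the role of $0
        x, dx = 0.5 * value, 0.5 * value
        y, dy = index, box_width / 2.0
        corners.append((x - dx, x + dx, y - dy, y + dy))
    return corners


heights = [181, 170, 100, 60]  # dad, mom, son, daughter
for corner in box_corners(heights):
    print(corner)
```

Every box starts at xlow = 0 and ends at the data value, which is why the shorter form draws the same bars as the 6-column one.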
For some reason (which I don't know) gnuplot still doesn't seem to have a convenient horizontal histogram plotting style, but at least it offers this boxxyerror workaround.

How to assign specific title to each line in the data file in gnuplot

I have a data file that holds the x and y coordinates and the radius for drawing circles. Each circle stands for a region. So far I have drawn the circles, but I want to assign a specific legend entry to each line in the data file, because after drawing the regions I want to place some points on them depending on the region number. I couldn't figure out how to do it. Does anyone know how to assign a specific legend entry to each circle depending on its line number in the data file? The data file looks like
X Y R Legend
5 6 0.1 1
....
and so on. I want to use the last column as title to assign to the circles. Is there any way to do that?

It depends on how exactly you want to show the corresponding "title". Let's assume that the data file circles.dat contains the following data:
5 6.0 0.1 1
5 5.5 0.1 2
4 5.0 0.2 3
One option would be to plot the circles and use the fourth column as labels which are placed at the centers of the individual circles. This can be directly achieved with the with labels plotting style as:
set terminal pngcairo
set output 'fig1.png'
fName = 'circles.dat'
unset key
set xr [3:6]
set yr [4:7]
set size square
set tics out nomirror
set xtics 3,1,6
set mxtics 2
set ytics 4,1,7
set mytics 2
plot \
fName u 1:2:3 w circles lc rgb 'red' lw 2, \
'' u 1:2:4 w labels tc rgb 'blue'
This produces:
Alternatively, one might want to put those labels into the legend of the graph. Perhaps there is a more elegant solution; nevertheless, one way is to plot each line of the data file separately and extract the fourth column (to be used as key title) manually:
set terminal pngcairo
set output 'fig2.png'
fName = 'circles.dat'
unset key
set xr [3:6]
set yr [4:7]
set size square
set tics out nomirror
set xtics 3,1,6
set mxtics 2
set ytics 4,1,7
set mytics 2
set key top right reverse
stat fName nooutput
plot \
for [i=0:STATS_records-1] fName u 1:2:3 every ::i::i w circles t system(sprintf("awk 'NR==%d{print $4}' '%s'", i+1, fName))
This gives:

gnuplot - intersection of two plots

I am using gnuplot to plot data from two separate csv files (found in this link: https://drive.google.com/open?id=0B2Iv8dfU4fTUZGV6X1Bvb3c4TWs) with a different number of rows which generates the following graph.
These data seem to have no common timestamp (the first column) in both csv files and yet gnuplot seems to fit the plotting as shown above.
Here is the gnuplot script that I use to generate my plot.
# ###### GNU Plot
set style data lines
set terminal postscript eps enhanced color "Times" 20
set output "output.eps"
set title "Actual vs. Estimated Comparison"
set style line 99 linetype 1 linecolor rgb "#999999" lw 2
#set border 1 back ls 11
set key right top
set key box linestyle 50
set key width -2
set xrange [0:10]
set key spacing 1.2
#set nokey
set grid xtics ytics mytics
#set size 2
#set size ratio 0.4
#show timestamp
set xlabel "Time [Seconds]"
set ylabel "Segments"
set style line 1 lc rgb "#ff0000" lt 1 pi 0 pt 4 lw 4 ps 0
plot "estimated.csv" using ($1):2 with lines title "Estimated", "actual.csv" using ($1):2 with lines title "Actual";
Is there any way to print out (write to a file) the values at the intersections of these plots, ignoring the peaks above the green plot? I have also tried an SQL join query, but it doesn't print anything, for the same reason I explained above.
PS: If the blue line doesn't touch the green line (i.e. if it is way below the green line), I want to take the values of the closest green line so that it will be a one-to-one correspondence (or very close) with the actual dataset.

Perhaps one could somehow force Gnuplot to reinterpolate both data sets on a fine grid, save this auxiliary data and then compare it row by row. However, I think that it's indeed much more practical to delegate this task to an external tool.
It's certainly not the most efficient way to do it, nevertheless a "lazy approach" could be to read the data points, interpret each dataset as a LineString (collection of line segments, essentially equivalent to assuming a linear interpolation between data points) and then calculate the intersection points. In Python, the script to do this might look like this:
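#!/usr/bin/env python
import sys
import numpy as np
from shapely.geometry import LineString
#-------------------------------------------------------------------------------
def load_data(fname):
    # Read a two-column CSV file and treat its points as one polyline.
    return LineString(np.genfromtxt(fname, delimiter=','))
#-------------------------------------------------------------------------------
lines = list(map(load_data, sys.argv[1:]))
crossings = lines[0].intersection(lines[1])
# A multi-part result exposes its members via .geoms (required in Shapely 2.x);
# a single geometry is wrapped in a list so the loop handles both cases.
for g in getattr(crossings, 'geoms', [crossings]):
    if g.geom_type != 'Point':
        continue
    print('%f,%f' % (g.x, g.y))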
Then in Gnuplot, one can invoke it directly:
set terminal pngcairo
set output 'fig.png'
set datafile separator comma
set yr [0:700]
set xr [0:10]
set xtics 0,2,10
set ytics 0,100,700
set grid
set xlabel "Time [seconds]"
set ylabel "Segments"
plot \
'estimated.csv' w l lc rgb 'dark-blue' t 'Estimated', \
'actual.csv' w l lc rgb 'green' t 'Actual', \
'<python filter.py estimated.csv actual.csv' w p lc rgb 'red' ps 0.5 pt 7 t ''
which gives:
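If Shapely is not available, the same linear-interpolation idea can be sketched with the Python standard library alone. This is only an illustrative sketch, not the method from the answer above. The function names interp and crossings and the sample points are made up, and it assumes that each dataset is sorted by strictly increasing x. Both curves are evaluated on the merged set of x-values, and a crossing is reported wherever their difference changes sign.

```python
from bisect import bisect_right


def interp(points, x):
    """Linearly interpolate y at x; points must be sorted by strictly increasing x."""
    xs = [p[0] for p in points]
    i = bisect_right(xs, x) - 1
    i = max(0, min(i, len(points) - 2))  # clamp to a valid segment index
    (x0, y0), (x1, y1) = points[i], points[i + 1]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def crossings(a, b):
    """Return the (x, y) points where the two polylines intersect."""
    lo = max(a[0][0], b[0][0])
    hi = min(a[-1][0], b[-1][0])
    # Between two neighbouring grid values both curves are straight lines.
    grid = sorted({x for x, _ in a + b if lo <= x <= hi})
    result = []
    for x0, x1 in zip(grid, grid[1:]):
        d0 = interp(a, x0) - interp(b, x0)
        d1 = interp(a, x1) - interp(b, x1)
        if d0 == 0:
            result.append((x0, interp(a, x0)))   # curves touch at a grid value
        elif d0 * d1 < 0:
            x = x0 + (x1 - x0) * d0 / (d0 - d1)  # zero of the linear difference
            result.append((x, interp(a, x)))
    if grid and interp(a, grid[-1]) == interp(b, grid[-1]):
        result.append((grid[-1], interp(a, grid[-1])))
    return result


# Made-up sample data standing in for estimated.csv and actual.csv.
estimated = [(0, 0), (1, 2), (3, 0), (4, 0)]
actual = [(0, 1), (4, 1)]
for x, y in crossings(estimated, actual):
    print('%f,%f' % (x, y))
```

Where the two curves overlap along a whole segment, this sketch reports only the grid values at which they coincide. Rebuilding the x list on every interp call keeps the code short but is slow for large files.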

Gnuplot, skipping timedat tics, histogram

So, I need to make a histogram of data by date, but the xtic labels overlap, so I'm trying to find a way to skip some xtics to avoid the overlap. Since dates are not integer tics, I tried to solve it this way:
the .dat file
Time Dat 1 Dat 2
1 27-12-2016 12 2
2 28-12-2016 13 7
3 29-12-2016 17 2
4 30-12-2016 9 10
....
Is it possible to count xtic by first column, but show values in second column instead of values in first?
my code:
reset
dx=5.
n=2
total_box_width_relative=0.75
gap_width_relative=0.1
d_width=(gap_width_relative+total_box_width_relative)*dx/2.
d_box = total_box_width_relative/n
reset
set term png truecolor font "arial,10" fontscale 1.0 size 800,400
set output "test.png"
set datafile separator "\\t"
set title "Errors"
set print "-"
set xlabel 'x' offset "0", "-1"
set ylabel 'y' offset "1", "-0"
set key invert reverse Left outside
set key autotitle columnheader
set key samplen 4 spacing 1 width 0 height 0
set autoscale yfixmax
set yrange [0: ]
set xtics strftime('%d-%m-%Y', "27-12-2016"), 5, strftime('%m-%d-%Y', "15-01-2017")
set xtics font ", 7"
set ytics auto font ", 9"
set y2tics auto font ", 9"
set grid
set style data histogram
set style histogram cluster gap 1
set style fill transparent solid 0.75 noborder
set boxwidth 0.9 relative
set xtic rotate by -45 scale 0
plot 'datfile' u 3:xtic(strftime('%d-%m-%Y', strptime('%m.%d.%Y', stringcolumn(2)))), '' u 4

Before asking such a vague question, always reduce the script to a bare minimum which is required to reproduce the problem.
After removing all unnecessary stuff and fixing the plot command, here is what I end up with:
reset
set datafile separator "\t"
set yrange [0:*]
set style fill transparent solid 0.75 noborder
set boxwidth 0.9 relative
set xtic rotate by -45 scale 0
set key autotitle columnheader
set style data histogram
set style histogram cluster gap 1
plot 'file.dat' using 3:xtic(2) t col(2), '' using 4
Here, you already see one option to avoid overlapping of longer tic labels by rotating them.
Another possibility is to skip every n-th xticlabel. At this point you must understand how gnuplot creates histograms. Histograms don't use a conventional numerical axis, so you cannot simply use the dates as you normally would do when plotting lines. But gnuplot puts each bar cluster at an integer x-position and with e.g. xtic(2) you label every cluster with the string as given in the second column.
The expression xtic(2) is a shortcut for xticlabel(2), which means xticlabel(stringcolumn(2)). Instead of using exactly the string in the second column, you can use any expression here that yields a string, including conditionals. To plot only every second label, check whether the row number is even or odd with int($0) % 2 == 0 and use either the string from the second column or an empty string:
plot 'file.dat' using 3:xtic(int($0)%2 == 0 ? stringcolumn(2) : '') t col(2), '' u 4
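The selection rule inside xtic(...) can be illustrated outside gnuplot with a short Python sketch. The helper name thin_labels is hypothetical, and the step parameter is an assumption that generalizes the every-second-label rule to every n-th label. The list index stands in for the row number $0.

```python
def thin_labels(labels, step=2):
    """Keep every `step`-th label (starting with the first) and blank the rest."""
    return [label if index % step == 0 else ''
            for index, label in enumerate(labels)]


dates = ['27-12-2016', '28-12-2016', '29-12-2016', '30-12-2016']
print(thin_labels(dates))
```

In the gnuplot expression, the same generalization would amount to replacing the 2 in int($0)%2 with the desired step.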

stacked graph with gnuplot

My data file looks like this
A 20120301 4
A 20120302 3
B 20120301 5
B 20120302 6
C 20120303 5
except there are many more than just A,B,C and I want to create a stacked graph with gnuplot (similar to the "Stacked histograms" from the gnuplot demos)
20120301 = (A:4 + B:5)
20120302 = (A:3 + B:6)
20120303 = (C:5)
So far I could not convince plot to read the data in that format. Do I have to rearrange the data file for this? Or is there a way for gnuplot to read the data in that format?

I think I've managed to beat it into a form that will work (you'll need at least gnuplot 4.3):
set boxwidth 0.75 absolute
set style fill solid 1.00 border lt -1
set datafile missing '-'
set style histogram rowstacked
set style data histograms
set yrange [0:]
plot for [i=2:4] 'test.dat' u i,'' u (0.0):xtic(1) notitle
and here's the datafile test.dat
#date A B C
#missing data is marked by a minus sign
20120301 4 5 -
20120302 3 6 -
20120303 - - 5
Phew! I've never been much good with gnuplot when it comes to histograms. Hopefully this will work for you (Sorry about the change to your datafile).
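Since this approach requires rearranging the data file, the conversion can be scripted. The following Python sketch is an illustration only (the function name pivot is made up, and the input is assumed to be whitespace-separated lines of series, date and value, as in the question). It turns the original long format into the one-row-per-date layout of test.dat, marking missing entries with a minus sign.

```python
def pivot(rows, missing='-'):
    """Turn (series, date, value) rows into one line per date, one column per series."""
    series = sorted({name for name, _, _ in rows})
    by_date = {}
    for name, date, value in rows:
        by_date.setdefault(date, {})[name] = value
    lines = ['#date ' + ' '.join(series)]
    for date in sorted(by_date):
        cells = [by_date[date].get(name, missing) for name in series]
        lines.append(' '.join([date] + cells))
    return lines


# Sample data copied from the question.
raw = """A 20120301 4
A 20120302 3
B 20120301 5
B 20120302 6
C 20120303 5"""
rows = [line.split() for line in raw.splitlines()]
print('\n'.join(pivot(rows)))
```

With more series than A, B and C, the upper bound of the for [i=2:4] loop in the plot command would need to grow to the number of series plus one.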
