gnuplot: Plot boxes of a discrete function next to each other

How can I plot discrete functions, like the Poisson distribution, in gnuplot with different parameters in the same plot without having them overlap?
For example: I plot the Poisson distribution with lambda = {1,3,5} with boxes in the same plot. To discretize, I use set xrange [1:15]; set samples 15 so that only the discrete values are plotted. This works pretty well. The only problem is that the boxes of the three Poisson distributions (one per lambda) overlap, because they all have a value at x=1, x=2, etc. Making them transparent still looks ugly (the colors mix where they overlap). So I want each function to be displayed shifted. The values of Poisson(x, lambda=1), Poisson(x, lambda=3) and Poisson(x, lambda=5) should be calculated at x, but each lambda should be drawn shifted slightly further along x than the previous one, so that the boxes don't overlap and can all be seen clearly.
I hope I expressed this clearly enough.
With data files it's easy (just add an offset, e.g. using ($1+0.1):2), but how do I shift analytical functions?

To plot analytical functions with special needs that require a using statement, you can use the pseudo filename '+'. In your case, the plotting script might look as follows:
set xrange [-0.5:15.5]
set samples 16
set style data boxes
set boxwidth 0.2 absolute
set style fill solid noborder
poisson(x) = lambda**x/int(x)!*exp(-lambda)
# $0 is the sample index (0..15) and serves as the integer x; each lambda gets its own offset
plot for [lambda=1:5:2] '+' using ($0-(lambda-3)*0.1):(poisson($0)) title sprintf("λ = %d", lambda)
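The fixed offsets in the script above are tuned for exactly three curves. As a rough sketch, the shift could instead be derived from the number of curves. The names n, w and shift, and the two-argument form of poisson, are introduced here for illustration only and are not part of the original answer. In this variant the curves are shifted to the right in order of increasing λ.

```gnuplot
set xrange [-0.5:15.5]
set samples 16
set style data boxes
set style fill solid noborder

n = 3                                  # number of curves (assumed: lambda = 1, 3, 5)
w = 0.8 / n                            # box width so that n boxes fill 80% of one x unit
set boxwidth w absolute
shift(i) = (i - (n + 1) / 2.0) * w     # i = 1..n, offsets centred on zero
poisson(k, lambda) = lambda**k / int(k)! * exp(-lambda)

# $0 is the sample index 0..15 and is used as the integer x value
plot for [i=1:n] '+' using ($0 + shift(i)):(poisson($0, 2*i - 1)) title sprintf("λ = %d", 2*i - 1)
```

With n = 3 the offsets are roughly -0.27, 0 and +0.27, so the three boxes sit side by side around each integer x.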

Related

Gnuplot 3D bar graph from data files

I have a gnuplot script that produces bar graphs like this:
The input data is in files that have a number of columns, each column ultimately contributes to a cluster in the chart (2 clusters shown in the example). Each file contributes to a bar in the chart (there are 9 in the example). Each file may have a large number of rows.
The script takes the input data files and, using the stats command, produces new files containing one row per column of the original files. Each row contains a mean, min and max value for its source column.
These new files are then used to plot the bar chart with error bars. Each file represents one bar and each row contributes to one cluster. The plot code is as follows:
plot for [f in FILES] f.'.stats' using 2:3:4 title columnhead(1), \
'' using (0):xticlabels(1) with lines
Now I have a second set of files that produces another, similar bar chart. I would like to combine these charts into one, so that there are two rows of 3-D bars, one in front of the other, rendered in a 3-D style with the new 'z' axis representing the two data sets (two sets of FILES).
Here is an example to illustrate the look I'm after (obviously not made with gnuplot!):
Can I do this with gnuplot?
I have read the user manual and the Gnuplot In Action book but haven't found anything that would indicate this is possible.
gnuplot version 5.3 (the development branch) adds a 3D bar chart variant (see the 3D boxes demo). However, rendering the boxes in 3D unfortunately depends on features that were not present in earlier gnuplot release versions, so I cannot offer a workaround for the current one (5.2.4). Also, the new 3D variant does not show error bars, although I think one could construct a plot command that would add them.
I produced a 3D bar chart using the development 5.3 version (git checkout). Here is my splot command:
splot for [c = 1:ncats] for [f = 1:nfiles] \
word(cat_files[c],f).'.stats' \
using (f+column(0)*(nfiles+2)):(scale_y(c)):2 \
with boxes \
title (c==1 ? columnhead(1) : '')
The input data is in a set of 'stats' files as described in the question. To draw the plot, I separated the input FILES into categories - two (ncats) sets of files held in the array cat_files, each containing the same number of files (nfiles).
The categories equate to positions on the y-axis (rows) and the individual files equate to positions on the x-axis (bars). Rows in each file equate to clusters of bars, and the value in each row is the bar height, which is plotted on the z-axis (this was the y-axis in the 2D plot). The nasty expressions position the bars on the x and y axes, as I explain below.
I had a lot of difficulty getting this to work but I think that the result looks good:
The problems, which I cover below, are:
matching colours between chart 'rows' of the y-axis
bar dimensions - making square bars is very hit-and-miss, hence my scale_y function.
x-axis label orientation
repeated items in the key, hence conditional expression for title.
no clustering support, hence nasty positioning expressions
What I have is brittle: it works on my Linux system but relies heavily on shell helpers. But it works. Hopefully this information helps others, or can be taken as feedback to make gnuplot even more awesome!
Colours
To get the colours in each data set to line up, I use set linetype cycle nfiles and hope gnuplot defines sufficient colours.
The reason for doing this is to reset the colour assignment between file sets (categories on the y-axis) so that the same bar in different file sets has the same colour. By explicitly setting the cycle length to the known number of files (chart bars), I ensured the colours matched.
Bar dimensions
The bar dimensions (boxwidth and boxdepth) are relative to the axis ranges and it's therefore difficult to make them square.
If a bar rests on either extreme of the y-axis (lower or upper), it is cut in half vertically (its visible box depth is half the defined boxdepth value).
I had to play with scaling the y axis so that my two category sets were displayed near each other. The default behaviour displayed a range from 1 to 2 in steps of 0.2 and placed the two plots at 1 and 2, making them appear far apart.
I tried set ytics to no effect. I ended up scaling the y value.
scale_y(y) = 0.1 * y - 0.05
set yrange [0:1]
set boxdepth (0.8 / clusters)
All the numbers are fudge factors; clusters is the number of clusters (rows in the files). The numbers I have maintain a square appearance with my test data (I have data to display up to 5 clusters).
I had to start the x axis at 0.5 otherwise the first bar would appear too far in (if x starts at 0) or vertically half-cut off (if x starts at 1).
set xrange [0.5:*]
Axis labels
I replaced the automatic tick marks with custom labels. On the Y axis:
set ytics ()
set for [c = 1:ncats] ytics add (word(CATS,c) scale_y(c) )
Similarly for the x axis. First, where there is 1 cluster I label each category
set xtics ()
set for [f = 1:nfiles] xtics add (label(word(cat_files[1],f)) f)
Or where there are multiple clusters, I label the clusters:
set xtics ()
set for [c = 2:(clusters+1)] xtics add (cell(f,c,1) (nfiles/2)+2+((c-2)*nfiles))
Here, cell is a shell helper that returns the value from file f at row c, position 1. The horrible formula is a hack to position the label in the middle of the cluster along the axis. I also use shell helpers to get the number of clusters, because I could not find a way in gnuplot to query rows and columns. Note that previously (when plotting in 2D) I would have used xticlabels(1) to label the clustered x-axis.
I wanted to turn the x labels to run perpendicular to the axis but this doesn't seem possible. I also wanted to tweak their positions with 'right' alignment but couldn't make that work either.
Key labels
An entry is added to the key for each bar plotted. Because these are repeated within each category, they get duplicated in the key. I made it add them only once by using a conditional, changing from
title columnhead(1)
to
title (c==1 ? columnhead(1) : '')
I only show the key when there is more than one cluster.
Clustering
The 2D plot was clustered. I had difficulty producing a clustered appearance in 3D. If I run the plot on clustered data, the clusters overlay each other (they have the same y values). To overcome this, I used a formula to shift later clusters along the x-axis and add a gap between them. So instead of a simple value for x:
... using (f):(scale_y(c)):2 ...
I have a formula:
... using (f+column(0)*(nfiles+2)):(scale_y(c)):2 ...
where f is the file number (i.e. the bar number), column(0) is the cluster number, nfiles is the number of files (i.e. the number of bars, or cluster size), and 2 is the separator gap.
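To make the positioning formula concrete, here is a small sketch that only prints the x position of each bar for two clusters of three files. The helper name xpos and the value nfiles = 3 are assumptions chosen for illustration; nothing is plotted.

```gnuplot
nfiles = 3                                   # assumed number of files (bars per cluster)
xpos(f, cluster) = f + cluster * (nfiles + 2)  # same formula as in the using spec

do for [cluster = 0:1] {
    do for [f = 1:nfiles] {
        print sprintf("cluster %d, file %d -> x = %d", cluster, f, xpos(f, cluster))
    }
}
```

With these values the first cluster lands on x = 1, 2, 3 and the second on x = 6, 7, 8, so x = 4 and 5 stay empty and form the gap between clusters.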
Incidentally, whilst doing this I discovered that ($0) doesn't work in gnuplot 5.3; you have to use column(0) instead ($0 works in 5.2.4).
I used the Arch Linux AUR package to build which gave me a package gnuplot-git-5.3r20180810.10527-1-x86_64.pkg.tar.xz.
An example plot with one cluster.
An example plot with three clusters and a key legend.
There are probably better ways to do the things I've done here. Being relatively new to gnuplot, I would be interested in any ways to improve upon this solution.
(I can't figure out how to format text in a comment, so I'll provide this as a separate answer)
Matching color: This is more reliably done by providing the color in a separate field of the using spec. From the help text:
splot with boxes requires at least 3 columns of input data. Additional
input columns may be used to provide information such as box width or
fill color.
3 columns: x y z
4 columns: x y z [x_width or color]
5 columns: x y z x_width color
The last column is used as a color only if the splot command specifies a
variable color mode. Examples
splot 'blue_boxes.dat' using 1:2:3 fc "blue"
splot 'rgb_boxes.dat' using 1:2:3:4 fc rgb variable
splot 'category_boxes.dat' using 1:2:3:4:5 lc variable
In the first example all boxes are blue and have the width previously set
by set boxwidth. In the second example the box width is still taken from
set boxwidth because the 4th column is interpreted as a 24-bit RGB color.
The third example command reads box width from column 4 and interprets the
value in column 5 as an integer linetype from which the color is derived.
Half-depth boxes at each end: This was an autoscaling bug (now fixed)

What does 'set logscale' do in gnuplot?

My question is more about math than the actual code.
When I use the command
set logscale
on gnuplot 5.0, what is happening?
It should represent the logarithmic values of the x and y points.
But it does not seem to work properly. For example, my data has x and y values smaller than 1, so I expect to see negative values for these on the plot, but I see only positive values.
What am I doing wrong?
The logarithmic scale still shows the real values around the axes, just their distances are logarithmic. To really see the negative values, you need to really apply the log function:
plot "file.dat" using (log($1)):(log($2)) with lines
without setting the logscale.
A specific example might help to illustrate the effect of logarithmic scaling:
set xrange [0.1:10]
plot x**2
Let's plot this again, but this time on a logarithmic scale. Watch how the scaling of the x and y axes changes:
set logscale
replot
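To see the two behaviours side by side, the following sketch plots the same four points twice: once on logarithmic axes and once with the logarithm applied by hand on linear axes. The inline data block (y = x**2) is made up for illustration, and log10 is used instead of the natural log from the answer above so that the tics are easy to read.

```gnuplot
# Made-up sample data with values below 1 (y = x**2)
$data << EOD
0.01  0.0001
0.1   0.01
1     1
10    100
EOD

set multiplot layout 1,2

# Left: logarithmic axes, tic labels still show the original values
set logscale xy
set title "set logscale"
plot $data using 1:2 with linespoints notitle

# Right: linear axes, the data itself is transformed, so negative values appear
unset logscale
set title "log10 applied by hand"
plot $data using (log10($1)):(log10($2)) with linespoints notitle

unset multiplot
```

Both panels show the same straight line; only the tic labels differ.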

Editing y axis range in Gnuplot

I have a plot with an exponential y axis range. I'm using the multiplot command to place two images in one row. Because of the wide y axis range, I'm losing some space that I could use to show my plots in a better way. Basically, I want something like this
How could I do this? I think I have to do some math operations on the y axis range. Also, what is the most convenient command to insert ( xE-10) at the top left of the plot?
reset
set terminal epslatex size 16cm,18cm color colortext
set output "new.tex"
set key off
set format "$%g$"
set title "sinx"
set ylabel "[kNm]"
plot 1000000*sin(x)
This is not my exact code, but it looks similar. The plot I have presented is part of the multiplot code, and I use 7 input files with time series data of 300 seconds at a time step of 0.02. The point is that I want to edit the y axis range (using some mathematical expressions) and also include the term ( xE-10 ) at the top of the plot, something like this
You can manually add the exponent with a set label .... For instance, the following function takes large values within the given interval:
plot[0:50] exp(x)
We can place the "x 10^21" manually in the desired place after dividing the plotted quantity by it:
set label 1 "{/Symbol \264} 10^{21}" at graph 0,1.025 left
plot[0:50] exp(x)/1e21
You have to be careful with the exact placement of the exponent, since it might lie outside the plotting area; in that case, make room for it by adjusting the top margin with set tmargin .... Also, to use the "times" symbol, you need to pass the enhanced option to your terminal. With the epslatex terminal, you can use LaTeX syntax: $\times 10^{21}$.
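For the epslatex terminal used in the question, the same idea might look like the sketch below. The output file name exp.tex and the tmargin value are assumptions for illustration; the terminal options are copied from the question.

```gnuplot
set terminal epslatex size 16cm,18cm color colortext
set output "exp.tex"          # hypothetical output file name
set tmargin 3                 # assumed value: leaves room above the plot for the label
# Single quotes keep the backslash intact so LaTeX receives \times
set label 1 '$\times 10^{21}$' at graph 0,1.025 left
plot [0:50] exp(x)/1e21 notitle
set output                    # close the output file
```

The label text is typeset by LaTeX when the generated file is included in the document, so no enhanced-text markup is needed here.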

Gnuplot - Plot data on another abscissa by interpolation

Good evening,
I have a problem with Gnuplot. I tried to sum up my problem to make the comprehension easier.
What I have: 2 sets of data. The first is my experimental data, about 20 points; the second is my numerical data, about 300 points. But the two sets don't have the same abscissa.
What I want to have: I want my numerical data to be interpolated onto the experimental x abscissa.
I know it is possible to do that with Xmgrace (paragraph Interpolation at http://plasma-gate.weizmann.ac.il/Xmgr/doc/trans.html#interp), but what about gnuplot?
What I want to have in addition: is it possible, then, to subtract the y-experimental data from my y-numerical data at the x-experimental abscissa points?
Thank you in advance for your answer,
zackalucard
You cannot interpolate the ordinate values of one set to the abscissa values of the other. gnuplot has no mechanism for that.
You can however plot both datasets using one of the smoothing algorithms (check "help smooth") with common abscissa values (which might (be made to) coincide with the original values of one set.)
set table "data1.tmp"
plot dataf1 smooth csplines
set xrange [GPVAL_X_MIN:GPVAL_X_MAX] # fix xrange so both curves are sampled at the same x values
set table "data2.tmp"
plot dataf2 smooth csplines
unset table
Now you have the interpolated data in two temporary files, and only need to combine them into one:
system("paste data1.tmp data2.tmp > correlation.dat") # unixoid "paste" command
# set table writes "x y type" per row, so the pasted columns are x1 y1 type x2 y2 type
plot "correlation.dat" using 2:5
(If you have a sensible fit function for both datasets, the whole thing becomes much easier : plot dataf1 using (fit1($1)):(fit2($1)))
You can use smoothing, this should do the trick
plot "DATA" smooth csplines
(csplines is just one option; there are others, e.g. bezier)
But I don't think you can automatically determine the intersection of the smoothed curves. You can use the mouse to determine the intersection visually, or alternatively fit some functions f(x) and g(x) to your curves and solve f(x)=g(x) analytically.

Inversed logarithmic scale in gnuplot

When plotting data which are very dense in small ranges of y, the logarithmic scale of gnuplot is the right scale to use.
But what scale should I use when the opposite is the case? Let's say most y values of the different curves are between 90 and 100. In this case I want to use something like an inverse logarithmic scale. I tried a log scale between 0 and 1, but gnuplot wants me to choose a scale greater than 1.
How can I achieve this?
You are right, you can use the inverse of the logarithmic function, which is the exponential. But gnuplot only supports logarithmic scales, so you have to do the conversion on your own:
plot "myData" using 1:(exp($2))
You will also have to handle the axis tics on your own. Either you just set them via a list like
set ytics ("0" 1.00, "1" 2.72, "2" 7.39, "3" 20.09, "4" 54.60, "5" 148.41)
or you use the link feature for axes of gnuplot 5. The following code uses the y2axis (right y-axis) to plot the data and calculates the tics on the left y-axis from the right one.
set link y via log(x) inverse exp(x)
plot "myData" using 1:(exp($2)) axes x1y2
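The explicit tic list from the first option can also be generated in a loop instead of being typed out by hand. This is only a sketch: the range 0 to 5 is taken from the list above, and "myData" is the placeholder file name used in this answer.

```gnuplot
# Label each tic with the original value i, but place it at exp(i)
set ytics ("0" exp(0))
set for [i=1:5] ytics add (sprintf("%d", i) exp(i))
plot "myData" using 1:(exp($2))
```

For data between 90 and 100, the loop bounds would need to be adjusted to cover that range.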
Side note: Always remember that a non-linear axis is hard to understand. Logarithmic scales are common, but anything else is not. So be careful when presenting the data.
