Multiple plots with gnuplot by grouping columns - linux

I have a data file with the schema "object parameter output1 output2 ... outputk". For example:
A 0.1 0.2 0.43 0.81 0.60
A 0.2 0.1 0.42 0.83 0.62
A 0.3 0.5 0.48 0.84 0.65
B 0.1 0.1 0.42 0.83 0.62
B 0.2 0.1 0.82 0.93 0.61
B 0.3 0.5 0.48 0.34 0.15
I want to create multiple plots, one per object, with the parameter on the x axis and the outputs as series. Currently, I have a Python script that dumps the rows for each object into separate files and then calls gnuplot. Is there a more elegant way to plot this?

You are looking for this:
plot 'data.txt' using (strcol(1) eq "A" ? $2 : 1/0):4 with line
This plots column 4 against column 2 using only the rows for object A.
To create a plot for every object, use:
do for [object in "A B"] {
    set title sprintf("Object %s", object)
    plot 'data.txt' using (strcol(1) eq object ? $2 : 1/0):4 notitle with line
    pause -1
}
Press Enter to advance to the next plot.
Of course you can export these plots in files, too.
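The loop above hardcodes the object list "A B". As a rough sketch of how that list could be derived from the data file itself, the following Python groups rows by the first column. The name group_by_object and the inline sample are illustrative assumptions, not part of the original answer.

```python
from collections import defaultdict

def group_by_object(lines):
    """Map each object name to a list of (parameter, outputs) rows."""
    groups = defaultdict(list)
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        name, param, *outputs = fields
        groups[name].append((float(param), [float(v) for v in outputs]))
    return dict(groups)  # plain dict, keys in order of first appearance

# Sample rows copied from the question.
sample = """\
A 0.1 0.2 0.43 0.81 0.60
A 0.2 0.1 0.42 0.83 0.62
A 0.3 0.5 0.48 0.84 0.65
B 0.1 0.1 0.42 0.83 0.62
B 0.2 0.1 0.82 0.93 0.61
B 0.3 0.5 0.48 0.34 0.15
"""
groups = group_by_object(sample.splitlines())
object_list = " ".join(groups)  # string usable in: do for [object in "..."]
print(object_list)
```

The joined keys can be pasted into the do for [object in "..."] loop instead of typing the names by hand.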


Gnuplot - How to join smoothly ordered points?

I've a set of data in three columns:
1st column: order criterion between 0 and 1
2nd: x vals
3rd: y vals
As a data file example:
0.027 -29.3 -29.6
0.071 -26.0 -31.0
0.202 -14.0 -32.8
0.304 -3.4 -29.3
0.329 -0.5 -26.0
0.409 6.7 -14.0
0.458 11.7 -3.4
0.471 12.8 -0.5
0.495 12.5 6.7
0.588 18.8 11.7
0.600 20.4 12.8
0.618 20.8 12.5
0.674 20.9 18.8
0.754 22.1 20.4
0.810 27.0 20.8
0.874 24.7 20.9
0.892 9.4 22.1
0.911 -11.5 27.0
0.943 -23.7 24.7
0.962 -29.6 9.4
0.991 -31.0 -11.5
0.999 -32.8 -23.7
My goal is to plot the (x,y) points together with a trend curve that passes through each point in ascending order of the first-column values.
I use the following script:
set terminal png small size 600,450
set output "my_data_mcsplines_joined_points.png"
set table "table_interpolation.dat"
plot 'my_data.dat' using 2:3 smooth mcsplines
unset table
plot 'my_data.dat' using 2:3:(sprintf("%'.3f", $1)) with labels point pt 7 offset char 1,1 notitle ,\
"table_interpolation.dat" with lines notitle
Here is the mcsplines result as an example:
[Figure: points joined with smooth mcsplines]
The resulting curve should have the shape of a spindle or a loop.
Whichever smooth option I use, gnuplot does not seem able to do this.
Unfortunately, most of the smooth options (mcsplines, csplines, ...) sort the data monotonically in x.
How can I plot a trend curve that passes through each point in ascending order of the first-column values?
I cannot post an image in a comment, so I am placing it here. Based on this 3D scatterplot of the data in your question, I don't think a 2D plot will be sufficient.
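If a 2D curve that follows the first-column order is still wanted, one possible reading of the question is a parametric curve: treat the first column as a parameter t and interpolate x(t) and y(t) separately, so that points are joined in order of t rather than sorted by x. The following is only a sketch of that idea in Python, using piecewise linear interpolation instead of a spline and assuming distinct t values. interp_parametric is a hypothetical helper, not a gnuplot feature.

```python
from bisect import bisect_right

def interp_parametric(rows, t):
    """Linearly interpolate (x, y) at parameter t from rows of (t, x, y).

    Assumes at least two rows and distinct t values.
    """
    rows = sorted(rows)  # order by the first column
    ts = [r[0] for r in rows]
    if not ts[0] <= t <= ts[-1]:
        raise ValueError("t outside the data range")
    # Index of the right-hand neighbour, clamped so t == ts[-1] still works.
    i = min(bisect_right(ts, t), len(rows) - 1)
    t0, x0, y0 = rows[i - 1]
    t1, x1, y1 = rows[i]
    w = (t - t0) / (t1 - t0)
    return x0 + w * (x1 - x0), y0 + w * (y1 - y0)

# First rows of the data file from the question: (order, x, y).
rows = [
    (0.027, -29.3, -29.6),
    (0.071, -26.0, -31.0),
    (0.202, -14.0, -32.8),
    (0.304, -3.4, -29.3),
]
print(interp_parametric(rows, 0.049))
```

Sampling t on a fine grid and writing the resulting (x, y) pairs to a file would give a table that can be plotted with lines, in the same way as table_interpolation.dat in the script above.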

Display changing column value in Gnuplot animation

I am making a gnuplot animation of a satellite going around a planet. My task is to display its XY trajectory and the associated values of velocity and energy versus time. I know how to plot the path, but I've been having problems displaying the velocity and the other values.
The code below does the following:
satellite track and time steps -- column 3:4;
satellite position -- column 3:4;
planet position -- column 6:7.
do for [n=0:int(STATS_records)] {
plot "sat.dat" u 3:4 every ::0::n w lp ls 2 t sprintf("steps=%i", n), \
"sat.dat" u 3:4 every ::n::n w lp ls 4 notitle, \
"sat.dat" u 6:7 every ::0::n w lp ls 3 notitle , \
How do I display the associated velocity value in each sprintf? The velocity values are in column 5. Thank you everyone in advance.
It seems that you want to put everything in the "key" (legend), but another option is to use labels, which can easily be placed anywhere. There are two kinds: labels you place one at a time (with set label), and the with labels plotting style, which draws labels read from the data. Don't confuse them.
Your main issue seems to be how to pull out the velocity value from column 5. My first instinct (which is quite hacky) is to use some external program, like awk:
v = system(sprintf("awk 'NR==%d{print $5}' '%s'", n+1, infile))
set label 1 sprintf("v=%.3f", v+0) at screen 0.2,0.9
This is also an example of a label (tagged 1). The screen keyword means the coordinates are screen-relative rather than graph-relative. Putting this inside your for loop reassigns label 1 on every iteration, so it overwrites the label from the previous iteration. Omitting the tag 1 would create a new label on top of the last one each time, which gets messy.
Using an external command line like this isn't very portable. (I don't think it would work on Windows.) I saw this question that shows how to pull a value from a specific row and column of a file. The problem I had with using this is that stats implicitly filters according to whatever xrange is set. When making animations like this, I've noticed that the camera can jump around too much from autoranging, so it's nice to have tight control over the plotting range. Defining an xrange at the top of the file interfered with a subsequent stats command to read a velocity value.
You can, however, specify a range for stats (before the file name, such as stats [*:*] infile). But I had issues using this in combination with a predefined xrange based on position. I found that it did work if I specify the desired plotting range on the plot line instead of with set xrange. Here is another (full script) version using only gnuplot:
set terminal pngcairo
infile = 'anim.dat'
stats infile using 3:4 name 'data' nooutput
set key font 'Courier'
do for [n=0:data_records-1] {
set output sprintf('frame-%03d.png', n)
stats [*:*] infile every ::n::n using 5 name 'velocity' nooutput
plot [data_min_x:1.1*data_max_x][data_min_y:1.1*data_max_y] \
infile u 3:4 every ::0::n w linespoints ls 2 t \
sprintf("steps =%6d\nvelocity =%6.3f", n, velocity_min), \
'' u 3:4 every ::n::n w points pt 7 ps 3 notitle
}
Notice that you could easily change this to a set label if you want. Another option is to plot
'' u (x):(y):5 every ::n::n w labels
to place a label at graph position (x,y).
I don't have your data, but I made my own file with what I hope is a similar format to yours:
0 0.0 0.0 0.0 1.11803398875 0.625
1 0.05 0.05 0.02375 1.09658560997 0.625
2 0.1 0.1 0.045 1.07703296143 0.625
3 0.15 0.15 0.06375 1.05948100502 0.625
4 0.2 0.2 0.08 1.04403065089 0.625
5 0.25 0.25 0.09375 1.0307764064 0.625
6 0.3 0.3 0.105 1.01980390272 0.625
7 0.35 0.35 0.11375 1.01118742081 0.625
8 0.4 0.4 0.12 1.00498756211 0.625
9 0.45 0.45 0.12375 1.00124921973 0.625
10 0.5 0.5 0.125 1.0 0.625
11 0.55 0.55 0.12375 1.00124921973 0.625
12 0.6 0.6 0.12 1.00498756211 0.625
13 0.65 0.65 0.11375 1.01118742081 0.625
14 0.7 0.7 0.105 1.01980390272 0.625
15 0.75 0.75 0.09375 1.0307764064 0.625
16 0.8 0.8 0.08 1.04403065089 0.625
17 0.85 0.85 0.06375 1.05948100502 0.625
18 0.9 0.9 0.045 1.07703296143 0.625
19 0.95 0.95 0.02375 1.09658560997 0.625

Excel negative correlation of trendline

I keep getting a negative R² when I add a trendline in Excel, as shown in the figure below.
Should I care about this negative sign?
Here is the data:
x y
0.059 0.13
0.095 0.05
0.097 0.02
0.12 0.2
0.146 0.05
0.192 0.11
0.231 0.16
0.25 0.16
0.28 0.09
0.33 0.05
0.36 0.18
0.37 0.24
0.47 0.14
0.76 0.11
1.2 0.07
1.86 0.12
So, a negative R² is possible because of how that value is computed (it is not purely the square of a number). For a properly defined model, R² (the coefficient of determination) lies between 0 and 1 and is interpreted as the fraction of the variability in your data that the model explains.
The interpretation of a negative value is that your trend line is a worse fit than a horizontal line. This answer provides a much more thorough explanation.
When you added your trendline, you selected the Set Intercept option with intercept = 0.05. In this case Excel returns an R^2 that doesn't have its customary meaning and can be negative (see here).
To fix the problem, unselect the Set Intercept option.
When I run the trendline with the option unselected I get
y = -0.0017x + 0.1182 (R^2 = 0.0002)
Hope that helps.
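To see how a forced intercept can push R² below zero, the definition R² = 1 - SS_res/SS_tot can be evaluated directly on the data from the question. This is a sketch under the assumption that this definition is the one being reported. Excel's exact formula for the Set Intercept case is not confirmed here, and the function names are illustrative.

```python
def r_squared(ys, preds):
    """R^2 = 1 - SS_res / SS_tot; negative when the model is worse than the mean."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

def fit_free(xs, ys):
    """Ordinary least squares with a free intercept; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

def fit_fixed_intercept(xs, ys, b):
    """Least-squares slope when the intercept is forced to b."""
    return sum(x * (y - b) for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# Data copied from the question.
xs = [0.059, 0.095, 0.097, 0.12, 0.146, 0.192, 0.231, 0.25,
      0.28, 0.33, 0.36, 0.37, 0.47, 0.76, 1.2, 1.86]
ys = [0.13, 0.05, 0.02, 0.2, 0.05, 0.11, 0.16, 0.16,
      0.09, 0.05, 0.18, 0.24, 0.14, 0.11, 0.07, 0.12]

slope, intercept = fit_free(xs, ys)
r2_free = r_squared(ys, [slope * x + intercept for x in xs])

m = fit_fixed_intercept(xs, ys, 0.05)
r2_fixed = r_squared(ys, [m * x + 0.05 for x in xs])

print(slope, intercept, r2_free)  # free intercept: R^2 stays within [0, 1]
print(m, r2_fixed)                # intercept forced to 0.05: R^2 is negative
```

With a free intercept the fitted line can never do worse than the horizontal line through the mean, which is why R² stays non-negative in that case.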

Gnuplot: Find y value for given x value

I have a set of data points that I plot with gnuplot. Now I calculate a value for x and want to find the corresponding y value within the data.
Does anybody know how to manage this with gnuplot?
I found kind of a solution here on StackOverflow
Basically, you have to invert the function you use to calculate y (or x in the case of the link), and then the sprintf function gives you the corresponding value.
Have a look also here if you did not find the solution at the first link!
Although this is a rather old question, the following solution for interpolation might still be of interest to others.
The code can be simplified depending on whether you are just interested in nearest datapoint or in the interpolated value or whether you want to plot the value as point and/or label.
### interpolate between datapoints
reset session
$Data <<EOD
0 0
0.30 0.20
0.60 0.40
0.80 0.80
1 1
EOD
$Data2 <<EOD
1 0
0.80 0.15
0.50 0.30
0.50 0.50
0.30 0.80
0 1
EOD
InterpolY(x,x0,x1,y0,y1) = (x-x0)*(x-x1)<0 || x1-x0==0 ? x1-x0==0 ? (y0+y1)/2 : (y1-y0)/(x1-x0)*(x-x0) +y0 : NaN
set key noautotitle
array Point[1] # dummy array for plotting a single datapoint
# x-value for interpolation is defined in xp
plot x1=y1=yp=(xp=0.7,NaN) $Data u (x0=x1,x1=$1):(y0=y1,y1=$2, yp==yp ? NaN : yp=InterpolY(xp,x0,x1,y0,y1),y1) w lp pt 7, \
Point u (xp):(yp) w p pt 7 lc "red", \
Point u (xp):(yp):(sprintf("(%.2g|%.2g)",xp,yp)) w labels right offset -1,0 lc "red", \
x1=y1=yp=(xp=0.5,NaN) $Data2 u (x0=x1,x1=$1):(y0=y1,y1=$2, yp==yp ? NaN : yp=InterpolY(xp,x0,x1,y0,y1),y1) w lp pt 7, \
Point u (xp):(yp) w p pt 7 lc "green", \
Point u (xp):(yp):(sprintf("(%.2g|%.2g)",xp,yp)) w labels right offset -1,0 lc "green"
### end of code
Result: (interpolated values at x=0.7 and x=0.5)
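For readers who find the inline gnuplot expression hard to follow, here is a minimal Python sketch of the same segment-by-segment idea. It walks consecutive points, takes the first segment that strictly brackets xp, and applies the linear formula from InterpolY. One deliberate difference, stated as an assumption: a vertical segment is used only when xp equals its x value. interpolate_y is an illustrative name.

```python
def interpolate_y(points, xp):
    """Return y at xp from the first segment that brackets xp, or None."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 == x1:
            # Vertical segment: take the midpoint, as the gnuplot code does.
            if xp == x0:
                return (y0 + y1) / 2
            continue
        # Strict test, mirroring (x-x0)*(x-x1)<0 in InterpolY.
        if (xp - x0) * (xp - x1) < 0:
            return y0 + (y1 - y0) / (x1 - x0) * (xp - x0)
    return None

# The two datablocks from the gnuplot script.
data = [(0, 0), (0.30, 0.20), (0.60, 0.40), (0.80, 0.80), (1, 1)]
data2 = [(1, 0), (0.80, 0.15), (0.50, 0.30), (0.50, 0.50), (0.30, 0.80), (0, 1)]

print(interpolate_y(data, 0.7))
print(interpolate_y(data2, 0.5))
```

Because the bracketing test is strict, an xp that falls exactly on a data point of a non-vertical segment is not matched. An explicit equality check could be added if that case matters.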

Histogram with numeric x-axis in gnuplot?

I'm having this file as data.dat:
Xstep Y1 Y2 Y3 Y4
332 1.22 0.00 0.00 1.43
336 5.95 12.03 6.11 10.41
340 81.05 81.82 81.92 81.05
394 11.76 6.16 10.46 5.87
398 0.00 0.00 1.51 1.25
1036 0.03 0.00 0.00 0.00
I can plot this data as histogram with this script, hist-v1.gplot (using set style data histogram):
set xlabel "X values"
set ylabel "Occurrence"
set style data histogram
set style histogram cluster gap 1
set style fill solid border -1
set term png
set output 'hist-v1.png'
set boxwidth 0.9
# attempt to set xtics so they are positioned numerically on x axis:
set xtics ("332" 332, "336" 336, "340" 340, "394" 394, "398" 398, "1036" 1036)
# ti col reads the first entry of the column, uses it as title name
plot 'data.dat' using 2:xtic(1) ti col, '' u 3 ti col, '' u 4 ti col, '' u 5 ti col
And by calling:
gnuplot hist-v1.gplot && eog hist-v1.png
this image is generated:
However, you can notice that the X axis is not scaled numerically - it understands the X values as categories (i.e. it is a category axis).
I can get a more numerical X axis with the following script, hist-v2.gplot (using with boxes):
set xlabel "X values"
set ylabel "Occurrence"
# in this case, histogram commands have no effect
set style data histogram
set style histogram cluster gap 1
set style fill solid border -1
set term png
set output 'hist-v2.png'
set boxwidth 0.9
set xr [330:400]
# here, setting xtics makes them positioned numerically on x axis:
set xtics ("332" 332, "336" 336, "340" 340, "394" 394, "398" 398, "1036" 1036)
# 1:2 will ONLY work with proper xr; since we have x>300; xr[0:10] generates "points y value undefined"!
plot 'data.dat' using 1:2 ti col smooth frequency with boxes, '' u 1:3 ti col smooth frequency with boxes
And by calling:
gnuplot hist-v2.gplot && eog hist-v2.png
this image is generated:
image hist-v2.png
Unfortunately, the bars 'overlap' here, so it is hard to read the graph.
Is there a way to keep the numerically scaled X axis as in hist-v2.png, but keep the 'bars' side by side as in hist-v1.png? This thread, "Re: Histogram with x axis date error", says you cannot:
But it will be hard to pull the x-coordinate date out of the data file, ...
but then, it refers to a different problem...
OK, after reading the gnuplot help for a bit, it seems that the histogram style will always interpret the x axis as sequential entries/categories, so indeed there seems to be no way to get a numerical axis with the histogram style.
However, it turns out that $1 inside a using expression refers to the value of column 1, and such expressions can be used to shift ('reposition') the bars in the second (frequency with boxes) example; so with this code as hist-v2b.gplot:
set xlabel "X values"
set ylabel "Occurrence"
set style fill solid border -1
set term png
set output 'hist-v2b.png'
set boxwidth 0.9
set xr [330:400]
# here, setting xtics makes them positioned numerically on x axis:
set xtics ("332" 332, "336" 336, "340" 340, "394" 394, "398" 398, "1036" 1036)
# 1:2 will ONLY work with proper xr; since we have x>300; xr[0:10] generates "points y value undefined"!
plot 'data.dat' using ($1-0.5):2 ti col smooth frequency with boxes, '' u ($1-0.25):3 ti col smooth frequency with boxes, '' u ($1+0.25):4 ti col smooth frequency with boxes, '' u ($1+0.5):5 ti col smooth frequency with boxes
And by calling:
gnuplot hist-v2b.gplot && eog hist-v2b.png
this image is generated:
image hist-v2b.png
... which is pretty much what I wanted in the first place.
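The offsets -0.5, -0.25, +0.25 and +0.5 above were picked by hand. As a hedged sketch, evenly spaced offsets and a matching box width can be computed from the number of series and a chosen group width (an assumed parameter, in x-axis units). cluster_offsets and using_specs are hypothetical helpers, not gnuplot functions.

```python
def cluster_offsets(n_series, group_width):
    """Return (offsets, box_width) for n_series boxes centred on each x value."""
    box_width = group_width / n_series
    offsets = [(i - (n_series - 1) / 2) * box_width for i in range(n_series)]
    return offsets, box_width

def using_specs(n_series, group_width, first_column=2):
    """Build gnuplot 'using' strings such as ($1-0.375):2 (illustrative helper)."""
    offsets, _ = cluster_offsets(n_series, group_width)
    return ["($1%+.3f):%d" % (off, first_column + i)
            for i, off in enumerate(offsets)]

offsets, box_width = cluster_offsets(4, 1.0)
specs = using_specs(4, 1.0)
print(box_width)
print(specs)
```

With four series and a group width of 1, a box width of 0.25 makes neighbouring boxes touch without overlapping, whereas the hand-picked offsets combined with set boxwidth 0.9 may let adjacent boxes overlap.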
Just a small note - I originally wanted to use the script with inline data; for a setup like this, it would have to be written as
plot '-' using ($1-0.5):2 ti col smooth frequency with boxes, '-' u ($1-0.25):3 ti col smooth frequency with boxes
Xstep Y1 Y2 Y3 Y4
332 1.22 0.00 0.00 1.43
336 5.95 12.03 6.11 10.41
340 81.05 81.82 81.92 81.05
394 11.76 6.16 10.46 5.87
398 0.00 0.00 1.51 1.25
1036 0.03 0.00 0.00 0.00
e
Xstep Y1 Y2 Y3 Y4
332 1.22 0.00 0.00 1.43
336 5.95 12.03 6.11 10.41
340 81.05 81.82 81.92 81.05
394 11.76 6.16 10.46 5.87
398 0.00 0.00 1.51 1.25
1036 0.03 0.00 0.00 0.00
e
... that is, the data would have to be entered multiple times, as it comes in from stdin - this problem is discussed in gnuplot - do multiple plots from data file with built-in commands.
PS: As there is quite a bit of space on the diagram, it would be nice if we could somehow specify separate x-axis ranges; that is discussed in:
Gnuplot tricks
Gnuplot tricks: Broken axis revisited
Setting the box width properly is very important when you plot a histogram using the "boxes" plot style. I have discussed this in one of my blog articles.
