Iterate over all datasets AND all columns in gnuplot - gnuplot

I have a datafile with an arbitrary number of datasets, each with an arbitrary number of columns. Every column starts with a header that I would like to use as a title. This is an example datafile, "gp.dat":
a b
2 3
4 9
16 27


c
4
16
64
I would like to generate a plot using gnuplot (gnuplot 5.4 patchlevel 2) that interprets every column in every dataset as an independent line, each labeled with its column header. For the above dataset, this would do the trick:
plot for [d=0:*] for [i=1:2] "gp.dat" index d using i title columnheader with linespoints
Resulting in the following plot:
However, when I try to specify ALL datasets AND ALL columns, the "c" line vanishes:
plot for [d=0:*] for [i=1:*] "gp.dat" index d using i title columnheader with linespoints
This seems to hold for any index I supply for the column number above 2, so this produces the same bad plot:
plot for [d=0:*] for [i=1:3] "gp.dat" index d using i title columnheader with linespoints
How can I specify ALL datasets and ALL columns and guarantee that everything will be plotted?

In the past, I made other strange observations when using * in such "self (de-/non-)terminating loops". I guess gnuplot determines the number of columns from the last block, but is probably not prepared to handle a variable number of columns.
Here is a somewhat awkward but straightforward procedure to plot all blocks and all columns. This example works as long as your column separator is whitespace.
1. Determine the number of blocks using stats (check help stats).
2. Temporarily set the column separator to "\n", so that strcol(1) returns the whole line.
3. Extract the number of columns from the first row of each block using words (check help words) and write it to a datablock $ColMax (check help table).
4. Reset the column separator to whitespace.
5. Use the per-block number of columns in the final plot command.
Maybe there are shorter and smarter solutions.
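As a small illustration of the column-counting step: once the separator is "\n", strcol(1) returns the whole header line, and words() counts its whitespace-separated tokens. The variable name header below is hypothetical and only stands in for what strcol(1) would return.

```gnuplot
# hypothetical header line, as strcol(1) would return it with separator "\n"
header = "d e f"
print words(header)       # number of whitespace-separated tokens, i.e. columns
print word(header, 2)     # the second token
```

For this header line, words() reports 3 columns, which is the value that would end up in $ColMax for that block.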
Script:
### plot all blocks and all columns (variable number of columns in blocks)
reset session
$Data <<EOD
a b
2 3
4 9
16 27


c
4
16
64


d e f
5 6 7
33 44 55
77 88 99
EOD
stats $Data u 0 nooutput
set datafile separator "\n"
set table $ColMax
plot for [b=0:STATS_blocks-1] $Data u (words(strcol(1))) index b every ::::0 w table
unset table
set datafile separator whitespace
set key top center
plot for [b=0:STATS_blocks-1] for [c=1:$ColMax[b+1]] $Data u 0:c index b w lp pt 7 ti columnhead
### end of script
Result:
Addition:
Here is a somewhat shorter solution that neither reads from nor plots to a table/datablock (which works only for gnuplot>5.0).
The following should also work for later versions of 4.x if you read the data from a file.
Script:
### plot all blocks and all columns (variable number of columns in blocks)
reset
FILE = 'myFile.dat'
set datafile separator "\n" # or any character which is not in the data
B = 0
Cols = ''
stats FILE u (column(-2)==B ? (B=B+1, Cols=Cols.' '.words(strcol(1))):0) every ::1::1 nooutput
set datafile separator whitespace
set key top center
plot for [b=0:B-1] for [c=1:word(Cols,b+1)] FILE u 0:c index b w lp pt 7 ti columnhead
### end of script

Related

How to plot two columns with different index in Gnuplot?

I have a data set that has two groups of data each one with 3 columns. I know that I use plot "dataset.dat" i 0 u 1:2 to plot the second column versus the first column in the first set of data (index starts with zero), or plot "dataset.dat" i 1 u 2:3 to plot the third column versus the second column in the second set of data. But what if I want to plot the second column of index 1 versus the second column of index 0?, is that possible? or do I have to put them contiguously in the same index. I have search in the documentation but isn't mentioned there. Thanks for your help.
This is basically a data (re-)arrangement challenge. You could rearrange your data with whatever external tool, but in principle you can also do it somehow with gnuplot.
One possible solution would be to place your y-values (from index 1) in a separate datablock (here: $myY) and, in the final plot command, address it by datablock line index, which starts from 1 and requires an integer; that's why it is $myY[int($0)+1]. Furthermore, you need to convert it into a (floating point) number via real(), check help real. The assumption is that the subblocks have the same length.
Code:
### plot x and y from different indices
reset session
$Data <<EOD
11 12 13
21 22 23
31 32 33


111 112 113
121 122 123
131 132 133
EOD
set table $myY
plot $Data u 2 index 1 w table
unset table
unset key
plot $Data u 2:(real($myY[int($0)+1])) index 0 w lp pt 7
### end of code
Result:
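A minimal sketch of the datablock addressing used above, assuming a gnuplot version that supports datablock indexing (as the script itself does). The datablock contents here are typed in by hand for illustration rather than produced by set table.

```gnuplot
# hand-written stand-in for the datablock that "set table $myY" would create
$myY <<EOD
112
122
132
EOD
print |$myY|              # number of lines in the datablock
print $myY[2]             # second line, returned as a string
print real($myY[2]) + 1   # converted to a number before doing arithmetic
```

Because each datablock line is a string, the conversion via real() is what makes it usable as a y-value in the using expression.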

Subtract smoothed data from original

I wonder whether there is a way to subtract smoothed data from original ones when doing things of the kind:
plot ["17.12.2020 08:00:00":"18.12.2020 20:00:00"] 'data3-17-28.csv1' using 4:5 title 'Sensor 3' with lines, \
'' using 4:5 smooth acsplines
Alternatively I would need to do it externally, of course.
As @Suntory already suggested, you can plot smoothed data into a table.
However, keep in mind that the number of datapoints is determined by set samples (the default is 100) and that the smoothed datapoints will be equidistant. So, if you set samples to the number of your datapoints and your data is equidistant as well, all should be fine.
Concatenating data line by line is not straightforward in gnuplot, since gnuplot is not intended to do such operations.
The following gnuplot-only solution assumes that you have your data in a datablock $Data without headers and empty lines. If not, you could either plot it with table from file into a table named $Data or use the following approach in the accepted answer of this question: gnuplot: load datafile 1:1 into datablock
If you don't have equidistant data, you need to interpolate data, which is also not straightforward in gnuplot, see: Resampling data with gnuplot
It's up to you: either you use external tools (which might not be platform-independent) or you apply a somewhat cumbersome platform independent gnuplot-only solution.
Code:
### plot difference of data to smoothed data
reset session
$Data <<EOD
1 0
2 13
3 16
4 17
5 11
6 8
7 0
EOD
stats $Data u 0 nooutput # get number of rows or datapoints
set samples STATS_records
set table $Smoothed
plot $Data u 1:2 smooth acsplines
unset table
# put both datablock into one
set print $Difference
do for [i=1:|$Data|] {
print sprintf('%s %s',$Data[i],$Smoothed[i+4])
}
set print
plot $Data u 1:2 w lp pt 7, \
$Smoothed u 1:2 w lp pt 6, \
$Difference u 1:($2-$4) w lp pt 4 lc "red"
### end of code
Result:
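The offset in $Smoothed[i+4] skips the lines that set table writes before the actual numbers. How many such lines there are may depend on the gnuplot version (this is an assumption worth checking, not a documented guarantee), so a quick way to verify the offset is to print the first lines of the datablock together with their index. The sketch below reuses the sample data from the script above.

```gnuplot
# same sample data as in the script above
$Data <<EOD
1 0
2 13
3 16
4 17
5 11
6 8
7 0
EOD
set samples 7
set table $Smoothed
plot $Data u 1:2 smooth acsplines
unset table
# print the first lines with their line index to see where the numbers begin
do for [i=1:6] { print sprintf("%d: %s", i, $Smoothed[i]) }
```

If the first numeric line does not appear at index 5, the offset in the print loop of the main script needs to be adjusted accordingly.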
If I understand correctly, you would like something like this:
First, write the smoothed data to the file out.csv:
set table "out.csv" separator comma
plot 'file' u 4:5 smooth acsplines
unset table
The plot command further down pastes out.csv onto file as appended columns. You may need to delete the first lines of out.csv using sed (sed '1,4d' out.csv).
stats 'file' matrix
Thanks to stats, we automatically get the number of columns in your original data (STATS_size_x).
plot "< paste -d' ' file out.csv" u 4:($5-column(STATS_size_x+2)) w l
Could you please try this small code on your data.

How to remove line between "jumping" values, in gnuplot?

I would like to draw a line through data that contains "jumping" values.
Here is an example: when we plot several cycles of sin(x), an unrealistic line appears that runs across the plot from right to left (as shown in the following figure).
One idea to avoid this might be using with linespoints (link), but I want to draw it without revising the original data file.
Do we have simple and robust solution for this problem?
Assuming that you are plotting a function, that is, for each x value there exists one and only one corresponding y value, the easiest way to achieve what you want is to use the smooth unique option. This smoothing routine will make the data monotonic in x, then plot it. When several y values exist for the same x value, the average will be used.
Example:
Data file:
0.5 0.5
1.0 1.5
1.5 0.5
0.5 0.5
Plotting without smoothing:
set xrange [0:2]
set yrange [0:2]
plot "data" w l
With smoothing:
plot "data" smooth unique
Edit: points are lost if this solution is used, so I suggest to improve my answer.
Here, "conditional plotting" can be applied. Suppose we have a file like this:
1 2
2 5
3 3
1 2
2 5
3 3
i.e. the line jumps back between the 3rd and 4th points.
plot "tmp.dat" u 1:2
Find minimum x value:
stats "tmp.dat" u 1:2
prev=STATS_min_x
Or find first x value:
prev=system("awk 'FNR == 1 {print $1}' tmp.dat")
Plot the line if current x value is greater than previous, or don't plot if it's less:
plot "tmp.dat" u (x0=prev, prev=$1, ($0==0 || $1>x0) ? $1 : 1/0):2 w l
OK, it's not impossible, but the following is a ghastly hack. I really advise you add an empty line in your dataset at the breaks.
$dat << EOD
1 1
2 2
3 3
1 5
2 6
3 7
1 8
2 9
3 10
EOD
plot for [i=0:3] $dat us \
($0==0?j=0:j=j,llx=lx,lx=$1,llx>lx?j=j+1:j=j,i==j?$1:NaN):2 w lp notit
This plots your dataset three times (actually four; there is a small error in there, I guess I have to initialise all variables), counts how often the abscissa values "jump", and only plots datapoints if this counter j is equal to the plot counter i.
Check the help on the serial evaluation operator "a, b" and the ternary operator "a?b:c"
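As a minimal sketch of these two operators outside of a plot command (the variable names a, b and s are only for illustration):

```gnuplot
a = 1
# serial evaluation: the expressions are evaluated left to right,
# and the value of the last one is the value of the whole parenthesis
b = (a = a + 1, a * 10)
print a, b
# ternary operator: condition ? value_if_true : value_if_false
s = (a > 1) ? "greater" : "not greater"
print s
```

In the using expression above, the same mechanism updates the bookkeeping variables first and lets the final ternary decide whether the point is plotted or replaced by NaN.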
If you have data in a repetitive x-range where the corresponding y-values do not change, then @Miguel's smooth unique solution is certainly the easiest.
In a more general case, what if the x-range is repetitive but y-values are changing, e.g. like a noisy sin(x)?
Then compare two consecutive x-values x0 and x1, if x0>x1 then you have a "jump" and make the linecolor fully transparent, i.e. invisible, e.g. 0xff123456 (scheme 0xaarrggbb, check help colorspec). The same "trick" can be used when you want to interrupt a dataline which has a certain forward "jump" (see https://stackoverflow.com/a/72535613/7295599).
Minimal solution:
plot x1=NaN $Data u 1:2:(x0=x1,x1=$1,x0>x1?0xff123456:0x0000ff) w l lc rgb var
Script:
### plot "folded" data without connecting lines
reset session
# create some test data
set table $Data
plot [0:2*pi] for [i=1:4] '+' u 1:(sin(x)+rand(0)*0.5) w table
unset table
set xrange[0:2*pi]
set key noautotitle
set multiplot layout 1,2
plot $Data u 1:2 w l lc "red" ti "data as is"
plot x1=NaN $Data u 1:2:(x0=x1,x1=$1,x0>x1?0xff123456:0x0000ff) \
w l lc rgb var ti "\n\n\"Jumps\" removed\nwithout changing\ninput data"
unset multiplot
### end of script
Result:

Gnuplot Stacked Filledcurves from various indexes

I have a file where my data are separated into several indexes. I would like to plot some or all of the indexes as stacked filledcurves by adding the values of selected previous indexes to the values of the current index. I could not find a way to use the sum function as in the case of data arranged as columns in a single index (as in this question), even using the pseudocolumn(-2) as the index number.
Important note: every index has strictly identical sets of x values, only the y values differ.
Is there a way to do something like
p 'data.dat' index (sum(ind=1,3,4,5) ind) u 1:2 w filledcurve x1 t 'Sum(1,3,4,5)', '' index (sum(ind=1,2,5) ind) u 1:2 w filledcurve x1 t 'Sum(1,2,5)'
within gnuplot or do I have to resort to a script (maybe a variation of the one in this answer)?
You can do this with some help outside gnuplot (invoked within gnuplot). Imagine you have the following data file with 4 indices (0 to 3):
1 2
2 3


1 5
2 5


1 0
2 3


1 4
2 3
Now say that we want to sum indices 1 and 2, and indices 0 and 3. The first sum should return:
1 5
2 8
while the second sum should return
1 6
2 6
We can select the blocks we want using set table:
set table "sum1"
plot for [i in "1 2"] "data3" index 0+i pt 7 not
set table "sum2"
plot for [i in "0 3"] "data3" index 0+i pt 7 not
unset table
Now use sed piping to remove the empty lines and smooth freq to sum for equal x values:
plot "< sed '/^\s*$/d' sum1" smooth freq t "sum1", \
"< sed '/^\s*$/d' sum2" smooth freq t "sum2"
Although you may be able to do it using functions and variables of gnuplot 4.4+, this won't be very efficient as you want to perform an operation on several distant lines in your file, which is in fact an operation on arrays. Gnuplot is not meant for this, the datafiles should have a structure reasonably close to what you want to plot. I advise that you try to produce a file with such a structure, e.g. have the values you want to sum on the same line in different columns.
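A minimal sketch of what such a restructured file could look like, using the same numbers as the four-index example above: one x column followed by the y-values of indices 0 to 3 in columns 2 to 5. The datablock name $Wide is hypothetical.

```gnuplot
# x, then y of index 0, 1, 2, 3 side by side (same values as the example above)
$Wide <<EOD
1 2 5 0 4
2 3 5 3 3
EOD
# sums are now plain column arithmetic in the using specification
plot $Wide u 1:($3+$4) w filledcurves x1 t 'Sum(1,2)', \
     $Wide u 1:($2+$5) w filledcurves x1 t 'Sum(0,3)'
```

With this layout no external tools or temporary files are needed, since every value that enters a sum sits on the same line.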

Fill Key Across Rows Instead of Down Columns

I currently position my key with:
set key outside bottom horizontal
I have 6 items in the key, coming from two plot for... commands (3 each). The size of my plot results in a key that is 2 rows x 3 columns. I'd like to fill my key across rows instead of down columns so that the nth item from each plot command is aligned vertically:
Current:
135
246
Desired:
123
456
I can't find any options for this in help set key. Is this possible without changing my plotting commands?
In the multiplot environment for placing multiple graphs you have the option rowsfirst and columnsfirst (check help multiplot).
For the key entries, however, you have the options vertical and horizontal (check help key). Like you, I would have expected that horizontal together with the option maxcols 3 would do what you're requesting. But it doesn't. Hence, you need a workaround.
You don't specify where you get your key entries from: from the columnheader or from some string created with some index? So, in the example below, I assumed the key is created from your two loop indices.
Script: (works with gnuplot>=4.6.0, March 2012)
### change order in legend/key
reset
M = 2 # rows
N = 3 # columns
myTitles = ''
set key noautotitle maxrows M
set offsets 0,0,0.5,0
plot for [i=1:3] myTitles=myTitles.sprintf(" Data%d",i) '+' u 1:(cos(i*0.1*$1)) w l lw 2 lc i, \
for [i=4:6] myTitles=myTitles.sprintf(" Data%d",i) '+' u 1:(cos(i*0.1*$1)) w l lw 2 lc i, \
for [i=1:words(myTitles)] c=((i-1)%M*N+1 + (i-1)/M) '+' u 1:(NaN) w l lw 2 lc c ti word(myTitles,c)
### end of script
Result: (created with gnuplot 4.6.0)
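To see what the index expression in the last plot element does, the following sketch prints the mapping from key slot i to title index c for M=2 rows and N=3 columns. It only reuses the formula from the script above; the printed text is illustrative.

```gnuplot
M = 2   # rows
N = 3   # columns
do for [i=1:M*N] {
    # same formula as in the plot command above (integer arithmetic)
    c = (i-1)%M*N+1 + (i-1)/M
    print sprintf("key slot %d shows entry %d", i, c)
}
```

The slots 1 to 6 receive the entries 1, 4, 2, 5, 3, 6. Since the key is filled column by column, the first column holds 1 and 4, the second 2 and 5, the third 3 and 6, which reads row-wise as 123 / 456.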
