gnuplot missing data with expression evaluation - gnuplot

I want to use the plot command in gnuplot with expression evaluation, i.e.
plot "-" using ($1):($2) with lines
1 10
2 20
3 ?
4 40
5 50
e
But I want it to ignore the missing data "?" in such a way that it connects the line (and doesn't break it between 2 and 4).
I tried set datafile missing "?", but, as the online help says, it does not connect the lines. The following would, but then I cannot use expression evaluation:
plot "-" using 1:2 with lines
1 10
2 20
3 ?
4 40
5 50
e
Any ideas how to connect the lines and use expression evaluation?

Two column data
If you set up a data file Data.csv
1 10
2 20
3 ?
4 40
5 50
you can plot your data with connected lines using
plot '<grep -v "?" Data.csv' u ($1):($2) w lp
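To see what gnuplot actually receives from that pipe, the filter can be run on its own in a shell. The sketch below recreates the sample rows in a temporary file (the mktemp path is only a stand-in for Data.csv) and adds -F, which is not in the command above, so that "?" is matched as a literal character whatever regex mode grep is in.

```shell
# Recreate the sample data in a temporary file (stand-in for Data.csv).
data=$(mktemp)
printf '%s\n' '1 10' '2 20' '3 ?' '4 40' '5 50' > "$data"

# -v inverts the match; -F treats "?" as a fixed string, not a regex.
filtered=$(grep -v -F '?' "$data")
printf '%s\n' "$filtered"

rm -f "$data"
```

The row "3 ?" is dropped entirely, so gnuplot sees four consecutive points and joins 2 and 4 directly. Note that this removes any line containing "?" in any column.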
More than two column data
For more than two columns you can make use of awk.
With a data file Data.csv
1 10 1
2 20 2
3 ? 3
4 40 ?
5 50 5
you can run a script over the data file for each plot like so:
plot "<awk '{if($2 != \"?\") print}' Data.csv" u ($1):($2) w lp, \
"<awk '{if($3 != \"?\") print}' Data.csv" u ($1):($3) w lp
A reference on scripting in gnuplot can be found here. The awk user manual here.
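The two awk filters can be checked the same way outside gnuplot. This is a minimal sketch that assumes the three-column sample above, written to a temporary file. keep_col is a hypothetical helper name, introduced here only to avoid repeating the awk program.

```shell
# Recreate the three-column sample in a temporary file (stand-in for Data.csv).
data=$(mktemp)
printf '%s\n' '1 10 1' '2 20 2' '3 ? 3' '4 40 ?' '5 50 5' > "$data"

# keep_col N: print only the rows whose field N is not "?" (hypothetical helper).
keep_col() {
    awk -v c="$1" '$c != "?"' "$data"
}

col2=$(keep_col 2)   # rows usable for the column-2 curve
col3=$(keep_col 3)   # rows usable for the column-3 curve
printf 'column 2:\n%s\ncolumn 3:\n%s\n' "$col2" "$col3"

rm -f "$data"
```

Each filter drops only the rows where its own column is missing, so the row "4 40 ?" still contributes to the column-2 curve and "3 ? 3" still contributes to the column-3 curve.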

Related

How to plot points specified by ID column only with linespoints and multiple plots using gnuplot?

Say I have two files, each has 3 columns
file1:
ID X Y
10 0.1 some data as X
20 0.2
30 0.3
... ...
120 0.5
file2:
ID X Y
15 0.1 some data as X
30 0.2
45 0.3
60 0.4
... ...
120 0.6
I am doing
plot \
"file1" using 2:3 w linespoints lt 1 dt 1 lw 1 lc 1 title "file1",\
"file2" using 2:3 w linespoints lt 1 dt 1 lw 1 lc 2 title "file2"
which shows every point in the file.
If I only want the points whose row ID (first column) is 30, 60, 90, or 120,
how should I do that? Thank you.
*In the actual case, I need to plot 12 files in one plot; each of them has 10000 rows, but I only want to show 6 points.
You can filter your data with the ternary operator. Check help ternary.
For the filename and the filter I would define a function such that you have it compact in your plot command.
myFilter(dcol,fcol) returns the value of the data column dcol if the filter column fcol equals one of the given values, and NaN otherwise.
myFilename(n) creates the filename as a function of a number.
I don't have test files, but the following should plot the 12 files named "file1.dat", ..., "file12.dat".
I hope you can adapt it to your exact needs.
Code:
### filter data with ternary operator
reset session
myFilename(n) = sprintf("file%d.dat",n)
myFilter(dcol,fcol) = column(fcol)==30 || column(fcol)==60 || \
column(fcol)==90 || column(fcol)==120 ? column(dcol) : NaN
set datafile missing NaN
plot for [i=1:12] myFilename(i) u 2:(myFilter(3,1)) w lp ti myFilename(i)
### end of code
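As an alternative to the ternary filter, and not part of the answer above, the same row selection can be done before gnuplot reads the data, in the style of the awk filters used for the missing-data question. The sketch below uses made-up sample rows in a temporary file; the values are illustrative only.

```shell
# Made-up sample with an ID column, written to a temporary file.
data=$(mktemp)
printf '%s\n' 'ID X Y' '10 0.1 1.0' '20 0.2 2.0' '30 0.3 3.0' \
              '60 0.4 4.0' '90 0.45 4.5' '120 0.5 5.0' > "$data"

# Keep rows whose first field is one of the wanted IDs.
# The header row "ID X Y" never matches, so it is dropped as well.
selected=$(awk '$1==30 || $1==60 || $1==90 || $1==120' "$data")
printf '%s\n' "$selected"

rm -f "$data"
```

Inside gnuplot the filter would be used as a piped input, something like plot "<awk '$1==30 || $1==60 || $1==90 || $1==120' file1" using 2:3 w lp. Since only the selected rows reach gnuplot, the remaining points are joined directly.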

gnuplot - printing multiple plots from one dataset in same file as script

I have this script in gnuplot and I want to draw multiple plots from one dataset. I tried the command below, but it seems to need the same dataset entered a second time to execute correctly. Do you know how to solve this?
plot '-' using 1:2, '=' using 1:3
1 1 5
2 2 5
3 3 5
e
With '-' you would have to enter the same data again. Check help special-filenames.
You'd better put the data into a datablock and reuse it:
$Data <<EOD
1 1 5
2 2 5
3 3 5
EOD
plot $Data u 1:2, '' u 1:3

Plotting same line number of several blocks data with gnuplot

I have a data file with the following structure
block1: line 1
line 2
line 3
.....
block2: line 1
line 2
line 3
......
block3: .....
To plot only the block2, I use the command
plot 'file' u x1:x2 every :::2::2 w l
How to gather only line 1 of each block on the plot command?
My guess would be that, because the data points are from different blocks, they are separated by an empty line, and data points separated by an empty line are not connected when plotted "with lines".
Try the following: write your desired data into a new table, as in the example below (gnuplot 5.2.5).
### plot values of different blocks connected with lines
reset session
set colorsequence classic
$Data <<EOD
# block line xvalue yvalue
0 0 1 0
0 1 2 1
0 2 3 2
0 3 4 3

1 0 5 10
1 1 6 11
1 2 7 12
1 3 8 13

2 0 9 20
2 1 10 21
2 2 11 22
2 3 12 23
EOD
set table $Data2
plot $Data u 0:3:4 every ::0::0 with table
unset table
print $Data2
plot $Data u 3:4 w lp,\
$Data2 u 2:3 w lp
### end code
Addition: if you want to do this with several files, try the following
(small drawback so far: points from different files are not connected).
### plot every Nth line of all blocks of several systematic files
reset session
FileCount = 2 # number of files
Col1 = 1 # e.g. column of x value
Col2 = 2 # e.g. column of y value
N = 0 # N=0 is first line of each datablock, N=1 second line, etc...
set print $EveryNthLineFromAllBlocksOfAllFiles
do for [i=1:FileCount] {
    FILE = sprintf("name_%d.dat",i)
    set table $EveryNthLine
    plot FILE u Col1:Col2 every ::N::N with table
    unset table
    print $EveryNthLine
}
set print
print $EveryNthLineFromAllBlocksOfAllFiles
plot $EveryNthLineFromAllBlocksOfAllFiles u 1:2 w lp
### end code
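If shell tools are available, the first row of every block can also be extracted before plotting. This is an alternative technique, not what the scripts above do. The sketch assumes blocks are separated by single blank lines and uses made-up two-column sample data in a temporary file.

```shell
# Made-up sample: three blocks separated by blank lines.
data=$(mktemp)
printf '%s\n' '1 0' '2 1' '3 2' '' '5 10' '6 11' '7 12' '' '9 20' '10 21' '11 22' > "$data"

# A blank line (NF == 0) marks the start of a new block;
# print only the first non-blank row that follows it.
first_rows=$(awk 'BEGIN {fresh = 1} NF == 0 {fresh = 1; next} fresh {print; fresh = 0}' "$data")
printf '%s\n' "$first_rows"

rm -f "$data"
```

The output contains no blank lines, so gnuplot would connect these points when they are plotted "with lines".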

How does gnuplot skip an irrelevant column in a plot?

I have a data file like this
# Time A irrelevent_col B
1 2 3 4
2 3 4 5
3 4 5 6
4 5 6 7
I am trying to plot two lines, Time vs A and Time vs B, with labels "A" and "B". How can I skip the "irrelevent_col" column?
I did the following, but the code still plots the "irrelevent_col" column. Shouldn't the ? : operator get rid of that column?
set datafile commentschars "!!!"
plot for [i=2:4] filename using 1:(columnhead(i+1) ne "irrelevent_col" ? column(i) : 1/0) title columnhead(i+1)
Thanks!
If I understood your question correctly:
plot "filename" using 1:2 title "A" with lines,\
"filename" using 1:4 title "B" with lines
Let me repeat what I've understood from your question:
You have a large number of columns and you want to plot them all in a loop, but exclude a single column (or a few) by name.
Of course, you can specify all the columns you do want to plot, as in @ViniciusPlacco's answer. However, as I understand it, that is what you wanted to avoid, since your real data has many more columns. You can also always use external tools to pre-process your data, but here I would like to suggest a gnuplot-only and hence platform-independent solution.
Why your solution does not work, I can only speculate: I guess that using columnheader twice in a plot iteration creates problems (at least for gnuplot<=5.2), but I could be wrong. As shown below, your approach will work for gnuplot>=5.4.0.
Furthermore, you want to specify the columns by header not by column number.
In addition, your header line starts with the comment char '#', but you can easily change that to access the columnheader information.
In the example below you can specify a list of several headers which you don't want to plot. Maybe the script(s) can be further simplified.
Script: (works for gnuplot>=5.4.0, July 2020)
### exclude some columns by header from plotting loop (gnuplot>=5.4.0)
reset session
$Data <<EOD
# Time A B C D E
1 2 3 4 5 6
2 3 4 5 6 7
3 4 5 6 7 8
4 5 6 7 8 9
EOD
set datafile commentschars '' # no commentchar
set key top left noenhanced noautotitle
inList(w,list) = int(sum[_i=1:words(list)] w eq word(list,_i))
doNotPlot = 'B C'
color = 1
plot for [col=2:6] $Data u 1:((b=inList(myHeader=columnhead(col+1),doNotPlot)) ? \
NaN : ($0==1?color=color+1:0, column(col))) w lp pt 7 lc color ti (b ? '' : myHeader)
### end of script
Result: (graph not reproduced here; it shows columns A, D and E plotted against Time)
For older gnuplot versions <5.4.0 you need a different approach:
get all headers into a string
specify all your headers of the columns you don't want to plot in a string
for gnuplot>=5.0.0, subtract two lists and keep the column numbers for the header you do want to plot
Script: (works for gnuplot>=5.2.2, Nov. 2017; result same as graph above)
### exclude some columns by header from plotting loop (gnuplot>=5.2.2)
reset session
$Data <<EOD
# Time A B C D E
1 2 3 4 5 6
2 3 4 5 6 7
3 4 5 6 7 8
4 5 6 7 8 9
EOD
set datafile commentschars '' # no commentchar
set key top left noenhanced noautotitle
inList(w,list) = int(sum[_i=1:words(list)] w eq word(list,_i))
doNotPlot = 'B C'
myHeaders = ''
color = 1
plot for [col=2:6] $Data u 1:((b=inList(myHeader=columnhead(col+1),doNotPlot)) ? NaN : \
($0==1 ? (color=color+1, myHeaders=myHeaders.' '.myHeader) : 0, column(col))) w lp pt 7 lc color, \
for [i=1:color] NaN w lp pt 7 lc i ti word(myHeaders,i)
### end of script
Script: (works for gnuplot>=5.0.0, Jan. 2015; result same as graph above)
### exclude some columns by header from plotting loop (gnuplot>=5.0.0)
reset session
$Data <<EOD
# Time A B C D E
1 2 3 4 5 6
2 3 4 5 6 7
3 4 5 6 7 8
4 5 6 7 8 9
EOD
set datafile commentschars '' # no commentchar
set datafile separator "\n" # or another character which is not in the header line
stats $Data u (allHeaders = strcol(1)[2:]) ever ::::0 nooutput # get header line into string
set datafile commentschar # reset to default
set datafile separator whitespace # ditto
inList(w,list) = int(sum[_i=1:words(list)] w eq word(list,_i))
subtractLists(list1,list2) = (_s=' ', sum[_j=1:words(list1)] (_s0=word(list1,_j), \
inList(_s0,list2) ? 0 : (_s=_s._s0.' ', \
myColNos=myColNos.' '._j), 0), _s)
doNotPlot = 'B C'
myColNos = ''
myHeaders = subtractLists(allHeaders,doNotPlot)
myColNo(i) = column((word(myColNos,i)))
set key top left noenhanced noautotitle
plot for [i=2:words(myHeaders)] $Data u 1:(myColNo(i)) w lp pt 7 ti word(myHeaders,i)
### end of script

weird interpolation using filledcurves

I have a data file like this:
1 1 2
2 2 3
3 4 nan
4 5 6
I want to plot it using:
plot "bla" u 1:2:3 w filledcurves, "" u 1:2 w lp, "" u 1:3 w lp
The problem is that the first part totally ignores the 3rd line, even though the nan is only in $3. Even though I have a value (4) in $2, it interpolates and skips it.
How do I make it not ignore that value?
I can make a workaround by replacing the nan by the value that should be there- (3+6)/2 in my case and then it will plot the 4 as well. There are two problems with that - I'll have to write a script that finds nans around the file, and it also plots a point when I'm using w lp as if there is a value there, but there isn't.
You're asking gnuplot's using to do something that it wasn't designed to do. Fortunately, there's a somewhat hacky solution. You need to use the 2-column version of filledcurves, which treats your data points as a closed loop. In this case, you want to plot columns 1 and 2 from top to bottom and then columns 1 and 3 in reverse row order, e.g.:
1 1
2 2
3 4
4 5
4 6
3 nan
2 3
1 2
If your datafile looks like this, you can plot it as:
plot 'datafile' u 1:2 w filledcurves
Now I'm pretty sure you don't actually want to re-generate your data-files, so the easiest thing is to use unix tools to do it for you:
plot "< sed 'x;1!H;$!d;x' test.dat | awk '{print $1,$3}' | cat test.dat -" u 1:2 w filledcurves
This should work. Note that the ugly sed command can be replaced by tac if you have that installed.
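As a sketch of the tac variant mentioned above, the pipeline below emits the file's own rows first and then columns 1 and 3 in reverse row order, which is the closed outline the two-column filledcurves style needs. It assumes GNU tac is installed and uses a temporary file as a stand-in for test.dat.

```shell
# Recreate the sample data in a temporary file (stand-in for test.dat).
data=$(mktemp)
printf '%s\n' '1 1 2' '2 2 3' '3 4 nan' '4 5 6' > "$data"

# Forward pass: the file as is (gnuplot reads columns 1 and 2).
# Backward pass: rows reversed with tac, reduced to columns 1 and 3.
outline=$(tac "$data" | awk '{print $1, $3}' | cat "$data" -)
printf '%s\n' "$outline"

rm -f "$data"
```

The first four lines still carry a third column, which gnuplot ignores with u 1:2. The last four lines are the return path along column 3, including the "3 nan" row.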

Resources