Plotting from two data sets delimited two different ways

I need to plot data from a .csv file and from a white space separated file. Both sets of data need to appear on the same plot.
data1.dat
#t y
1 1
2 1
3 1
and
data2.csv
#t,y
1,2
2,2
3,2
Normally I would do the following if both were .csv files:
set datafile separator ','
plot 'data1.csv' using 1:2,'data2.csv' using 1:2
Is there some way to include the setting of the separation character in the plot statement?
plot 'data1.dat' using 1:2,'data2.csv' using datafile separator ',' using 1:2
The above does not work, and none of the many variations I tried worked either.

You can pass more than one character to set datafile separator, in your case ", ". Each of these characters is then treated individually as a separator. (A tab character can be given as "\t"; the string must then be enclosed in double quotes.)
$dat << EOD
1 2,4
2 2,5
3 1,6
4 4,4
EOD
set xr [0.5:4.5]
set dataf sep ", "
plot $dat us 1:2:3 w yerrorbars
Note that each explicit separator character counts as exactly one separator: with set dataf sep ", ", the text "4, 4" is read as three columns, the second of which is a missing value. If you have incompatible formats in one plot, you can import the data for each subplot with its own separator setting using set table $<name> (check help datablocks).
If your data files have a very difficult format, note that gnuplot's using specifier accepts a libc scanf() format string:
plot "-" us 1:2 "%lf,%lf"
1,2
2,3
3,4
e
You can give a different format string for every file in your plot command. Note that gnuplot only accepts "double" floating-point numbers as input, so you have to use the %le or %lf specifier.
Check help using examples for more on format strings.

AFAIK, there isn't a way to specify the separator. However, if you're in a POSIX compliant environment (and your gnuplot supports pipes -- which most do), you can farm the work out to awk pretty easily:
plot 'data1.dat' using 1:2,\
"<awk -F, '{print $1,$2}' data2.csv" using 1:2

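To see what the pipe hands to gnuplot, the awk step can be run on its own. This is a minimal sketch that assumes the data2.csv from the question; the output file name data2_ws.dat is made up for illustration.

```shell
# Recreate data2.csv from the question.
printf '#t,y\n1,2\n2,2\n3,2\n' > data2.csv

# -F, splits each line on commas; print $1,$2 joins the two fields with a
# single space, which is the whitespace-separated layout gnuplot reads by default.
awk -F, '{print $1,$2}' data2.csv > data2_ws.dat
cat data2_ws.dat
```

The header line becomes "#t y", which gnuplot still treats as a comment, so both files can then be plotted without touching set datafile separator.
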
Not just for "retro" fun but also for current gnuplot versions: I guess this is the only(?) gnuplot-only solution that works with all versions, even those that predate the OP's question.
The "trick" is: if you set datafile separator "," and read the first (and only) string column of a whitespace-separated file, strcol(1) will contain the full line. You can then split the string with word() and convert each part into a floating-point number with real().
If your original data has at least one space after the comma,
1, -0.2
2, -0.1
3, 0.0
the data would be plotted correctly even with the separator left as whitespace, since the comma after the first column's value is ignored when the number is parsed.
In newer gnuplot versions (>=4.6.7, Apr 2015) you can define several separator characters, but this does not work as you might expect, because
set datafile separator ", "
will interpret each space as a column separator. So, if the number of spaces varies from line to line, your plot command would fail.
Anyway, here is the "always" working solution:
Data:
SO14262760_1.dat (with variable number of spaces)
1 -0.1
2 0.0
3 +0.1
SO14262760_2.dat (with no or some space after ,)
1,-0.2
2,-0.1
3, 0.0
Script: (works with gnuplot>=4.4.0, March 2010)
### different column separators in two files with one plot command
reset
FILE1 = "SO14262760_1.dat"
FILE2 = "SO14262760_2.dat"
set datafile separator ","
myCol(n) = real(word(strcol(1),n))
plot FILE1 u (myCol(1)):(myCol(2)) w lp pt 7 lc rgb "red", \
FILE2 u 1:2 w lp pt 7 lc rgb "blue"
### end of script

Related

Iterate over all datasets AND all columns in gnuplot

I have a datafile with an arbitrary number of datasets, each with an arbitrary number of columns. Every column starts with a header that I would like to use as a title. This is an example datafile, "gp.dat":
a b
2 3
4 9
16 27


c
4
16
64
I would like to generate a plot using gnuplot (gnuplot 5.4 patchlevel 2) that interprets every column in every dataset as an independent line, each labeled with its column header. For the above dataset, this would do the trick:
plot for [d=0:*] for [i=1:2] "gp.dat" index d using i title columnheader with linespoints
This produces the expected plot.
However, when I try to specify ALL datasets AND ALL columns, the "c" line vanishes:
plot for [d=0:*] for [i=1:*] "gp.dat" index d using i title columnheader with linespoints
This seems to hold for any index I supply for the column number above 2, so this produces the same bad plot:
plot for [d=0:*] for [i=1:3] "gp.dat" index d using i title columnheader with linespoints
How can I specify ALL datasets and ALL columns and guarantee that everything will be plotted?

In the past, I made other strange observations using the * in such "self (de-/non-)terminating loops". I guess gnuplot determines the number of columns from the last block, but is probably not prepared to have variable number of columns.
Here is a somewhat awkward but straightforward procedure to plot all blocks and all columns. This example works as long as your column separator is whitespace.
1. Determine the number of blocks using stats (check help stats).
2. Set the column separator temporarily to "\n", so that strcol(1) will be the whole line.
3. Extract the number of columns from the first row of each block using words (check help words) and write it to a datablock $ColMax (check help table).
4. Reset the column separator to whitespace again.
5. Use the variable number of columns for each block.
Maybe there are shorter and smarter solutions.
Script:
### plot all blocks and all columns (variable number of columns in blocks)
reset session
$Data <<EOD
a b
2 3
4 9
16 27


c
4
16
64


d e f
5 6 7
33 44 55
77 88 99
EOD
stats $Data u 0 nooutput
set datafile separator "\n"
set table $ColMax
plot for [b=0:STATS_blocks-1] $Data u (words(strcol(1))) index b every ::::0 w table
unset table
set datafile separator whitespace
set key top center
plot for [b=0:STATS_blocks-1] for [c=1:$ColMax[b+1]] $Data u 0:c index b w lp pt 7 ti columnhead
### end of script
Addition:
Here is a slightly shorter solution that neither reads from nor plots to a table/datablock (features that work only with gnuplot>5.0).
The following should also work with later 4.x versions if you read the data from a file.
Script:
### plot all blocks and all columns (variable number of columns in blocks)
reset
FILE = 'myFile.dat'
set datafile separator "\n" # or any character which is not in the data
B = 0
Cols = ''
stats FILE u (column(-2)==B ? (B=B+1, Cols=Cols.' '.words(strcol(1))):0) every ::1::1 nooutput
set datafile separator whitespace
set key top center
plot for [b=0:B-1] for [c=1:word(Cols,b+1)] FILE u 0:c index b w lp pt 7 ti columnhead
### end of script

Remove linespoints in Gnu plot

I have plotted a graph in gnuplot with linespoints:
gnuplot> plot "360 values.txt" title "Fidelity vs Time" lt 7 lc 0 w lp
Now I want to replace the linespoints with plain lines, so I used:
gnuplot> plot "360 values.txt" title "Fidelity vs Time" lt 7 lc 0 lw 4 w lines
But now the whole graph has disappeared.
Why?

Gnuplot does not connect points with lines if the points in the datafile are separated by empty lines.
From the documentation (help plot datafile):
...
In datafiles, blank records (records with no characters other than blanks and
a newline and/or carriage return) are significant.
Single blank records designate discontinuities in a `plot`; no line will join
points separated by a blank records (if they are plotted with a line style).
Two blank records in a row indicate a break between separate data sets.
See `index`.
...
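If the blank lines carry no meaning in your file, one way around this is to filter them out before plotting. This is a minimal sketch that assumes awk is available; the sample values and the file names values.txt and values_clean.txt are invented for illustration.

```shell
# Sample data with a blank line between consecutive points (made-up values).
printf '1 0.9\n\n2 0.8\n\n3 0.7\n' > values.txt

# NF is the number of fields on a line. It is 0 (false) for blank lines,
# so this prints only the lines that actually contain data.
awk 'NF' values.txt > values_clean.txt
cat values_clean.txt
```

Where gnuplot supports pipes, the same filter could be applied inline, for example plot "< awk 'NF' values.txt" with lines.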

Multiple datasets in the same data file in gnuplot

I have a file of the following kind:
<string1> <x1> <y1>
<string2> <x2> <y2>
...
I want to draw a scatter plot from the (x,y) values, having the different strings in the first column in different data sets, which will be drawn with different colors (I have many different x,y values but only a few different strings). I tried this:
plot "DATAFILE" using 2:3 title column(1)
Unfortunately, this one picks the first column for the first row and uses that as a title for all entries.

You could use awk to pick only rows where the first column matches your strings:
plot "<awk '$1~/string1/' DATAFILE" using 2:3 title column(1),\
"<awk '$1~/string2/' DATAFILE" using 2:3 title column(1)
and so on. For a built-in gnuplot solution, you can do:
plot "DATAFILE" u 2:(stringcolumn(1) eq "string1" ? $3:1/0),\
"DATAFILE" u 2:(stringcolumn(1) eq "string2" ? $3:1/0)
If you want something more automatic that generates a plot for every unique entry in column 1, this solution worked for me:
Input file (test.dat, tab-separated; otherwise the cut statement below needs to be changed):
one 1 3
two 2 4
ten 3 5
ten 4 3
two 5 4
one 6 5
one 7 3
ten 8 4
two 9 5
ten 10 3
two 11 4
one 12 5
The following line creates a plot statement for gnuplot and saves it in a file:
cut -f1 test.dat | sort -u | awk '
BEGIN {print "plot\\"}
{print "\"test.dat\" u 2:(stringcolumn(1) eq \""$1"\" ?\$3:1/0),\\"}' > plot.gp
and the contents are:
plot\
"test.dat" u 2:(stringcolumn(1) eq "one" ?$3:1/0),\
"test.dat" u 2:(stringcolumn(1) eq "ten" ?$3:1/0),\
"test.dat" u 2:(stringcolumn(1) eq "two" ?$3:1/0),\
then you'd do:
gnuplot plot.gp
or add the line load "plot.gp" to your script.
I am pretty sure there must be a "gnuplot-only" solution, but that goes beyond my knowledge. Hope this helps.
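The generated plot.gp above ends in a dangling ",\" after the last clause. In case your gnuplot version objects to that, here is a hedged variation on the same cut/sort/awk pipeline that prints the separator before each clause instead of after it. The shortened test.dat and the added title per label are assumptions made for this sketch, not part of the original answer.

```shell
# Recreate a shortened, tab-separated test.dat (cut -f1 splits on tabs by default).
printf 'one\t1\t3\ntwo\t2\t4\nten\t3\t5\nten\t4\t3\n' > test.dat

# Collect the unique labels, then emit one plot clause per label.
# The ", \" separator is printed before every clause except the first,
# so the generated command does not end in a dangling continuation.
cut -f1 test.dat | sort -u | awk '
  NR == 1 { printf "plot " }
  NR > 1  { printf ", \\\n     " }
  { printf "\"test.dat\" u 2:(stringcolumn(1) eq \"%s\" ? $3 : 1/0) title \"%s\"", $1, $1 }
  END { if (NR > 0) printf "\n" }
' > plot.gp
cat plot.gp
```

As before, the result can be run with gnuplot plot.gp or pulled into a script with load "plot.gp".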

You have just one plot, so just one title.
If you want to plot separately all datasets (separated by two consecutive blank lines), you (just) need to say so:
N_datasets=3
plot for [i=0:N_datasets-1] "file.dat" using 2:3 index i with points title columnhead(1)
But the formatting of your datafile is not what gnuplot expects, and using title columnhead will also skip the first line (assumed to contain headers only). The standard gnuplot format for this would be:
string1
x1_1 y1_1
x1_2 y1_2
...


string2
x2_1 y2_1
x2_2 y2_2
...
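A file in the question's one-row-per-point layout can be regrouped into this format with a short awk script. This is only a sketch: it assumes whitespace-separated columns, and the file names points.dat and grouped.dat as well as the sample rows are invented for illustration.

```shell
# Sample input in the "<string> <x> <y>" layout (made-up values).
printf 'one 1 3\ntwo 2 4\none 6 5\ntwo 5 4\n' > points.dat

awk '
  # Remember each label the first time it appears, to keep the input order.
  !($1 in rows) { order[++n] = $1 }
  # Append the "x y" pair to the text collected for this label.
  { rows[$1] = rows[$1] $2 " " $3 "\n" }
  END {
    for (i = 1; i <= n; i++) {
      if (i > 1) printf "\n\n"          # two blank lines separate datasets
      printf "%s\n%s", order[i], rows[order[i]]
    }
  }
' points.dat > grouped.dat
cat grouped.dat
```

In the regrouped file the label is the first line of each dataset and the values sit in columns 1 and 2, so a loop over index with using 1:2 and title columnhead(1) should pick up one title per dataset.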

Different line styles for vectors from the same data file

Here is my data file:
25 10 8
0 50 11
34 25 0
14 0 22
200 25 56
And I plot 3D vectors with splot:
splot "data" using (0):(0):(0):1:2:3 with vectors
But I would like different colors for my vectors, using something like ls nth_vector with splot (so ls 1 for the first line of the file, then ls 2, etc.). Is it possible?
Thanks!

If you double space your data file you can achieve this using index. You can use awk within gnuplot to do the spacing on the fly:
splot for [i=0:system("wc -l < data")] '<awk -v s="\n" "{print s}1" data' using (0):(0):(0):1:2:3 index i notitle with vectors
The system command counts the number of lines in the file. awk prints two newlines for every line in the data file, so each line has a separate index. I have used a variable containing the \n character as this avoids difficulties in escaping strings.
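The effect of the awk one-liner is easier to see when it is run outside gnuplot. A minimal sketch, assuming the first three rows of the data file from the question; the output name data_spaced is made up, and the awk program is single-quoted here because it is run from a shell rather than from inside a gnuplot string.

```shell
# First three rows of the data file from the question.
printf '25 10 8\n0 50 11\n34 25 0\n' > data

# For every input line, {print s} prints s (a newline) followed by the
# output record separator (another newline), which gives two blank lines.
# The trailing 1 is an always-true pattern that then prints the line itself.
awk -v s="\n" '{print s}1' data > data_spaced
cat data_spaced
```

Each data row ends up preceded by two blank lines, which is what lets index address the rows one at a time.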
Edit:
There's no need for any of that awk. You can use stats to get the number of lines in your file and every to plot each line separately:
stats 'data' nooutput
splot for [i=0:STATS_records-1] "data" using (0):(0):(0):1:2:3 every ::i::i with vectors notitle

You can use the row number (zeroth column) as linetype index for the linecolor variable option:
splot 'data' using (0):(0):(0):1:2:3:0 with vectors lc var
For the vectors plotting style you could even use arrowstyle variable to change the whole arrow settings.

Reading gnuplot legend from csv

I've got a data.csv file which is structured like this:
n John Smith stats Sam Williams stats
1 23.4 44.1
2 32.1 33.5
3 42.0 42.1
Currently I'm plotting with the following command in gnuplot:
plot 'data.csv' using 1:2 title 'John' with lines, '' using 1:3 title 'Sam' with lines
The question is how to retrieve first names from the first line of .csv rather than entering them manually?
In addition, is it possible to make it adjustable in case I add a column to the table, so it automatically adds another line with the appropriate title?

You say you have a csv file, so I assume your data file looks like this (and is saved in infile.csv):
n,John Smith stats,Sam Williams stats
1,23.4,44.1
2,32.1,33.5
3,42.0,42.1
If your version of Gnuplot is recent enough, you can use columnhead as the title argument:
echo "
set datafile separator ','
plot 'infile.csv' using 1:2 with lines title columnhead
" | gnuplot --persist
Or use the key option:
echo "
set datafile separator ','
set key autotitle columnhead
plot 'infile.csv' using 1:2 with lines, '' using 1:3 with lines
" | gnuplot --persist
Edit - shorten headings
echo "
set datafile separator ','
set key autotitle columnhead
plot '< sed -r \"1 s/,([^ ]+)[^,]+/,\1/g\" infile.csv' using 1:2 with lines, '' using 1:3 with lines
" | gnuplot --persist
Note this answer to a follow-up question may also be relevant.
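The sed expression from the heading-shortening edit can be checked on its own before the result is piped into gnuplot. A minimal sketch, assuming the infile.csv shown in the answer; the output name short.csv is made up, and -E is used here as the more portable spelling of the -r option.

```shell
# Recreate infile.csv as assumed in the answer.
printf 'n,John Smith stats,Sam Williams stats\n1,23.4,44.1\n2,32.1,33.5\n3,42.0,42.1\n' > infile.csv

# On line 1 only: after each comma, keep the first word (captured as \1)
# and drop the rest of the field up to the next comma.
sed -E '1 s/,([^ ]+)[^,]+/,\1/g' infile.csv > short.csv
cat short.csv
```

The header becomes n,John,Sam while the data rows pass through unchanged, so set key autotitle columnhead then shows only the first names.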

You want to extract only a portion of the columnheader for the legend.
Update:
This is a task which you can easily do with gnuplot>=5.4.0. Check help columnheader and help word.
plot for [col=2:4] FILE u 1:col w l title word(columnheader(col),1)
However, the above command will not work with gnuplot versions 4.6.0 to 5.2.8: there, title columnheader(col) works, but title word(columnheader(col),1) does not.
Workaround: (for gnuplot versions 4.6.0 to 5.2.8)
Again a strange gnuplot-only workaround.
In short: in a plot for loop that starts at 1, the first iteration assigns the header of column 2 to the variable myHeader while plotting nothing (NaN), with the title still myHeader='' (an empty string does not generate a key entry). The next iteration plots column 2 with the previously extracted header. This continues until the last column (here: N=4).
Data: SO13371449.csv (some more example data added)
n, John Smith stats, Sam Williams stats, Tom Muller stats
1, 23.4, 44.1, 22.1
2, 32.1, 33.5, 25.7
3, 42.0, 42.1, 40.0
Script: (works for gnuplot>=4.6.0)
### get only a portion of columnheader for the title (gnuplot>=4.6.0)
reset
FILE = "SO13371449.csv"
set datafile separator ","
myHeader = ''
N=4
plot for [col=1:N] FILE u ($0==0 && col<N ? myHeader=word(strcol(col+1),1) : 0, \
col==1 ? NaN : $1):col w lp pt 7 title myHeader
### end of script
Result: (created with gnuplot 4.6.0)
