Reading gnuplot legend from csv


I've got a data.csv file which is structured like this:
n John Smith stats Sam Williams stats
1 23.4 44.1
2 32.1 33.5
3 42.0 42.1
Currently I'm plotting with the following command in gnuplot:
plot 'data.csv' using 1:2 title 'John' with lines, '' using 1:3 title 'Sam' with lines
The question is how to retrieve first names from the first line of .csv rather than entering them manually?
In addition, is it possible to make it adjustable in case I add a column to the table, so it automatically adds another line with the appropriate title?

You say you have a csv file, so I assume your data file looks like this (and is saved in infile.csv):
n,John Smith stats,Sam Williams stats
1,23.4,44.1
2,32.1,33.5
3,42.0,42.1
If your version of Gnuplot is recent enough, you can use columnhead as the title argument:
echo "
set datafile separator ','
plot 'infile.csv' using 1:2 with lines title columnhead
" | gnuplot --persist
Or use the key option:
echo "
set datafile separator ','
set key autotitle columnhead
plot 'infile.csv' using 1:2 with lines, '' using 1:3 with lines
" | gnuplot --persist
Edit - shorten headings
echo "
set datafile separator ','
set key autotitle columnhead
plot '< sed -r \"1 s/,([^ ]+)[^,]+/,\1/g\" infile.csv' using 1:2 with lines, '' using 1:3 with lines
" | gnuplot --persist
Output:
Note this answer to a follow-up question may also be relevant.
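To see what the sed expression in the heading-shortening edit does, it can be run on the header line alone. This is a minimal sketch: the header text is copied from infile.csv above, the variable name header is only illustrative, and -E is used here as the portable spelling of the -r flag.

```shell
# Header line copied from infile.csv above
header=$(printf '%s\n' 'n,John Smith stats,Sam Williams stats' |
  sed -E '1 s/,([^ ]+)[^,]+/,\1/g')
# Each field after a comma is cut down to its first word
echo "$header"
```

This should print n,John,Sam, which is the header line gnuplot then sees when it picks the column titles.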

You want to extract only a portion of the columnheader for the legend.
Update:
This is a task which you can easily do with gnuplot>=5.4.0. Check help columnheader and help word.
plot for [col=2:4] FILE u 1:col w l title word(columnheader(col),1)
However, the above command does not work with gnuplot versions 4.6.0 to 5.2.8: there, title columnheader(col) works, but title word(columnheader(col),1) does not.
Workaround: (for gnuplot versions 4.6.0 to 5.2.8)
Again a strange gnuplot-only workaround.
In short: in a plot for loop that starts at 1, you assign the header of column 2 to the variable myHeader while plotting nothing (NaN). At that point the title is still myHeader='', and an empty string does not generate a key entry. In the next iteration you plot column 2 with the previously extracted header. This continues until the last column (here: N=4).
Data: SO13371449.csv (some more example data added)
n, John Smith stats, Sam Williams stats, Tom Muller stats
1, 23.4, 44.1, 22.1
2, 32.1, 33.5, 25.7
3, 42.0, 42.1, 40.0
Script: (works for gnuplot>=4.6.0)
### get only a portion of columnheader for the title (gnuplot>=4.6.0)
reset
FILE = "SO13371449.csv"
set datafile separator ","
myHeader = ''
N=4
plot for [col=1:N] FILE u ($0==0 && col<N ? myHeader=word(strcol(col+1),1) : 0, \
col==1 ? NaN : $1):col w lp pt 7 title myHeader
### end of script
Result: (created with gnuplot 4.6.0)

Related

Subtract smoothed data from original

I wonder whether there is a way to subtract smoothed data from original ones when doing things of the kind:
plot ["17.12.2020 08:00:00":"18.12.2020 20:00:00"] 'data3-17-28.csv1' using 4:5 title 'Sensor 3' with lines, \
'' using 4:5 smooth acsplines
Alternatively I would need to do it externally, of course.

As @Suntory already suggested, you can plot smoothed data into a table.
However, keep in mind, the number of datapoints will be determined by set samples, default setting is 100 and the smoothed datapoints will be equidistant. So, if you set samples to the number of your datapoints and your data is equidistant as well, then all should be fine.
Concatenating data line by line is not straightforward in gnuplot, since gnuplot is not intended to do such operations.
The following gnuplot-only solution assumes that you have your data in a datablock $Data without headers and empty lines. If not, you could either plot it with table from file into a table named $Data or use the following approach in the accepted answer of this question: gnuplot: load datafile 1:1 into datablock
If you don't have equidistant data, you need to interpolate data, which is also not straightforward in gnuplot, see: Resampling data with gnuplot
It's up to you: either you use external tools (which might not be platform-independent) or you apply a somewhat cumbersome platform independent gnuplot-only solution.
Code:
### plot difference of data to smoothed data
reset session
$Data <<EOD
1 0
2 13
3 16
4 17
5 11
6 8
7 0
EOD
stats $Data u 0 nooutput # get number of rows or datapoints
set samples STATS_records
set table $Smoothed
plot $Data u 1:2 smooth acsplines
unset table
# put both datablock into one
set print $Difference
do for [i=1:|$Data|] {
print sprintf('%s %s',$Data[i],$Smoothed[i+4])
}
set print
plot $Data u 1:2 w lp pt 7, \
$Smoothed u 1:2 w lp pt 6, \
$Difference u 1:($2-$4) w lp pt 4 lc "red"
### end of code
Result:

If I understand correctly, you would like this:
First, write the smoothed data to the file out.csv:
set table "out.csv" separator comma
plot 'file' u 4:5 smooth acsplines
unset table
Then paste out.csv onto file as appended columns, which the plot command at the end does. You may need to delete the first lines of out.csv with sed (sed '1,4d' out.csv).
stats 'file' matrix
Thanks to stats, we automatically get the number of columns in your original data (STATS_size_x).
plot "< paste -d' ' file out.csv" u 4:($5-$(STATS_size_x+2)) w l
Please try this short script on your data.
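The paste step can be tried in isolation. The sketch below uses two small made-up files (orig.txt and smooth.txt are hypothetical names and values, not the poster's data) that already have the same number of rows, and lets awk compute the same kind of column difference that the plot command computes.

```shell
# Work in a scratch directory so no existing files are touched
tmpdir=$(mktemp -d)
# Hypothetical original data: x y
printf '%s\n' '1 10' '2 20' '3 30' > "$tmpdir/orig.txt"
# Hypothetical smoothed data at the same x values: x y_smooth
printf '%s\n' '1 9' '2 21' '3 28' > "$tmpdir/smooth.txt"
# paste joins the files line by line; columns 3 and 4 come from smooth.txt
diffs=$(paste -d' ' "$tmpdir/orig.txt" "$tmpdir/smooth.txt" |
  awk '{ print $1, $2 - $4 }')
echo "$diffs"
rm -r "$tmpdir"
```

The row-by-row pairing is only meaningful when both files have the same number of rows at the same x values, so the smoothed output has to be sampled accordingly.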

How to sprintf() last data value in Gnuplot key?

I'm working on a temperature graph and would like to put the last data point in the title. I can use column(2) to kind of do this but I'd like to add some descriptive text as well. I'm trying the code below to concatenate some text with the data value but getting this error: line 0: f_sprintf: attempt to print numeric value with string format
plot "/tmp/data.txt" using 1:2 with lines ls 2 title sprintf('Current :%sF', column(2))
I've tried changing the sprintf modifier to %d along with various flavors of concatenation with the dot character and haven't found the right combination.

Most probably there are various solutions. The first possibility which comes to my mind (I guess requiring gnuplot >5.2) is using keyentry, check help keyentry. While plotting, you assign column 2 to a variable. After plotting, this variable holds the last value of column 2, which you then use in keyentry, which adds an entry to the key without plotting anything. There would also be workarounds for older gnuplot versions.
Code:
### last value into key
reset session
$Data <<EOD
1 7.1
2 6.2
3 5.3
4 4.4
5 3.5
6 2.6
7 1.7
8 0.8
EOD
plot $Data u 1:(a=$2) w lp pt 7 lc 1 notitle, \
keyentry w lp pt 7 lc 1 ti sprintf("Last y value: %g",a)
### end of code
Result:

The problem here is that the title string is evaluated by gnuplot before the data is parsed and plot is performed.
A trick is to store the last value of temperature, and plot it afterwards.
T=0
plot "/tmp/data.txt" using 1:(T=column(2)) w l ls 2 notitle, \
1/0 w l ls 2 title sprintf('Current: %.1fF', T)
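If an external tool is acceptable, the last value can also be extracted before gnuplot starts. This is an alternative to the two gnuplot-only answers, not part of them; the sample values and the variable name label are made up for illustration.

```shell
# Hypothetical two-column data in the layout of /tmp/data.txt: x, temperature
label=$(printf '%s\n' '1 70.1' '2 70.9' '3 71.5' |
  LC_ALL=C awk '{ last = $2 } END { printf "Current: %.1fF", last }')
# The finished string could then be handed to gnuplot as the plot title
echo "$label"
```

Because the value is formatted with %.1f rather than %s, the numeric-versus-string mismatch from the question does not arise.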

Avoid connection of points when there is empty data

I am trying to make a line chart using Gnuplot. I need to get something like the following but with an exception:
In the example above you can see a straight line which joins two separate points over empty data. It is the one that crosses the '2016-09-27 00:00:00' x tick. I would like there to be an empty space instead of that straight line. How could I achieve this?
This is the current code:
set xdata time
set terminal pngcairo enhanced font "arial,10" fontscale 1.0 size 900, 350
set output filename
set key off
set timefmt '"%Y-%m-%d %H:%M:%S"'
set format x "%Y-%m-%d %H:%M"
set xtics rotate by -80
set mxtics 10
set datafile missing "-"
set style line 1 lt 2 lc rgb 'blue' lw 1
set style line 2 lt 2 lc rgb 'green' lw 1
set style line 3 lt 2 lc rgb 'red' lw 1
plot\
fuente using 1:2 ls 1 with lines,\
fuente using 1:3 ls 2 with lines,\
fuente using 1:4 ls 3 with lines

Three options:
In the data file, put an empty line where the gap is. This results in exactly what you want, but would also affect the other data from that file.
Use every to only plot a portion of the data and plot it twice, once up to the gap, once from the gap. Suppose that the gap occurs between data points 42 and 43 in your case, then you could use:
plot\
fuente using 1:2 ls 1 every ::::41 with lines,\
fuente using 1:2 ls 1 every ::42 with lines,\
fuente using 1:3 ls 2 with lines,\
fuente using 1:4 ls 3 with lines
(The every statement takes up to six arguments separated by colons but you can leave them empty for default values. The fifth argument is the end point, the third is the starting point.)
If you use - for missing data in your file (as indicated by your set datafile missing "-"), you have to modify your using statement for this to be effective:
plot\
fuente using 1:($2) ls 1 with lines,\
fuente using 1:3 ls 2 with lines,\
fuente using 1:4 ls 3 with lines
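For the first option, the empty lines do not have to be typed by hand. The following is a sketch under a simplifying assumption: the first column is taken to be epoch seconds (the data in the question uses quoted date strings, which would need converting first), and the threshold of 3600 seconds is an arbitrary choice.

```shell
# Hypothetical log: epoch seconds and a value, sampled every 600 s with one long pause
with_gaps=$(printf '%s\n' '0 1.0' '600 1.1' '1200 1.2' '9000 2.0' '9600 2.1' |
  awk -v gap=3600 '
    NR > 1 && $1 - prev > gap { print "" }   # blank line breaks the plotted line
    { print; prev = $1 }
  ')
echo "$with_gaps"
```

The output has a single empty line between the rows starting with 1200 and 9000, which is where a plot with lines would be interrupted.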

Of course, you can always change your data and, e.g., insert empty lines (as @Wrzlprmft suggested) where data is missing, which will interrupt your line.
With large datasets and a lot of "breaks" this would be painful if you have to do it manually.
I would say that there is a solution without changing your data.
Let me ask: "What do you consider as missing data?"
My assumption would be: you have e.g. a data logger which takes values every 10 minutes.
If for some reason the logger did not take some data there will be a "gap" of missing data.
Now, you can define what you consider as a gap, e.g. >1 hour of no data would be a gap.
Hence, you simply compare two consecutive values t0 and t1, and if the difference is larger than your gap you change the line color from whatever color to transparent (according to the scheme 0xaarrggbb). Check help linecolor variable and help colorspec.
Script:
### don't show line in missing data gaps
reset session
myFmt = "%Y-%m-%d %H:%M"
# create some random test data
set print $Data
tStart = "2016-09-27"
tEnd = "2016-10-10"
t0 = strptime(myFmt,tStart)
t1 = strptime(myFmt,tEnd)
y0 = 100
do for [t=t0:t0+(t1-t0)*0.2:600] { print sprintf("%s %g",strftime(myFmt,t),y0=y0+(rand(0)-0.5)) }
do for [t=t0+(t1-t0)*0.3:t0+(t1-t0)*0.5:600] { print sprintf("%s %g",strftime(myFmt,t),y0=y0+(rand(0)-0.5)) }
do for [t=t0+(t1-t0)*0.8:t0+(t1-t0):600] { print sprintf("%s %g",strftime(myFmt,t),y0=y0+(rand(0)-0.5)) }
set print
set format x "%d.%m." timedate
gap = 3600 # 1 hour
myColor(tCol,color) = (t0=t1, t1=timecolumn(tCol,myFmt), t1-t0>gap ? 0xff123456 : color)
set multiplot layout 2,1
plot $Data u (timecolumn(1,myFmt)):3 w l lc rgb 0xff0000 ti "data as is"
plot t1=NaN $Data u (timecolumn(1,myFmt)):3:(myColor(1,0x0000ff)) w l lc rgb var ti "with removed gaps"
unset multiplot
### end of script
Result:

Gnuplot plotting wrong lines and some strange values as well

I am using gnuplot to postprocess some calculations that I have done, and I am having a hard time getting gnuplot to select the right lines: it outputs some strange values and I do not know where they come from.
The first 200 points of the results start in line 3 and stop in 202 but that is not working when I use every ::3::202.
Does anyone have any suggestions of what I am doing wrong?
Gnuplot image:
Datafile
set terminal pngcairo transparent nocrop enhanced size 3200,2400 font "arial,40"
set output "Mast41_voltage_muffe.png"
set key right
set samples 500, 500
set xzeroaxis ls 1 lt 8 lw 3
set style line 12 lc rgb '#808080' lt 0 lw 1
set style line 13 lt 0 lw 3
set grid back ls 12
set decimalsign '.'
set datafile separator whitespace
set ylabel "Spenna [pu]"
set xlabel "Timi [s]"
plot "mrunout_01.out" every ::3::202 using 2:3 title '5 ohm' with lines lw 3 linecolor rgb '#D0006E',\
"mrunout_01.out" every ::203::402 using 2:3 title '10 ohm' with lines lw 3 linecolor rgb '#015DD4',\
"mrunout_01.out" every ::403::602 using 2:3 title '15 ohm' with lines lw 3 linecolor rgb '#F80419',\
"mrunout_01.out" every ::603::802 using 2:3 title '20 ohm' with lines lw 3 linecolor rgb '#07826A'
unset output
unset zeroaxis
unset terminal

every refers to the actual plottable points. In your case, you have to skip 2 lines and the bunch of data at the end of your datafile.
Since you know the actual lines you need to plot I would pre-parse the file with some external tools like sed
So you can omit the every and your plot line becomes:
plot "< sed -n '3,202p' mrunout_01.out" using 2:3 title '5 ohm' with lp lw 3 linecolor rgb '#D0006E'
With your datafile as it is, gnuplot has problems reading it. It can't even run stats on it:
stats 'mrunout_01.out'
bad data on line 1 of file mrunout_01.out
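The sed pre-filter selects file lines by number, counted from 1 and inclusive at both ends. A quick check, with seq standing in for the data file (an assumption made only to have numbered lines):

```shell
# seq prints 1..1000, one number per line, standing in for mrunout_01.out
selected=$(seq 1 1000 | sed -n '3,202p')
# First and last selected line, and how many lines were kept
first=$(printf '%s\n' "$selected" | head -n 1)
last=$(printf '%s\n' "$selected" | tail -n 1)
count=$(printf '%s\n' "$selected" | wc -l | tr -d ' ')
echo "$first $last $count"
```

This should print 3 202 200, i.e. exactly the 200 lines of the first curve; the other curves follow with '203,402p', '403,602p' and '603,802p'.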

There is no need for using external tools, you can simply do it with gnuplot.
It's advantageous with your data that it is regular, every 200 points plotted in a different color.
And the data you want to plot is separated by one empty line from some additional data at the end of the file which you don't want to plot.
So, you simply address the 4th set of 200 lines in the 0th block via every ::600:0:799:0.
From help every:
Syntax:
plot 'file' every {<point_incr>}
{:{<block_incr>}
{:{<start_point>}
{:{<start_block>}
{:{<end_point>}
{:<end_block>}}}}}
Comments:
you can skip two lines at the beginning of the files with skip 2
you can plot your curves in a loop plot for [i=1:4] ...
you can define your color myColor(n) via index n from a string "#D0006E #015DD4 #F80419 #07826A"
you can define the legend myTitle(n) also from a list "5 10 15 20"
Script: (tested with gnuplot 5.0.0, version at the time of OP's question)
### plot parts of a file in a loop
reset session
FILE = "SO36103041.dat"
myColor(n) = word("#D0006E #015DD4 #F80419 #07826A",n)
myTitle(n) = word("5 10 15 20",n)
set xlabel "Timi [s]"
set ylabel "Spenna [pu]"
set yrange[0:30]
plot for [i=1:4] FILE u 2:3 skip 2 every ::((i-1)*200):0:(200*i-1):0 \
w l lw 3 lc rgb myColor(i) ti myTitle(i)
### end of script
Result:
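The start and end points in the plot for loop follow from simple arithmetic on the loop index, with points counted from 0. A small shell loop (purely illustrative, not needed for plotting) prints the every specifier that each iteration evaluates to:

```shell
# Reproduce the arithmetic of every ::((i-1)*200):0:(200*i-1):0 for i = 1..4
ranges=$(for i in 1 2 3 4; do
  start=$(( (i - 1) * 200 ))
  end=$(( 200 * i - 1 ))
  echo "i=$i: every ::$start:0:$end:0"
done)
echo "$ranges"
```

The last line printed is i=4: every ::600:0:799:0, matching the specifier quoted above for the 4th set of 200 lines.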

Plotting from two data sets delimited two different ways

I need to plot data from a .csv file and from a white space separated file. Both sets of data need to appear on the same plot.
data1.dat
#t y
1 1
2 1
3 1
and
data2.csv
#t,y
1,2
2,2
3,2
normally I would do the following if both were .csv sets:
set datafile separator ','
plot 'data1.csv' using 1:2,'data2.csv' using 1:2
Is there some way to include the setting of the separation character in the plot statement?
plot 'data1.dat' using 1:2,'data2.csv' using datafile separator ',' using 1:2
The above does not work and I tried many different variations of the above code....I had no luck.

You can give more than one character to set datafile separator, in your case ", ". Each of these characters is then treated as a separator on its own. (A tab character can be given as "\t"; you need to put double quotes around the string then!)
$dat << EOD
1 2,4
2 2,5
3 1,6
4 4,4
EOD
set xr [0.5:4.5]
set dataf sep ", "
plot $dat us 1:2:3 w yerrorbars
Note that the explicit separator characters each count as exactly one separator. "4, 4" with set dataf sep ", " evaluates to three columns, the second of which is a missing value. If you have incompatible formats in one plot, you can import the data for each subplot with its own separator setting using set table $<name> (check help datablocks).
If your data files have a very difficult format:
gnuplot's using specifier accepts a libc scanf() format string:
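The one-character-one-separator rule can be illustrated outside gnuplot. As an analogy only (awk is not what gnuplot uses internally), awk with a single-character class as field separator splits the same string the same way:

```shell
# Every comma and every space is a separator of its own, so ", " yields an empty field
nfields=$(printf '%s\n' '4, 4' | awk -F'[, ]' '{ print NF }')
echo "$nfields"
```

The count is 3: the two values plus one empty field between the comma and the space.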
plot "-" us 1:2 "%lf,%lf"
1,2
2,3
3,4
e
You can give a different format string for every file on your plot command. Note that gnuplot only accepts "double" fp numbers for input, so you have to use the %le or %lf specifier.
Check help using examples, and here is a full description of the format.

AFAIK, there isn't a way to specify the separator. However, if you're in a POSIX compliant environment (and your gnuplot supports pipes -- which most do), you can farm the work out to awk pretty easily:
plot 'data1.dat' using 1:2,\
"<awk -F, '{print $1,$2}' data2.csv" using 1:2
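To check what the awk filter hands to gnuplot, it can be run by itself on the contents of data2.csv from the question (the variable name converted is only illustrative):

```shell
# data2.csv from the question, fed through the same awk program
converted=$(printf '%s\n' '#t,y' '1,2' '2,2' '3,2' | awk -F, '{print $1,$2}')
echo "$converted"
```

The data rows come out whitespace-separated (1 2, 2 2, 3 2), and the header row still starts with #, which gnuplot by default treats as a comment.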

Not just for "retro"-fun, but also for current gnuplot versions, I guess this is the only(?) gnuplot-only solution which works with all versions even with versions before the time of OP's question.
The "trick" is: if you set datafile separator "," and read the first (and only) string column of a whitespace-separated file, strcol(1) will contain the full line. Now you can simply split the string with word() and convert the parts into floating point numbers with real().
If your original data has at least one space after the comma,
1, -0.2
2, -0.1
3, 0.0
the data would be plotted correctly with keeping the separator as whitespace, since the comma after the first column's data will be ignored during number interpretation.
For newer gnuplot versions (>=4.6.7, Apr 2015) you can define several separator characters. However, this will not work as you might think, because
set datafile separator ", "
will interpret each space as column separator. So, if you have an undefined and variable number of spaces your plot command would fail.
Anyway, here is the "always" working solution:
Data:
SO14262760_1.dat (with variable number of spaces)
1 -0.1
2 0.0
3 +0.1
SO14262760_2.dat (with no or some space after ,)
1,-0.2
2,-0.1
3, 0.0
Script: (works with gnuplot>=4.4.0, March 2010)
### different column separators in two files with one plot command
reset
FILE1 = "SO14262760_1.dat"
FILE2 = "SO14262760_2.dat"
set datafile separator ","
myCol(n) = real(word(strcol(1),n))
plot FILE1 u (myCol(1)):(myCol(2)) w lp pt 7 lc rgb "red", \
FILE2 u 1:2 w lp pt 7 lc rgb "blue"
### end of script
Result:
