Plot CSV with description column and multiple rows - gnuplot

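I've got a CSV file with the content below. Each row consists of a title followed by 10 values, and each row should become one line in the chart.
The column count is known (although it would be great if only the title column were fixed and the number of values could be open, as long as it is the same for all rows); the row count is open.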
Test1;0,051;0,040;0,051;0,052;0,051;0,049;0,051;0,052;0,059;0,044
Test2;0,016;0,016;0,016;0,019;0,021;0,021;0,021;0,021;0,022;0,022
Test3;0,216;0,200;0,210;0,205;0,205;0,205;0,203;0,206;0,205;0,204
Result in LibreOffice:
Now I want to plot it with gnuplot.
I tried it with:
set xrange [1:10]
set yrange [0:2]
plot for [row=0:*] 'test.csv' matrix every :::row::row with lines
but that only gives me several warning messages:
"gnuplot.txt" line 7: warning: matrix contains missing or undefined values
"gnuplot.txt" line 7: warning: matrix contains missing or undefined values
"gnuplot.txt" line 7: warning: matrix contains missing or undefined values
"gnuplot.txt" line 7: warning: matrix contains missing or undefined values
It would be nice if someone could give me a hint.

Unfortunately, gnuplot doesn't like data in rows and doesn't have a transpose function. I guess there is a way to plot rows with matrix as you tried, but currently I don't see a direct way to include the row headers as key entries.
So, either you use an external tool and transpose your data or you use the cumbersome gnuplot-transpose-attempt below and then plot columns. At least, this should work for Linux as well as for Windows without the installation of extra tools.
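For the external-tool route, a minimal Python sketch could look like the following. It is only an illustration and not part of the original answer: the function name transpose_csv is made up here, the sample is shortened to the first three values of each row from the question, and it assumes that all rows have the same number of fields and that no title contains a comma.

```python
import csv
import io


def transpose_csv(text, delimiter=";"):
    """Transpose delimited text so that each input row becomes a column."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    # zip(*rows) yields the columns of the input, one tuple per column.
    for column in zip(*rows):
        # Replace the decimal comma so gnuplot's default decimal sign works.
        writer.writerow([cell.replace(",", ".") for cell in column])
    return out.getvalue()


# Shortened sample: the first three values of each row from the question.
sample = (
    "Test1;0,051;0,040;0,051\n"
    "Test2;0,016;0,016;0,016\n"
    "Test3;0,216;0,200;0,210\n"
)
transposed = transpose_csv(sample)
print(transposed)
```

The first output line then holds the row titles (Test1;Test2;Test3), so after transposing they can serve as column headers when plotting columns.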
Your decimal sign is ,. Since my standard decimalsign is ., I had to set:
set decimalsign locale 'french' # or 'german' might also work
Data: test.csv
Test1;0,051;0,040;0,051;0,052;0,051;0,049;0,051;0,052;0,059;0,044
Test2;0,016;0,016;0,016;0,019;0,021;0,021;0,021;0,021;0,022;0,022
Test3;0,216;0,200;0,210;0,205;0,205;0,205;0,203;0,206;0,205;0,204
Code:
### plot row data with rowheader as key entry
reset session
myFile = 'test.csv'
set datafile separator ';'
set decimalsign locale 'french' # or 'german' should also work
# transpose data
stats myFile u 0 nooutput
set table $Dummy
set print $DataTransposed
do for [i=1:STATS_columns] {
LINE = ''
do for [j=0:STATS_records-1] {
plot myFile u (a=stringcolumn(i)) every ::j::j with table
LINE = LINE.sprintf('%s', j < STATS_records-1 ? a.";" : a)
}
print LINE
}
set print
unset table
undefine $Dummy
plot for [i=1:STATS_records] $DataTransposed u ($0+1):i w l ti columnheader
### end of code

Related

gnuplot: simple beeswarm example

I have been struggling with a basic beeswarm plot from page 62 in this doc. I imagine they are skipping some details, and I'm not sure what actual data they used. I think in particular the problem is mapping a categorical/string variable to an X-axis value.
I used this data:
A 1
A 2
A 3
B 4
B 5
B 6
With this script:
set terminal png
set output "graph.png"
set jitter
plot "data.csv" using 1:2:1 with points lc variable
I get this error:
"graph_script" line 4: warning: Skipping data file with no valid points
plot "data.csv" using 1:2:1 with points lc variable
^
"graph_script" line 4: x range is invalid
In their demos gallery, I see something like set xtics ("A" -1, "B" 0) which could maybe help me to label already-numeric data better, but what if my data doesn't start off numeric to begin with?
Do I need something like (hash_string_to_large_int($1) % 2)? There must be an easier way!
As mentioned in the comments you have to "convert" your keys into numbers in order to plot them.
You can do this by creating a list with your unique keywords and defining a function to get the indices.
First, the following example creates some random data.
The code after that knows nothing about the keywords, so it builds the unique list from scratch from the random data.
Maybe there is a simpler gnuplot-only solution that I am not aware of.
Code:
### bee-swarm plot with string keys
reset session
# create some random test data
myExts = '.py .sh .html'
set print $Data
do for [i=1:100] {
print sprintf("%s %d",word(myExts,int(rand(0)*3)+1),int(rand(0)*10+1)*5)
}
set print
# create a unique list of strings from a data stringcolumn
Uniques = ''
addToList(list,col) = list.( strstrt(list,'"'.strcol(col).'"') > 0 ? '' : ' "'.strcol(col).'"')
stats $Data u (Uniques = addToList(Uniques,1),0) nooutput
getIdx(key) = (_idx=NaN, sum [_i=1:words(Uniques)] (word(Uniques,_i) eq key ? _idx=_i : 0), _idx)
set offsets 0.5,0.5,1,1
set key noautotitle
set multiplot layout 1,2
set title "No jitter"
plot $Data u (idx=getIdx(strcol(1))):2:(idx):xtic(word(Uniques,idx)) w points pt 7 lc var
set title "With jitter"
set jitter
replot
unset multiplot
### end of code

Getting plot title and caption data from the data file

Consider the following file that I want to plot using gnuplot: Servos20211222_105253.csv
# Date/Time 2021/12/22, 10:52:53
# PonE=0,LsKp=200,LsKi=0,LsKd=250,HsKp=40,HsKi=0,HsKd=130,Sp=800,TDEC=1175137
#
# Rel. Time, currentPos, PosPID, currentSpeed, speedPID, Lag, ServoPos
0.00000,4693184,0,0,0,0,4693184
0.00000,4693184,2300,0,368,0,4693184
0.00391,4693185,2300,12,367,0,4693184
:
:
I would like to:
set the plot title to the date/time from the first comment record.
display the record that starts "# PonE" as a caption.
extract the value for TDEC and plot a horizontal line with the name "Target"
I have some influence over the format of the header records, so if (for example) it would be better that they were not comments but provided in some other way, then that can be done.
It is a common problem to get text values from files using only gnuplot. If you can use OS- and shell-dependent solutions, I'd suggest removing the comments from the file and trying something like
set title "`head -1 Servos20211222_105253.csv`"
You can place text anywhere using set label <"label text">, where the label text can be the 2nd line from the file.
You can plot a straight line using plot:
p sin(x), 0.5 title "TDEC"
But instead of 0.5, you need to get the value using shell scripts again, e.g. the cut unix command.
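If a scripting language is available, the same extraction can be done without head and cut. The following Python sketch is only an illustration under assumptions: parse_header and the dictionary keys are hypothetical names, and it assumes the header keeps the layout shown in the question (a line containing Date/Time and a line starting with PonE that ends in a TDEC=... entry).

```python
def parse_header(lines):
    """Collect title, caption and TDEC from the leading '#' lines."""
    info = {"title": None, "caption": None, "tdec": None}
    for line in lines:
        if not line.startswith("#"):
            break  # the header ends at the first data record
        text = line.lstrip("#").strip()
        if text.startswith("Date/Time"):
            info["title"] = text[len("Date/Time"):].strip()
        elif text.startswith("PonE"):
            info["caption"] = text
            # Split the "key=value" pairs and read TDEC as a number.
            params = dict(item.split("=", 1) for item in text.split(","))
            info["tdec"] = float(params["TDEC"])
    return info


# Header copied from the question, followed by one data record.
sample = [
    "# Date/Time 2021/12/22, 10:52:53",
    "# PonE=0,LsKp=200,LsKi=0,LsKd=250,HsKp=40,HsKi=0,HsKd=130,Sp=800,TDEC=1175137",
    "#",
    "# Rel. Time, currentPos, PosPID, currentSpeed, speedPID, Lag, ServoPos",
    "0.00000,4693184,0,0,0,0,4693184",
]
info = parse_header(sample)
# Emit gnuplot commands that a plotting script could read in.
print('set title "%s"' % info["title"])
print("TDEC = %.0f" % info["tdec"])
```

The two printed lines are gnuplot commands. Written to a file, they could be read with load before plotting TDEC as the horizontal "Target" line.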
There are ways with gnuplot only, although sometimes a bit cumbersome compared with using tools which you have available on Linux (or comparable tools which you need to install on Windows).
Update: shorter and "simplified" script
One possible gnuplot-only way:
set commentschar to nothing, i.e. ''
assign the columns to variables and/or arrays, e.g. myDate, myTime, P[1..9].
Merge P[1..8] into a multi-line string Params by "mis"-using sum (check help sum)
Convert P[9] into a floating point number TDEC for plotting
Script: (modified the data a bit just for illustration)
### extract values from headers with gnuplot only
reset session
$Data <<EOD
# Date/Time 2021/12/22, 10:52:53
# PonE=0,LsKp=200,LsKi=0,LsKd=250,HsKp=40,HsKi=0,HsKd=130,Sp=800,TDEC=1175137
#
# Rel. Time, currentPos, PosPID, currentSpeed, speedPID, Lag, ServoPos
0.00000,1300000,0,0,0,0,4693184
0.00200,1200000,2300,0,368,0,4693184
0.00391,1100000,2300,12,367,0,4693184
EOD
set datafile separator comma commentschar ''
array P[9] # array to store parameters
stats $Data u ($0==0 ? (myDate=strcol(1)[3:], myTime=strcol(2)) : \
sum [_i=1:9] (P[_i] = _i==1 ? strcol(_i)[3:] : strcol(_i) ,0 )) \
every ::0::1 nooutput
set datafile commentschar # set back to default
Params = P[1]
Params = (sum [_i=2:8] (Params=Params.sprintf("\n%s",P[_i]),0),Params)
set title sprintf("%s %s", myDate, myTime)
TDEC = real(P[9][6:]) # convert to real number
set label 1 at graph 0.02, first TDEC P[9] offset 0,-0.7
set label 2 at graph 0.02, graph 0.85 Params
plot $Data u 1:2 w lp pt 7 title "Data", \
TDEC w l lc "red" title "Target"
### end of script

max value for same minute over multiple days from csv with unix timestamps

I have a CSV file with a unix timestamp column that was collected over multiple days, with one data row every 5 minutes (the output log of my photovoltaic roof power plant).
I'd like to create a plot for 24 hours that shows the maximum value for every single (fifth) minute over all days.
Can this be done with gnuplot's own capabilities, or do I have to do the processing outside gnuplot via scripts?
You don't show what your exact data structure looks like. - theozh
These files are rather large. I placed an example here:
http://www.filedropper.com/log-pv-20190607-20190811 (300kB)
I'm especially interested in columns 4 (DC1 P) and 9 (DC2 P).
Column 1 (Zeit) holds the unix timestamp.
The final goal is separate graphs (colors) for DC1 P and DC2 P, but that's a different question... ;o)
Update/Revision:
After revisiting this answer, I guess it is time for a cleanup and a simpler, extended solution. After some iterations and clarifications, and after the OP provided some data (although the link is no longer valid), I came up with some suggestions, which can still be improved.
You can do all in gnuplot, no need for external tools!
The original request to plot the maximum values from several days is easy if you use the plotting style with boxes. But this is basically only a graphical solution. In that case it was apparently sufficient. However, if you are interested in the maximum values as numbers, it takes a little more effort.
gnuplot has the options smooth unique and smooth frequency (check help smooth). With these you can easily get the average and the sum, respectively, but there is no smooth max or smooth min. As @meuh suggested, you can get the maximum or minimum with arrays, which are available since gnuplot 5.2.0.
Script: (Requires gnuplot>=5.2.0)
### plot time data modulo 24h avg/sum/min/max
reset session
FILE = 'log-pv-20190607-20190811.csv'
set datafile separator comma
HeaderCount = 7
myTimeFmt = "%Y-%m-%d %H:%M:%S"
StartTime = ''
EndTime = ''
# if you don't define start/end time it will be taken automatically
if (StartTime eq '' || EndTime eq '') {
stats FILE u 1 skip HeaderCount nooutput
StartTime = (StartTime eq '' ? STATS_min : strptime(myTimeFmt,StartTime))
EndTime = (EndTime eq '' ? STATS_max : strptime(myTimeFmt,EndTime))
}
Modulo24Hours(t) = (t>=StartTime && t<=EndTime) ? (int(t)%86400) : NaN
set key noautotitle
set multiplot layout 3,2
set title "All data" offset 0,-0.5
set format x "%d.%m." timedate
set grid x,y
set yrange [0:]
myHeight = 1./3*1.1
set size 1.0,myHeight
plot FILE u 1:4:(tm_mday($1)) skip HeaderCount w l lc var
set multiplot next
set title "Data per 24 hours"
set format x "%H:%M" timedate
set xtics 3600*6
set size 0.5,myHeight
plot FILE u (Modulo24Hours($1)):4:(tm_mday($1)) skip HeaderCount w l lc var
set title "Average"
set size 0.5,myHeight
plot FILE u (int(Modulo24Hours($1))):4 skip HeaderCount smooth unique w l lc "web-green"
set title "Sum"
set size 0.5,myHeight
plot FILE u (int(Modulo24Hours($1))):4 skip HeaderCount smooth freq w l
set title "Min/Max"
set size 0.5,myHeight
N = 24*60/5
SecPerDay = 3600*24
array Min[N]
array Max[N]
do for [i=1:N] { Min[i]=NaN; Max[i]=0 } # initialize arrays
stats FILE u (idx=(int($1)%SecPerDay)/300+1, $4>Max[idx] ? Max[idx]=$4:0, \
Min[idx]!=Min[idx] ? Min[idx]=$4 : $4<Min[idx] ? Min[idx]=$4:0 ) skip HeaderCount nooutput
plot Min u ($1*300):2 w l lc "web-blue", \
Max u ($1*300):2 w l lc "red"
unset multiplot
### end of script
From gnuplot 5.2 you could use the new array datatype to calculate a maximum value for each 5 minute slot. I am not a gnuplot expert, so the following example needs more work, but shows the potential.
Assume data is similar to these lines, where there is a date in the format
yyyy.mm.dd.HH:MM, a comma and a y value:
2018.02.03.18:23,4
2018.02.03.19:23,7
2018.02.04.18:23,8
2018.02.05.19:23,11
Instead of using gnuplot's built-in time parsing, since we want to ignore the date, we create a function fsecs to use substr(stringcolumn(...),12,16) to get just the hours and minutes from data column 1, and strptime("%H:%M",...) to convert this to seconds:
set datafile separator ","
fsecs(v) = strptime("%H:%M",substr(stringcolumn(v),12,16))
We create an array Max indexed by "5 minute slot", of which there are 24*60/5 per day. It is initialised to NaN, not-a-number.
Nitems = int(24*60/5)+1
array Max[Nitems]
do for [i=1:Nitems] {
Max[i] = NaN
}
We then "plot" the data file data.csv into a dummy table, rather than generating any output. As we go through the data we index Max by the data x value (column 1) converted to seconds by fsecs(1) and then to slot by findex(). This is Max[findex(fsecs(1))].
We call our function fmax() to return the new maximum to set in the array.
findex(x) = int(((x)/60)/5) + 1 # +1 because gnuplot array indices start at 1
fmax(a,b) = ((a>=b)?a:b)
set table $Dummy
plot 'data.csv' using \
(Max[findex(fsecs(1))] = fmax(Max[findex(fsecs(1))],$2)):2
unset table
Finally, we plot the array, which is the slot number against the value held in that slot number.
plot Max using 1:(Max[$1]) with points lw 2 title "max day"
This works for me on 5.2. You still need to label the x axes with HH:MM, and change the date parsing to fit your needs.
For time formatting, please see Gnuplot date/time in x axis
If you do not care about formatting as time, you may use the every keyword (see the gnuplot documentation), but that does not compute a maximum or anything similar.
For the maximum value over a given time interval I suggest an awk script, see e.g. https://unix.stackexchange.com/a/207287/297901
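The same aggregation can also be sketched in Python instead of awk. This is only an illustration: max_per_slot is a hypothetical helper, the four records are made-up values rather than data from the log file, and the time of day is taken in UTC (no time zone handling).

```python
SLOT_SECONDS = 300        # one record every 5 minutes
SECONDS_PER_DAY = 86400


def max_per_slot(records):
    """Map each 5-minute slot of the day (in seconds) to its maximum value."""
    best = {}
    for timestamp, value in records:
        # Seconds since midnight, rounded down to the start of the slot.
        slot = (int(timestamp) % SECONDS_PER_DAY) // SLOT_SECONDS * SLOT_SECONDS
        if slot not in best or value > best[slot]:
            best[slot] = value
    return dict(sorted(best.items()))


# Made-up records: (unix timestamp, value) on two consecutive days.
records = [
    (1559894400, 10.0),   # day 1, 08:00 UTC
    (1559894700, 12.0),   # day 1, 08:05 UTC
    (1559980800, 15.0),   # day 2, 08:00 UTC
    (1559981100, 9.0),    # day 2, 08:05 UTC
]
result = max_per_slot(records)
for slot, value in result.items():
    print("%d,%g" % (slot, value))
```

Each printed line is "seconds since midnight,maximum", which gnuplot can read as a two-column comma-separated file.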

Plotting CSV data with negative numbers using GNUplot

I'm new to GNUplot, and I'd like to know how to plot from a .csv or .txt file. I have written some code, but I always get an error like this:
line 10: warning: Skipping data file with no valid points
line 10: x range is invalid
and I don't know why,
I've included the code below, along with some of the data I want to plot.
I'd like to plot column 1 as x and column 2 as y.
# program to plot CSV text files
set title '<Grafica de datos Sensor Tag 1>'
set ylabel '<Muestras>'
set xlabel '<mG>'
set grid
set datafile separator ";"
set autoscale fix
set output '<Grafica ST1>.png'
set xrange [*:]
plot "sensor12.csv" using 2 with lines
#plot "sensor1.txt" u (column(0)):2:xtic(1) w l title "", "sensor1.txt" u (column(0)):3:xtic(1) w l title "", "sensor1.txt" u (column(0)):4:xtic(1) w l title ""
pause -1
I hope someone could help me
thanks
RH
1547277933.638079,15.869141,136.718750,1019.287109
1547277933.640045,14.160156,134.765625,1014.160156
1547277933.642001,12.695312,132.812500,1011.962891
1547277933.643822,14.404297,135.742188,1018.554688
1547277933.645711,10.742188,134.765625,1016.845703
1547277933.647611,12.939453,133.056641,1022.705078
1547277933.649441,18.310547,132.324219,1012.939453
1547277933.651419,13.916016,134.033203,1017.822266
1547277933.653344,12.695312,134.521484,1015.869141
Your original data image was a table screenshot, which led Christoph to infer that your datafile separator was whitespace.
However, your text file data is actually comma separated. As Thor mentioned, please always provide data as text.
So, replace your line set datafile separator ";"
with
set datafile separator ","
or
set datafile separator comma
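When it is unclear which separator a file uses, a quick check outside gnuplot can help. The sketch below is only an illustration: guess_separator is a hypothetical helper that counts candidate characters in one record, and it would be misleading for files that use a decimal comma.

```python
def guess_separator(line, candidates=(";", ",", "\t")):
    """Return the candidate character that occurs most often in the line."""
    return max(candidates, key=line.count)


# First data record from the question.
first_record = "1547277933.638079,15.869141,136.718750,1019.287109"
separator = guess_separator(first_record)
print(repr(separator))
```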

gnuplot setting line titles by variables

I am trying to plot multiple data lines with their titles in the key based on the variable which I am using as the index:
plot for [i=0:10] 'filename' index i u 2:7 w lines lw 2 t ' = '/(0.5*i)
However, it cannot seem to do this for a fractional multiple of i. Is there a way around this other than to set the title for each line separately?
sprintf should provide all the functionality needed, e.g.,
plot for [i=0:10] .... t sprintf(" = %.1f", 0.5*i)
in order to use the value of 0.5*i with 1 decimal digit...
