Count columns in csv in gnuplot - gnuplot

Is there a function in gnuplot which returns the number of columns in a csv file?
I can't find anything in the docs, maybe someone can propose a custom made function for this?

As of gnuplot 4.6, you can write a small hack script to do this. It is certainly not the most efficient approach, but it is pure gnuplot:
#script col_counter.gp
col_count=1
good_data=1
while (good_data){
stats "$0" u (valid(col_count))
if ( STATS_max ){
col_count = col_count+1
} else {
col_count = col_count-1
good_data = 0
}
}
Now in your main script,
call "col_counter.gp" "my_datafile_name"
print col_count #number of columns is stored in col_count.
This has some limitations -- It will choke if you have a column in the datafile that is completely non-numeric followed by more valid columns for example, but I think that it should work for many typical use cases.
As a final note, you can use the environment variable GNUPLOT_LIB and then you don't even need to have col_counter.gp in the current directory.

Assuming this is related to this question, and that the content of infile.csv is:
n,John Smith stats,Sam Williams stats,Joe Jackson stats
1,23.4,44.1,35.1
2,32.1,33.5,38.5
3,42.0,42.1,42.1
You could do it like this:
plot.gp
nc = "`awk -F, 'NR == 1 { print NF; exit }' infile.csv`"
set key autotitle columnhead
set datafile separator ','
plot for [i=2:nc] "< sed -r '1 s/,([^ ]+)[^,]+/,\\1/g' infile.csv" using 1:i with lines
Note that the \1 needs escaping (written as \\1) when used within double quotes in gnuplot.

Here is an update and an alternative extended retro-workaround: (of course gnuplot-only)
Update: (gnuplot>=5.0.0, Jan 2015)
Since gnuplot 5.0.0, there is the variable STATS_columns, which tells you the number of columns in the first uncommented row.
stats FILE u 0 nooutput
print STATS_columns
Extended retro-workaround: (gnuplot>=4.6.0, March 2012)
Some time ago, I learnt that a correct CSV file should have the same number of columns (i.e. commas) in all rows. So it should be sufficient to "count" the commas in the first uncommented row. That's apparently what gnuplot>=5.0.0 is doing more or less.
However, in case you have an "incorrect CSV" with varying columns and you are interested in the minimum and maximum number of columns, you can use the following script, assuming that there are no (doublequoted) strings having a comma inside. Note, row indices are 0-based.
Data: SO13373206.dat
11, 12, 13, 14, 15, 16, 17
21, 22, 23, 24, 25, 26, 27, 28
31, 32, 33, 34, 35, 36, 37, 38, 39
41, 42, 43, 44, 45, 46
Script:
### count number of columns (gnuplot>=4.6.0)
reset
FILE = "SO13373206.dat"
countCommas(s) = sum[i=1:strlen(s)] ( s[i:i] eq ',' ? 1 : 0)
set datafile separator "\t" # in order to read a row as one string
stats FILE u (colCount=countCommas(strcol(1))+1,0) every ::0::0 nooutput
print sprintf("number of columns in first row: %d", colCount)
colMin = colMax = rMin = rMax = NaN
stats FILE u (c=countCommas(strcol(1))+1, \
c<colMin || colMin!=colMin ? (colMin=c,rMin=$0) : 0, \
c>colMax || colMax!=colMax ? (colMax=c,rMax=$0) : 0 ) nooutput
print sprintf("minimum %d columns in row %d",colMin, rMin)
print sprintf("maximum %d columns in row %d",colMax, rMax)
set datafile separator "," # restore separator
# ... plot something
### end of script
Result:
number of columns in first row: 7
minimum 6 columns in row 3
maximum 9 columns in row 2
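As a cross-check outside gnuplot, the same comma-counting idea can be sketched in Python. This is only an illustration under the same assumption as above (no quoted strings containing a comma); the function name column_counts and the inline sample are hypothetical and simply mirror SO13373206.dat.

```python
import io

def column_counts(lines, sep=",", comment="#"):
    """Return (row_index, n_columns) for every non-blank, non-comment row."""
    counts = []
    row = 0  # 0-based row index, as in the gnuplot script
    for line in lines:
        text = line.strip()
        if not text or text.startswith(comment):
            continue
        # number of columns = number of separators + 1
        counts.append((row, text.count(sep) + 1))
        row += 1
    return counts

# Inline copy of the sample data (assumption: same content as SO13373206.dat)
sample = io.StringIO(
    "11, 12, 13, 14, 15, 16, 17\n"
    "21, 22, 23, 24, 25, 26, 27, 28\n"
    "31, 32, 33, 34, 35, 36, 37, 38, 39\n"
    "41, 42, 43, 44, 45, 46\n"
)
counts = column_counts(sample)
first = counts[0][1]
row_min, col_min = min(counts, key=lambda rc: rc[1])
row_max, col_max = max(counts, key=lambda rc: rc[1])
print(first, (col_min, row_min), (col_max, row_max))  # 7 (6, 3) (9, 2)
```

The numbers agree with the gnuplot result: 7 columns in the first row, a minimum of 6 in row 3 and a maximum of 9 in row 2.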


python3: Split time series by diurnal periods

I have the following dataset:
01/05/2020,00,26.3,27.5,26.3,80,81,73,22.5,22.7,22.0,993.7,993.7,993.0,0.0,178,1.2,-3.53,0.0
01/05/2020,01,26.1,26.8,26.1,79,80,75,22.2,22.4,21.9,994.4,994.4,993.7,1.1,22,2.0,-3.54,0.0
01/05/2020,02,25.4,26.1,25.4,80,81,79,21.6,22.3,21.6,994.7,994.7,994.4,0.1,335,2.3,-3.54,0.0
01/05/2020,03,23.3,25.4,23.3,90,90,80,21.6,21.8,21.5,994.7,994.8,994.6,0.9,263,1.5,-3.54,0.0
01/05/2020,04,22.9,24.2,22.9,89,90,86,21.0,22.1,21.0,994.2,994.7,994.2,0.3,268,2.0,-3.54,0.0
01/05/2020,05,22.8,23.1,22.8,90,91,89,21.0,21.4,20.9,993.6,994.2,993.6,0.7,264,1.5,-3.54,0.0
01/05/2020,06,22.2,22.8,22.2,92,92,90,20.9,21.2,20.8,993.6,993.6,993.4,0.8,272,1.6,-3.54,0.0
01/05/2020,07,22.6,22.6,22.0,91,93,91,21.0,21.2,20.7,993.4,993.6,993.4,0.4,284,2.3,-3.49,0.0
01/05/2020,08,21.6,22.6,21.5,92,92,90,20.2,20.9,20.1,993.8,993.8,993.4,0.4,197,2.1,-3.54,0.0
01/05/2020,09,22.0,22.1,21.5,92,93,92,20.7,20.8,20.2,994.3,994.3,993.7,0.0,125,2.1,-3.53,0.0
01/05/2020,10,22.7,22.7,21.9,91,92,91,21.2,21.2,20.5,995.0,995.0,994.3,0.0,354,0.0,70.99,0.0
01/05/2020,11,25.0,25.0,22.7,83,91,82,21.8,22.1,21.1,995.5,995.5,995.0,0.8,262,1.5,744.8,0.0
01/05/2020,12,27.9,28.1,24.9,72,83,70,22.3,22.8,21.6,996.1,996.1,995.5,0.7,228,1.9,1392.,0.0
01/05/2020,13,30.4,30.4,27.7,58,72,55,21.1,22.6,20.4,995.9,996.2,995.9,1.6,134,3.7,1910.,0.0
01/05/2020,14,31.7,32.3,30.1,50,58,48,20.2,21.3,19.7,995.8,996.1,995.8,3.0,114,5.4,2577.,0.0
01/05/2020,15,32.9,33.2,31.8,44,50,43,19.1,20.5,18.6,994.9,995.8,994.9,0.0,128,5.6,2853.,0.0
01/05/2020,16,33.2,34.4,32.0,46,48,41,20.0,20.0,18.2,994.0,994.9,994.0,0.0,125,4.3,2700.,0.0
01/05/2020,17,33.1,34.5,32.7,44,46,39,19.2,19.9,18.5,993.4,994.1,993.4,0.0,170,1.6,2806.,0.0
01/05/2020,18,33.6,34.2,32.6,41,47,40,18.5,20.0,18.3,992.6,993.4,992.6,0.0,149,0.0,2319.,0.0
01/05/2020,19,33.5,34.7,32.1,43,49,39,19.2,20.4,18.3,992.3,992.6,992.3,0.3,168,4.1,1907.,0.0
01/05/2020,20,32.1,33.9,32.1,49,51,41,20.2,20.7,18.5,992.4,992.4,992.3,0.1,192,3.7,1203.,0.0
01/05/2020,21,29.9,32.2,29.9,62,62,49,21.8,21.9,20.2,992.3,992.4,992.2,0.0,188,2.9,408.0,0.0
01/05/2020,22,28.5,29.9,28.4,67,67,62,21.8,22.0,21.7,992.5,992.5,992.3,0.4,181,2.3,6.817,0.0
01/05/2020,23,27.8,28.5,27.8,71,71,66,22.1,22.1,21.5,993.1,993.1,992.5,0.0,225,1.6,-3.39,0.0
02/05/2020,00,27.4,28.2,27.3,75,75,68,22.5,22.5,21.7,993.7,993.7,993.1,0.5,139,1.5,-3.54,0.0
02/05/2020,01,27.3,27.7,27.3,72,75,72,21.9,22.6,21.9,994.3,994.3,993.7,0.0,126,1.1,-3.54,0.0
02/05/2020,02,25.4,27.3,25.2,85,85,72,22.6,22.8,21.9,994.4,994.5,994.3,0.1,256,2.6,-3.54,0.0
02/05/2020,03,25.5,25.6,25.3,84,85,82,22.5,22.7,22.1,994.3,994.4,994.2,0.0,329,0.7,-3.54,0.0
02/05/2020,04,24.5,25.5,24.5,86,86,82,22.0,22.5,21.9,993.9,994.3,993.9,0.0,290,1.2,-3.54,0.0
02/05/2020,05,24.0,24.5,23.5,87,88,86,21.6,22.1,21.3,993.6,993.9,993.6,0.7,285,1.3,-3.54,0.0
02/05/2020,06,23.7,24.1,23.7,87,87,85,21.3,21.6,21.3,993.1,993.6,993.1,0.1,305,1.1,-3.51,0.0
02/05/2020,07,22.7,24.1,22.5,91,91,86,21.0,21.7,20.7,993.1,993.3,993.1,0.6,220,1.1,-3.54,0.0
02/05/2020,08,22.9,22.9,22.6,92,92,91,21.5,21.5,21.0,993.2,993.2,987.6,0.0,239,1.5,-3.53,0.0
02/05/2020,09,22.9,23.0,22.8,93,93,92,21.7,21.7,21.4,993.6,993.6,993.2,0.0,289,0.4,-3.53,0.0
02/05/2020,10,23.5,23.5,22.8,92,93,92,22.1,22.1,21.6,994.3,994.3,993.6,0.0,256,0.0,91.75,0.0
02/05/2020,11,26.1,26.2,23.5,80,92,80,22.4,23.1,22.2,995.0,995.0,994.3,1.1,141,1.9,789.0,0.0
02/05/2020,12,28.7,28.7,26.1,69,80,68,22.4,22.7,22.1,995.5,995.5,995.0,0.0,116,2.2,1468.,0.0
02/05/2020,13,31.4,31.4,28.6,56,69,56,21.6,22.9,21.0,995.5,995.7,995.4,0.0,65,0.0,1762.,0.0
02/05/2020,14,32.1,32.4,30.6,48,58,47,19.8,22.0,19.3,995.0,995.6,990.6,0.0,105,0.0,2657.,0.0
02/05/2020,15,34.0,34.2,31.7,43,48,42,19.6,20.1,18.6,993.9,995.0,993.9,3.0,71,6.0,2846.,0.0
02/05/2020,16,34.7,34.7,32.3,38,48,38,18.4,20.3,18.3,992.7,993.9,992.7,1.4,63,6.3,2959.,0.0
02/05/2020,17,34.0,34.7,32.7,42,46,38,19.2,20.0,18.4,991.7,992.7,991.7,2.2,103,4.8,2493.,0.0
02/05/2020,18,34.3,34.7,33.6,41,42,38,19.1,19.4,18.0,991.2,991.7,991.2,2.0,141,4.8,2593.,0.0
02/05/2020,19,33.5,34.5,32.5,42,47,39,18.7,20.0,18.4,990.7,991.4,989.9,1.8,132,4.2,1317.,0.0
02/05/2020,20,32.5,34.2,32.5,47,48,40,19.7,20.3,18.7,990.5,990.7,989.8,1.3,191,4.2,1250.,0.0
02/05/2020,21,30.5,32.5,30.5,59,59,47,21.5,21.6,20.0,979.8,990.5,979.5,0.1,157,2.9,345.5,0.0
02/05/2020,22,28.6,30.5,28.6,67,67,59,21.9,21.9,21.5,978.9,980.1,978.7,0.6,166,2.2,1.122,0.0
02/05/2020,23,27.2,28.7,27.2,74,74,66,22.1,22.2,21.6,978.9,979.3,978.6,0.0,246,1.7,-3.54,0.0
03/05/2020,00,26.5,27.2,26.0,77,80,74,22.2,22.5,22.0,979.0,979.1,978.7,0.0,179,1.4,-3.54,0.0
03/05/2020,01,26.0,26.6,26.0,80,80,77,22.4,22.5,22.1,979.1,992.4,978.7,0.0,276,0.6,-3.54,0.0
03/05/2020,02,26.0,26.5,26.0,79,81,75,22.1,22.5,21.7,978.8,979.1,978.5,0.0,290,0.6,-3.53,0.0
03/05/2020,03,25.3,26.0,25.3,83,83,79,22.2,22.4,21.8,978.6,989.4,978.5,0.5,303,1.0,-3.54,0.0
03/05/2020,04,25.3,25.6,24.6,81,85,81,21.9,22.5,21.7,978.1,992.7,977.9,0.7,288,1.5,-3.00,0.0
03/05/2020,05,23.7,25.3,23.7,88,88,81,21.5,21.9,21.5,977.6,991.8,977.3,1.2,256,1.8,-3.54,0.0
03/05/2020,06,23.3,23.7,23.3,91,91,88,21.7,21.7,21.5,976.9,977.6,976.7,0.4,245,1.8,-3.54,0.0
03/05/2020,07,23.0,23.6,23.0,91,91,89,21.4,21.9,21.3,976.7,977.0,976.4,0.9,257,1.9,-3.54,0.0
03/05/2020,08,23.4,23.4,22.9,90,92,90,21.7,21.7,21.3,976.8,976.9,976.5,0.4,294,1.6,-3.52,0.0
03/05/2020,09,23.0,23.5,23.0,88,90,87,21.0,21.6,20.9,992.1,992.1,976.7,0.8,263,1.6,-3.54,0.0
03/05/2020,10,23.2,23.2,22.5,91,92,88,21.6,21.6,20.8,993.0,993.0,992.2,0.1,226,1.5,29.03,0.0
03/05/2020,11,26.0,26.1,23.2,77,91,76,21.6,22.1,21.5,993.8,993.8,982.1,0.0,120,0.9,458.1,0.0
03/05/2020,12,26.6,27.0,25.5,76,80,76,22.1,22.5,21.4,982.7,994.3,982.6,0.3,121,2.3,765.3,0.0
03/05/2020,13,28.5,28.7,26.6,66,77,65,21.5,23.1,21.2,982.5,994.2,982.4,1.4,130,3.2,1219.,0.0
03/05/2020,14,31.1,31.1,28.5,55,66,53,21.0,21.8,19.9,982.3,982.7,982.1,1.2,129,3.7,1743.,0.0
03/05/2020,15,31.6,31.8,30.7,50,55,49,19.8,20.8,19.2,992.9,993.5,982.2,1.1,119,5.1,1958.,0.0
03/05/2020,16,32.7,32.8,31.1,46,52,46,19.6,20.7,19.2,991.9,992.9,991.9,0.8,122,4.4,1953.,0.0
03/05/2020,17,32.3,33.3,32.0,44,49,42,18.6,20.2,18.2,990.7,991.9,979.0,2.6,133,5.9,2463.,0.0
03/05/2020,18,33.1,33.3,31.9,44,50,44,19.3,20.8,18.9,989.9,990.7,989.9,1.1,170,5.4,2033.,0.0
03/05/2020,19,32.4,33.2,32.2,47,47,44,19.7,20.0,18.7,989.5,989.9,989.5,2.4,152,5.2,1581.,0.0
03/05/2020,20,31.2,32.5,31.2,53,53,46,20.6,20.7,19.4,989.5,989.7,989.5,1.7,159,4.6,968.6,0.0
03/05/2020,21,29.7,32.0,29.7,62,62,51,21.8,21.8,20.5,989.7,989.7,989.4,0.8,154,4.0,414.2,0.0
03/05/2020,22,28.3,29.7,28.3,69,69,62,22.1,22.1,21.7,989.9,989.9,989.7,0.3,174,2.0,6.459,0.0
03/05/2020,23,26.9,28.5,26.9,75,75,67,22.1,22.5,21.7,990.5,990.5,989.8,0.2,183,1.0,-3.54,0.0
The second column is the time (hour). I want to split the dataset into morning (06-11), afternoon (12-17), evening (18-23) and night (00-05). How can I do it?
You can use pd.cut:
bins = [-1, 5, 11, 17, 24]  # right-closed bins: (-1,5], (5,11], (11,17], (17,24]
labels = ['night', 'morning', 'afternoon', 'evening']  # same order as the bins
df['day_part'] = pd.cut(df['hour'], bins=bins, labels=labels)
I added column names, including Hour for the second column.
Then I used read_csv which reads the source text, "dropping" leading
zeroes, so that Hour column is just int.
To split rows (add a column marking the diurnal period), use:
df['period'] = pd.cut(df.Hour, bins=[0, 6, 12, 18, 24], right=False,
labels=['night', 'morning', 'afternoon', 'evening'])
Then you can e.g. use groupby to process your groups.
Because I used right=False parameter, the bins are closed on the left
side, thus bin limits are more natural (no need for -1 as an hour).
And bin limits (except for the last) are just starting hours of
each period - quite natural notation.
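To check the bin boundaries without pandas, the left-closed binning can be sketched with the standard-library bisect module. This is a swapped-in technique, not pd.cut itself; the names STARTS, LABELS and day_part are hypothetical.

```python
from bisect import bisect_right

# Starting hour of each period (left-closed intervals, like right=False)
STARTS = [0, 6, 12, 18]
LABELS = ["night", "morning", "afternoon", "evening"]

def day_part(hour):
    """Map an hour 0..23 to its diurnal period."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in 0..23")
    # bisect_right counts the period starts that are <= hour
    return LABELS[bisect_right(STARTS, hour) - 1]

for h in (0, 5, 6, 11, 12, 17, 18, 23):
    print(h, day_part(h))
```

Hours 0 to 5 map to night, 6 to 11 to morning, 12 to 17 to afternoon and 18 to 23 to evening, which matches the periods asked for in the question.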

Creating a command with sprintf. Is this possible?

I want to plot some data. The data is in several files and the line it is in is not always the same. Therefore I used grep and some other commandline tools to extract the line I want. I read online, that it should be possible with gnuplot to print from a string or from the result of a commandline.
I work in linux.
set terminal pdfcairo enhanced font "Garamond,10" fontscale 1.0 size 9in,9in
set nogrid
set samples 1001
set border 31 linewidth .3
set output "access/accessTimeAcrossFreq.pdf"
set xlabel "freq"
set ylabel "Time [s]"
set key right top
set size square
set autoscale y
set termoption lw 2.5
volts = "0.8"
fins = "111 122 222"
freq = "0.5G 1G 1.5G 2G 2.5G 3G"
metrics = "read1bldeltav read0bldeltav read1senseChange read0senseChange read1latchChange read0latchChange sense1speed sense0speed write1CellFlip write0CellFLip write1CellSwing write0CellSwing write1BLSwing write0BLSwing powerpertime"
runTitle = "abetraryString"
filename(fin, f, volt) = sprintf("../%s_temp27_fin%s_freq%s_vdd%s/accessTimeVolLSA/result.txt", runTitle, fin, f, volt)
data(met, file) = system(sprintf("grep -n '%s' %s | cut -d: -f 2 | awk '{$1=$1};1'", met, file))
com(met, file) = sprintf("< grep -n '%s' %s | cut -d: -f 2 | awk '{$1=$1};1'", met, file)
do for [fin in fins] {
do for [v in volts] {
do for [met in metrics] {
set title sprintf("%s VLSA across Freq, fins %s, %sV, w/o she", met, fin, v)
plot for[i=1:words(freq)] com(met, filename(fin, word(freq, i), v)) using (i):2:xtic(word(freq, i)) notitle with points lc i
}
}
}
So I was wondering if a) I can have a function that returns a string that is a command that is then run by gnuplot
b) Where the error might come from:
line 32: warning: Skipping data file with no valid points
line 32: warning: Skipping data file with no valid points
line 32: warning: Skipping data file with no valid points
line 32: warning: Skipping data file with no valid points
line 32: warning: Skipping data file with no valid points
line 32: warning: Skipping data file with no valid points
line 32: x range is invalid
I thought, maybe I need a linebreak at the end of my one-liner of data. Or because gnuplot always thinks the first line is not data... I don't know.
Today I figured it out. I added print statements to the for loop to see what the command returns. Before posting the question I had tried the command successfully in a separate terminal, but only with the first element of metrics. The prints revealed that I had forgotten that the metric names all need to be lower case.
To conclude: yes, you can build a command string with a function, and gnuplot will run it as I expected. See the use of com(..) in the plot line.
Second, I think the xrange error usually means that the plot contains no data points, and gnuplot does not accept an empty xrange. To track this down I used prints. I searched briefly for a verbose mode but did not find one, so prints it is.
Maybe someone else can take something away from this, like I did.

Gnuplot 5.0 patchlevel 4 - passing column numbers in a macro

I have a data file (see below) with a dozen columns and I am only interested in plotting two columns (say, 5 and 10) when the values in column 1 are over a given interval. To do so, I have defined:
inter(min,max,var,colx)=(min<=column(var)&&column(var)<=max?column(colx):NaN)
Everything works as expected using plot 'data.dat' u (inter(0.25,0.5,1,5)):10 which plots columns 5 and 10 over the [0.25:0.5] interval of values in column 1.
As I need to plot various couples of columns over various intervals, I have created a file, PlotInterval.p, containing
inter(min,max,var,colx)=(min<=column(var)&&column(var)<=max?column(colx):NaN)
plot ARG1 u (inter(ARG2,ARG3,ARG4,ARG5)):ARG6
and when I call it with call 'PlotInterval.p' 'data.dat' 0.25 0.5 1 5 10, I get the error message:
gnuplot> call 'PlotInterval.p' 'data.dat' 0.25 0.5 1 5 10
"PlotInterval.p", line 3: warning: no column with header "1"
"PlotInterval.p", line 3: warning: partial match against column 6 header "1.451433e-005"
gnuplot> plot ARG1 u (inter(ARG2,ARG3,ARG4,ARG5)):ARG6
^
"PlotInterval.p", line 3: x range is invalid
It appears the column numbers are not passed properly (the min and max values of the interval are passed properly).
Here are the first lines of data.dat:
0.000000e+000 -1.577475e+000 -7.175042e+000 2.764545e-005 -5.966045e+000 1.451433e-005 -4.665347e+000 -1.412159e-005 6.154827e+000 0.000000e+000 0.000000e+000 3.100275e+002 0.000000e+000
2.500000e-003 4.346526e+000 -1.305610e+001 3.170804e-005 -5.790276e+000 1.632860e-005 -4.574010e+000 -1.459951e-005 6.069773e+000 -1.521847e+000 -1.521847e+000 3.009973e+002 0.000000e+000
5.000000e-003 1.055312e+001 -1.861278e+001 3.085889e-005 -5.604992e+000 1.797386e-005 -4.472427e+000 -1.651171e-005 5.977640e+000 -7.909049e+000 -7.909049e+000 3.029022e+002 0.000000e+000
7.500000e-003 1.676089e+001 -2.476250e+001 3.417608e-005 -5.412398e+000 2.195262e-005 -4.354189e+000 -1.823193e-005 5.874751e+000 -4.333744e+000 -4.333744e+000 2.982168e+002 0.000000e+000
1.000000e-002 2.276874e+001 -3.064776e+001 3.607515e-005 -5.204357e+000 2.585798e-005 -4.212604e+000 -1.948774e-005 5.763049e+000 -9.444781e+000 -9.444781e+000 2.864735e+002 0.000000e+000
1.250000e-002 2.901897e+001 -3.670245e+001 3.681956e-005 -4.988488e+000 2.942617e-005 -4.048886e+000 -2.254946e-005 5.638561e+000 -1.512790e+001 -1.512790e+001 2.852074e+002 0.000000e+000
1.500000e-002 3.479634e+001 -4.301166e+001 4.146322e-005 -4.756663e+000 3.338716e-005 -3.862872e+000 -2.427187e-005 5.499905e+000 -1.618025e+001 -1.618025e+001 2.797585e+002 0.000000e+000
1.750000e-002 4.052957e+001 -4.899462e+001 4.416380e-005 -4.503088e+000 3.794105e-005 -3.651641e+000 -2.608256e-005 5.350786e+000 -2.219509e+001 -2.219509e+001 2.736614e+002 0.000000e+000
2.000000e-002 4.657926e+001 -5.503798e+001 4.764674e-005 -4.231202e+000 4.255615e-005 -3.413258e+000 -2.911828e-005 5.187315e+000 -2.519971e+001 -2.519971e+001 2.689015e+002 0.000000e+000
Am I missing something? How can I get the column numbers to be passed? Is there a workaround? Thanks a lot.
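The variables ARG1 etc. are string variables, and column() behaves differently for string and integer arguments: given a string, it looks for a column with that header, which is what the warnings above report. So you must explicitly cast the values passed to column() to integers: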
inter(min,max,var,colx)=(min<=column(int(var))&&column(int(var))<=max?column(int(colx)):NaN)
plot ARG1 u (inter(ARG2,ARG3,ARG4,ARG5)):ARG6

store commented value from data file in gnuplot

I have multiple data files output_k, where k is a number. The files look like
#a=1.00 b = 0.01
# mass mean std
0.2 0.0163 0.0000125
0.4 0.0275 0.0001256
Now I need to retrieve the values of a and b and store them in variables, so I can use them for the title, as function input, etc. Looping over the files in the folder works, but I need some help with reading out the parameters a and b. This is what I have so far.
# specify the number of plots
plot_number = 100
# loop over all data files
do for [i=0:plot_number] {
a = TODO
b = TODO
#set terminal
set terminal postscript eps size 6.4,4.8 enhanced color font 'Helvetica,20' linewidth 2
set title "Measurement \n{/*0.8 A = a, B = b}"
outFile=sprintf("plot_%d.eps", i)
dataFile=sprintf("output_%d.data", i)
set output outFile
plot dataFile using 1:2:3 with errorbars lt 1 linecolor "red", f(a,b)
unset output
}
EDIT:
I am working with gnuplot for windows.
If you are on a Unix-like system, you can use system to get the output of standard command line tools, namely head and sed, which in turn let you extract those values from the files:
a = system(sprintf("head -n 1 output_%i.data | sed \"s/#a=//;s/ b .*//\"", i))
b = system(sprintf("head -n 1 output_%i.data | sed \"s/.*b = //\"", i))
This assumes that the leading spaces to all lines in your question are actually a formatting mistake.
A late answer, but since you are working under Windows, you can either install comparable utilities or use a gnuplot-only (and hence platform-independent) solution.
You can use stats to extract information from the datablock (or file) into variables. Check help stats.
The extraction of your a and b depends on the exact structure of that line. You can split a line at spaces via word() (check help word) and get substrings via substr() or indexing (check help substr).
Script: (works with gnuplot>=5.0.0)
### extract information from commented header without external tools
reset session
$Data <<EOD
#a=1.00 b = 0.01
# mass mean std
0.2 0.0163 0.0000125
0.4 0.0275 0.0001256
EOD
set datafile commentschar ''
set datafile separator "\t"
stats $Data u (myHeader=strcol(1)[2:]) every ::0::0 nooutput
set datafile commentschar # reset to default
set datafile separator # reset to default
a = real(word(myHeader,1)[3:])
b = real(word(myHeader,4))
set label 1 at graph 0.1,0.9 sprintf("a=%g\nb=%g",a,b)
plot $Data u 1:2 w lp pt 7 lc "red"
### end of script
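For comparison, the word-splitting and substring steps can be sketched in Python. This is only an illustration of the parsing logic, assuming the header has exactly the form '#a=1.00 b = 0.01'; the function name parse_header is hypothetical.

```python
def parse_header(line):
    """Parse a header such as '#a=1.00 b = 0.01' into two floats."""
    # Drop the comment character and split at whitespace,
    # giving ['a=1.00', 'b', '=', '0.01'] (like word() in gnuplot)
    words = line.lstrip("#").split()
    # Python slices are 0-based: [2:] drops 'a=' (gnuplot uses [3:], 1-based)
    a = float(words[0][2:])
    # The fourth word holds the value of b
    b = float(words[3])
    return a, b

a, b = parse_header("#a=1.00 b = 0.01")
print(a, b)  # 1.0 0.01
```

As in the gnuplot script, the parsing breaks if the spacing around the equals signs changes, so it has to be adapted to the exact header layout.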

Plotting arrows with gnuplot

I have data generated in a simulation. The generated data file looks something like this:
1990/01/01 99
1990/01/02 92.7
1990/01/03 100.3
1990/01/04 44.2
1990/01/05 71.23
...
2100/01/01 98.25
I can create a chart (trivially), by simply issuing the (long versioned) command:
plot "simulation.dat" using 1:2 with line
I want to add a third column which will add arrow information. The encoding for the third column would be as follows:
0 => no arrow to be drawn for that x axis value
1 => an UPWARD pointing arrow to be drawn for the x axis value
2 => a DOWNWARD arrow to be drawn for the x axis value
I am just starting to learn gnuplot, and will appreciate help in how I can use gnuplot to create the arrows on the first plot?
I don't think there is an automatic way to create all your arrows at once based on the third column. You will have to execute the following for each arrow that you want:
set arrow from xval1,yval1 to xval2,yval2
You can also use relative arrows:
set arrow from xval1,yval1 rto 1,0
This will draw a horizontal arrow from xval1,yval1 to (xval1+1),yval1.
There are plenty of options associated with the set arrow command (see help set arrow).
If you didn't want the arrow head then you might try the impulses style (with impulses rather than with lines)
(If you still want the lines on top then you can plot twice).
If you really want the arrow heads, then the following might help. It uses a for loop (of sorts) to add vertical arrows to a plot.
Gnuplot script, for loop within or adding to existing plot
Specifically:
create a file simloop.gp which looks like the following:
count = count+1
#save the count to count.gp
system 'echo '.count.' > count.gp'
#load the simloop shell
system "./simloop.sh"
#draw the arrow
load 'draw_arrow.gp'
if(count<max) reread
Then create a simloop.sh file that looks something like so
#!/bin/bash
#read the count
count=$(awk -F, '{print $1}' count.gp)
#read the file
xcoord=$(awk -v count=$count -F, 'BEGIN{FS=" ";}{ if(NR==count) print $1}' simulation.dat)
ycoord=$(awk -v count=$count -F, 'BEGIN{FS=" "}{ if(NR==count) print $2}' simulation.dat)
dir=$(awk -v count=$count -F, 'BEGIN{FS=" "}{ if(NR==count) print $3}' simulation.dat)
#choose the direction of the arrow
if [ "$dir" == "0" ]; then
echo '' > draw_arrow.gp
fi
if [ "$dir" == "1" ]; then
echo 'set arrow from '$xcoord',0 to '$xcoord','$ycoord' head' > draw_arrow.gp
fi
if [ "$dir" == "2" ]; then
echo 'set arrow from '$xcoord',0 to '$xcoord','$ycoord' backhead' > draw_arrow.gp
fi
Then create a simulation.gp file that looks something like so:
count = 0;
max = 5;
load "simloop.gp"
set yrange[0:*]
plot "simulation.dat" u 1:2 w l
Make sure the shell file has executable permissions (chmod +wrx simloop.sh), load up gnuplot and type
load "./simulation.gp"
This worked for me with the data file
1 99 0
2 92.7 1
3 100.3 2
4 44.2 0
5 71.23 1
(For testing I got rid of the time formatting. You should be able to put it back without too much trouble.)
Then I got this graph:
Which I think is more or less what you want.
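The per-row logic of simloop.sh (one set arrow command per data row, chosen by the third column) can also be sketched as a small Python generator of gnuplot commands. This is a hedged illustration, not part of the original answer; the function name arrow_commands is hypothetical, and the rows are the test data shown above.

```python
def arrow_commands(rows):
    """Build one 'set arrow' command per row whose direction is 1 or 2."""
    cmds = []
    for x, y, direction in rows:
        if direction == 1:
            # same command simloop.sh writes for dir == 1
            cmds.append(f"set arrow from {x},0 to {x},{y} head")
        elif direction == 2:
            # same command simloop.sh writes for dir == 2
            cmds.append(f"set arrow from {x},0 to {x},{y} backhead")
        # direction 0: no arrow for this row
    return cmds

rows = [(1, 99, 0), (2, 92.7, 1), (3, 100.3, 2), (4, 44.2, 0), (5, 71.23, 1)]
for cmd in arrow_commands(rows):
    print(cmd)
```

The printed lines could be saved to a file and loaded in gnuplot before the plot command, which avoids the reread loop and the intermediate count.gp and draw_arrow.gp files.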
Although the question is quite old, here is my answer.
One can use the vectors plotting style, which can use variable arrowstyles based on a column's value:
set style arrow 1 backhead
set style arrow 2 head
set yrange[0:*]
set xdata time
set timefmt "%Y/%m/%d"
plot "simulation.dat" using 1:2 with line,\
"" using 1:2:(0):(-$2):($3 == 0 ? 1/0 : $3) with vectors arrowstyle variable
If the value of a column is 1/0, the point is considered undefined and is skipped.
