I have a file with a matrix like:
1 2 3
4 5 6
7 8 9
Using gnuplot, I would like to extract the value in the 3rd row and 2nd column and store it in a variable called X, for example. How can I do that using gnuplot?
Thanks
You can do that within a plot command,
set table "/dev/null"
X=0
X_row=2   # $0 counts rows from 0, so 2 selects the 3rd row
X_col=2
plot "file.dat" using (($0==X_row)?(X=column(X_col),X):0)
unset table
To save time the plot command can do something useful at the same time, like... plotting something.
Thanks, it's actually solved using this syntax:
plot "file.dat" u 0:($0==RowIndex ? (VariableName=column(ColumnIndex)) : column(ColumnIndex))
#RowIndex starts with 0, ColumnIndex starts with 1
print VariableName
It's already explained quite well in an answer by @StackJack.
I have data in the format:
1 1 A
2 3 ab
1 2 A
3 3 x
4 1 x
2 3 A
and so on. The third column indicates the series. That is, in the case above there are three distinct data series: one designated A, another designated ab, and the last designated x. Is there a way to plot the three data series from such a data structure in gnuplot without using e.g. awk? The difficulty here is that the number of categories (here denoted A, ab, x) is quite large, and it is not feasible to write them out by hand.
I was thinking along the lines:
plot data u 1:2:3 w dots
but that does not work, and I get the warning "Skipping data file with no valid points" (I tried both quoted and unquoted versions of the third column). A similar question requires defining the palette manually, which is undesirable.
With a little bit of work you can make a list of unique categories from within gnuplot without using external tools. The following code snippet first assembles a list of the entire third column of the data file, and then loops over it to generate a list of unique category names. If memory use or processing time become an issue then one could probably combine these steps and avoid forming a single string with the entire third column.
delimiter = "#" # some character that does not appear in category name
categories = ""
stats "test.dat" using (categories = categories." ".delimiter.strcol(3).delimiter) nooutput
unique_categories = ""
do for [cat in categories] {
if (strstrt (unique_categories, cat) ==0) {
unique_categories = unique_categories." ".cat
}
}
set xrange[0:5]
set yrange [0:4]
plot for [cat in unique_categories] "test.dat" using 1:(delimiter.strcol(3).delimiter eq cat ? $2 : NaN) title cat[2:strlen(cat)-1]
Take a look at the contents of the string variables categories and unique_categories to get a better idea of what this code does.
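For comparison, the same two steps (collect the distinct names in column 3, then filter the rows per name) can be sketched outside gnuplot. This is only an illustration in Python, not part of the answer above; the sample rows are copied from the question, and the function names unique_categories and rows_for are made up for this sketch.

```python
# Sketch: distinct series names from column 3, in first-seen order,
# plus a per-series row filter. Sample rows are copied from the question.
sample = """1 1 A
2 3 ab
1 2 A
3 3 x
4 1 x
2 3 A"""


def unique_categories(text):
    """Return the distinct third-column values in order of first appearance."""
    seen = []
    for line in text.splitlines():
        fields = line.split()
        # Compare whole names, so "a" and "ab" stay distinct (this is what
        # the delimiter trick achieves in the gnuplot version).
        if len(fields) >= 3 and fields[2] not in seen:
            seen.append(fields[2])
    return seen


def rows_for(text, category):
    """Keep only the (x, y) pairs of one series, like `eq cat ? $2 : NaN`."""
    split_lines = (line.split() for line in text.splitlines())
    return [(float(f[0]), float(f[1]))
            for f in split_lines
            if len(f) >= 3 and f[2] == category]


print(unique_categories(sample))
print(rows_for(sample, "A"))
```

Matching whole fields is the reason the gnuplot code wraps each name in a delimiter: a plain substring search would treat a name such as "a" as already present once "ab" is in the list.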
I have a problem handling data using gnuplot.
My data has a different number of columns on each line.
I want to plot the first column on the X-axis and the last column on the Y-axis.
The position of the last column differs from line to line.
For example, my data looks like this (my.dat):
1 2
2 1 3
3 4 4
4 5
5 2 1 3 6
plot 'my.dat' us 1:(lastcolumn) w l
I could pre-process the data before reading it into gnuplot.
But I use the Windows version of gnuplot and cannot use awk or any other parsing program.
So I hope this can be handled within gnuplot alone.
Is that possible?
Thanks
Yes, you can check that with gnuplot. The idea is as follows:
You analyze your data with stats, and inside the using expression you check recursively with valid which column is the last valid one. If an invalid column is reached, you return the number of the previous column; otherwise the next column is checked. The largest value found over all lines, i.e. the maximum number of columns, is then contained in the variable STATS_max.
check_valid_column(c) = valid(c) ? check_valid_column(c + 1) : c - 1
stats 'my.dat' using (check_valid_column(1)) nooutput
last_column = int(STATS_max)
plot 'my.dat' using 1:last_column
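Note that STATS_max is the largest column count in the whole file, which is not the same thing as the last field of each individual line. The difference can be sketched in Python on the sample data from the question. This is only an illustration; the function names first_and_last and max_columns are hypothetical and not part of gnuplot.

```python
# Sketch: per-line last field versus the file-wide maximum column count.
# Sample rows are copied from my.dat in the question.
sample = """1 2
2 1 3
3 4 4
4 5
5 2 1 3 6"""


def first_and_last(text):
    """Return (first field, last field) for every line with >= 2 fields."""
    pairs = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            pairs.append((float(fields[0]), float(fields[-1])))
    return pairs


def max_columns(text):
    """Return the largest number of whitespace-separated fields on any line."""
    return max(len(line.split()) for line in text.splitlines() if line.strip())


print(first_and_last(sample))
print(max_columns(sample))
```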
Just for the record, here is an alternative suggestion. Christoph's solution is certainly more elegant and probably faster.
However, with the recursive approach you will get the error "recursion depth limit exceeded" if you have more than 250 columns (admittedly, probably a very rare case).
The solution below reads each line as one string and counts the columns with words(). This, however, works only if the separator is whitespace; with commas it will not work. I am not sure what the string length limit would be.
Code: (edit: no need to plot to a dummy table, stats can be used instead)
### find the maximum number of columns
reset session
# create some random test data
set print $Data
rows = int(rand(0)*5+5) # random 5 to 9 lines
do for [r=1:rows] {
minCols = 251 # if minCols >250, the recursive approach will fail
cols = int(rand(0)*10+minCols)
line = ''
do for [c=1:cols] { line = sprintf("%s %d",line,rand(0)*10) }
print line
}
set print
# alternative approach with words(). Works only for separator whitespace.
set datafile separator "\n"
maxCol=0
stats $Data u (cols=words(strcol(1)), cols>maxCol?maxCol=cols:0) nooutput
set datafile separator whitespace
print "words() approach: ", maxCol
# Recursive approach for comparison
print "Recursive approach: "
check_valid_column(c) = valid(c) ? check_valid_column(c + 1) : c - 1
stats $Data u (check_valid_column(1)) nooutput
last_column = int(STATS_max)
print last_column
### end of code
Result: (if number of max columns>250)
words() approach: 259
Recursive approach:
"SO41032862.gp" line 28: recursion depth limit exceeded
I have a file called plot.txt with a number of values such as:
1 7.5000000000000000
2 10.312500000000000
3 11.660156250000000
4 12.425537109375000
5 12.913055419921875
6 13.248996734619141
7 13.493841290473938
8 13.679883163422346
9 13.825851876754314
10 13.943356417876203
This list continues until about 450. When I try to plot it with lines, I get a straight line across the graph. Why is this, and how do I get rid of it?
open(newunit=write_unit,access='sequential',file='plotgnu.txt',status='unknown')
write(write_unit,*)'plot ''plot.txt'' with linespoints '
close(write_unit,status='keep')
!Calls gnuplot
call execute_command_line("gnuplot -persist plotgnu.txt")
When I plot it without linespoints, I get the correct graph, just with points:
write(write_unit,*)'plot ''plot.txt'' '
Your data file contains the same data set four times without empty lines:
1 7.5000000000000000
2 10.312500000000000
...
437 14.999999999999998
438 14.999999999999998
1 7.5000000000000000
2 10.312500000000000
...
If you plot that with lines you do of course also get a line from the last point of the first "data set" to the first point of the second occurrence of the data set. And that is the line you are seeing.
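If the file cannot be regenerated, one possible workaround is to insert an empty line wherever the x-value drops back, since gnuplot does not connect points across an empty line. The following Python sketch assumes whitespace-separated data with x in the first field; the function name split_repeats is hypothetical.

```python
# Sketch: insert an empty line whenever x drops back, so that each repeated
# copy of the data set becomes its own block and is drawn as a separate line.
def split_repeats(lines):
    out = []
    prev_x = None
    for line in lines:
        fields = line.split()
        if not fields:
            # An existing empty line already separates blocks.
            out.append(line)
            prev_x = None
            continue
        x = float(fields[0])
        if prev_x is not None and x < prev_x:
            out.append("")  # start a new block before the repeated data
        out.append(line)
        prev_x = x
    return out


data = ["437 14.9", "438 14.9", "1 7.5", "2 10.3125"]
print("\n".join(split_repeats(data)))
```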
I have a file that looks as follows:
19:40:47,2772
19:41:50,2896
19:42:50,2870
19:43:51,2851
19:44:53,2824
19:45:55,2891
.
.
.
07:52:53,2772
07:53:56,2767
07:55:00,2709
07:56:01,2713
07:57:04,2844
07:58:04,2750
07:59:05,2744
08:00:08,2812
08:01:11,2728
08:02:14,2852
and I'm trying to do the simple task of making a graph with the time on the X axis and the number on the Y axis.
code as follows:
#!/usr/bin/gnuplot
unset multiplot
set xdata time
set datafile separator ","
set timefmt "%H:%M:%S"
set format x "%H:%M"
set title "defect number"
set xlabel "X"
set ylabel "Y"
plot "Defect_number_03-03-16_08.04.49.csv" using 1:2 w lines
pause -1
The problem is that gnuplot seems to sort the times automatically, and my chart looks like this:
I want the chart to follow the order in the file. Any help would be great =)
When you give the plot command
plot datafile u 1:2
you are telling gnuplot that the first column is your x-value and the second is your y-value. Naturally, earlier times are further to the left (as you didn't post your full data, I have used only the part you did post - this will cause a "skip" in the axis labels).
You can use a pseudocolumn to use the line number as your x-value. The 0 column corresponds to the line number (see help pseudocolumns).
Thus plot datafile u 0:2 will use the line number as the x-coordinate and the 2nd column as the y-coordinate.
We still need to add the correct x-axis labels, and can't rely on them to be generated correctly in this case. We can use the xtic function to do this [1]:
plot datafile u 0:2:xtic(1)
which tells gnuplot to use the value in column 1 as the xtic label. However, it takes the text literally and does not apply the time format you wanted. To get that, we can convert the label explicitly:
plot datafile u 0:2:xtic(strftime("%H:%M",strptime("%H:%M:%S",strcol(1)))) w lines
Here, the strcol function reads column 1 as a string, the strptime function turns this into the internal time representation using the specified format string for reading it, and finally the strftime formats this as time string using the specified output string.
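The parse-then-format round trip has a direct counterpart in Python's datetime module, which may make the two format strings easier to follow. This is only an illustration of the conversion, not gnuplot code; the function name short_label is made up for the sketch.

```python
from datetime import datetime


def short_label(stamp):
    """Parse an "HH:MM:SS" string and format it back as "HH:MM"."""
    parsed = datetime.strptime(stamp, "%H:%M:%S")  # like gnuplot's strptime
    return parsed.strftime("%H:%M")                # like gnuplot's strftime


print(short_label("19:40:47"))
```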
As Christoph stated in his answer, these solutions will cause uniform spacing of the points. If the points are already uniformly spaced, this is not a problem, and if the points are very close to uniformly spaced, it is probably acceptable as well (it looks like your points are about 1 minute apart, give or take a couple of seconds).
However, if we want the absolutely correct spacing, we will need to add a date to the lines. This could be done in the original data file during the creation, or we could use an external process to add the dates only when needed leaving the original file exactly the same.
As you are only marking off the time and not the day in your tic marks, the actual day doesn't matter. It only matters that the times from the next morning are in the next day from the times from the last night.
We can use an external program to add dates. The following Python 3 program reads the data file and adds a date to it (using Jan 1st, 2015 for the first date; as previously mentioned, this date doesn't really matter). If a time is earlier in the day than the previous one, it moves to the next day. Here is the program adddates.py:
from datetime import datetime, timedelta
from sys import argv

last = None
offset = timedelta(days=0)
with open(argv[1], "r") as f:
    for x in f:
        vals = x.split(",")
        dte = datetime.strptime("01/01/2015 " + vals[0], "%m/%d/%Y %H:%M:%S") + offset
        # A time earlier than the previous one means we crossed midnight.
        if last is not None and last > dte:
            offset += timedelta(days=1)
            dte += timedelta(days=1)
        last = dte
        # vals[1] still carries the line's newline, hence end="".
        print(dte.strftime("%Y-%m-%d %H:%M:%S"), vals[1], sep=",", end="")
The output from running this on the data file looks like:
2015-01-01 19:40:47,2772
2015-01-01 19:41:50,2896
2015-01-01 19:42:50,2870
2015-01-01 19:43:51,2851
2015-01-01 19:44:53,2824
2015-01-01 19:45:55,2891
...
2015-01-02 07:52:53,2772
2015-01-02 07:53:56,2767
...
We can now read data from this program by opening a pipe in our plot command.
set timefmt "%Y-%m-%d %H:%M:%S"
plot "< python3 adddates.py datafile" u 1:2 with lines
[1] Note that this also causes labels to overlap, as it uses all of them. To use every other one, we could have used xtic(int($0) % 2 == 0 ? strcol(1):""). A similar technique can be used with the time-formatted labels as well.
A proper solution is to save your data with full date and time, or as timestamps.
All other solutions using $0 and labelling the xtics with xticlabel require your data to be spaced equidistantly, which doesn't seem to be the case.
So, just save your data with e.g. UNIX timestamps and you can use all of gnuplot's nice features without fiddling.
The figure above is generated from data points in a text file. How can I remove the line between two consecutive points where the graph jumps? (In my picture the graph jumps at about x ≈ 260.)
Note that I just want the graph to look like a piecewise function, so the two parts should not be connected across the jump.
In gnuplot you can split a line in several parts either when you have an invalid data value somewhere, or an empty line.
For the first situation, you could check inside the using statement whether the difference to the previous point is too large, and invalidate the current point. But that makes you lose not only the connecting line, but also the first point after the jump:
lim=3
y2=y1=0
plot 'test.dat' using (y2=y1,y1=$2,$1):($0 > 0 && abs(y2-y1) > lim ? 1/0 : y1) with linespoints
The test data file I used is
1 1
2 1.1
3 0.95
4 1
5 5
6 6
7 5.5
8 5.8
9 -2
10 -2.5
11 -4
As you see, the points at x=5 and x=9 are lost.
Alternatively, you can pipe your data through an external tool like awk for the filtering. In this case you can insert an empty line when the difference between two consecutive y-values exceeds some limit:
filter(lim) = 'awk ''{if(NR > 1 && sqrt((y-$2)**2) > '.lim.') print ""; print; y=$2}'' test.dat'
plot '< '.filter(3) using 1:2 with lines
Note that I used sqrt((..)**2) only to emulate an abs function, which awk doesn't have.
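Where awk is not available, the same filter logic can be sketched in Python. This is a minimal illustration that marks the break with None instead of printing an empty line; the function name break_at_jumps is hypothetical, and the rows are the test data shown above.

```python
# Sketch: mark a break (None) wherever consecutive y-values differ by more
# than lim. Unlike the 1/0 approach, no data point is dropped.
def break_at_jumps(rows, lim):
    out = []
    prev_y = None
    for x, y in rows:
        if prev_y is not None and abs(y - prev_y) > lim:
            out.append(None)  # corresponds to the empty line printed by awk
        out.append((x, y))
        prev_y = y
    return out


rows = [(1, 1), (2, 1.1), (3, 0.95), (4, 1), (5, 5), (6, 6),
        (7, 5.5), (8, 5.8), (9, -2), (10, -2.5), (11, -4)]

for item in break_at_jumps(rows, 3):
    # Write an empty line for each break, as gnuplot expects.
    print("" if item is None else "%g %g" % item)
```

With the limit of 3 used above, breaks land before x=5 and x=9, and both of those points are kept.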