I have the following data, which I want to plot using GNUPLOT:
#TIME #VALUE #SOURCE
1 100 A
1 88 B
2 115 A
2 100 B
3 130 A
3 210 B
I want to have two lines drawn, depending on the value of column #SOURCE. One line for A and one line for B. Is this possible with GNUPLOT and if yes how?
Is it also possible to draw the sum of column #VALUE over column #TIME? That is, for all rows with equal entries in #TIME, the values in #VALUE should be summed up.
Thanks in advance,
Frank
One way to do it would be to use grep to locate lines ending with A or B and plot the result. You can do this in a single plot line with a for loop if you know which characters the lines end with:
plot for [s in 'A B'] sprintf("<(grep '%s$' data.dat)", s) u 1:2 w l title s
This plots the data you provided (saved in data.dat) as two different lines.
You could also change the for part to [s in 'word1 word2 word3'] or any other list of strings. If you don't know in advance which characters or words the lines end with, you would probably need to read the file twice: once to build the string for the for loop, and a second time to do the plotting.
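For the second part of the question (summing #VALUE over equal #TIME entries), gnuplot's smooth frequency option may be enough: it replaces all points that share an x value with a single point holding the summed y values. A minimal sketch, assuming the data is saved as data.dat as above:

```gnuplot
# Sum column 2 over all rows that share the same value in column 1.
# The header line starts with '#', so gnuplot treats it as a comment.
plot 'data.dat' using 1:2 smooth frequency with linespoints title 'A + B'
```

For the sample data this should give the points (1, 188), (2, 215) and (3, 340).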
Related
I have data in the format:
1 1 A
2 3 ab
1 2 A
3 3 x
4 1 x
2 3 A
and so on. The third column indicates the series. In the case above there are 3 distinct data series: one designated A, another designated ab and the last designated x. Is there a way to plot the three data series from such a data structure in gnuplot without using an external tool such as awk? The difficulty here is that the number of categories (here denoted A, ab, x) is quite large, so it is not feasible to write them out by hand.
I was thinking along the lines:
plot data u 1:2:3 w dots
but that does not work and I get warning: Skipping data file with no valid points (I tried quoted and unquoted version of the third column). A similar question has to manually define the palette which is undesirable.
With a little bit of work you can make a list of unique categories from within gnuplot without using external tools. The following code snippet first assembles a list of the entire third column of the data file, and then loops over it to generate a list of unique category names. If memory use or processing time become an issue then one could probably combine these steps and avoid forming a single string with the entire third column.
delimiter = "#" # some character that does not appear in category name
categories = ""
stats "test.dat" using (categories = categories." ".delimiter.strcol(3).delimiter) nooutput
unique_categories = ""
do for [cat in categories] {
    if (strstrt(unique_categories, cat) == 0) {
        unique_categories = unique_categories." ".cat
    }
}
set xrange[0:5]
set yrange [0:4]
plot for [cat in unique_categories] "test.dat" using 1:(delimiter.strcol(3).delimiter eq cat ? $2 : NaN) title cat[2:strlen(cat)-1]
Take a look at the contents of the string variables categories and unique_categories to get a better idea of what this code does.
I have a matrix which contains the atom numbers of the pairs of atoms which are in contact with each other. My matrix is like this:
column 1: atom number i;
column 2: atom number j
i,j runs from 1 to 800.
If there is a pair i-j in the matrix, place a dot corresponding to the position (i,j) of the matrix.
How do I plot such matrix?
Example:
A= [1,3; 3,8; 3,1; 6,2; 2,6; 1,2; 5,2; 8,3; 2,5; 2,1]
I want to Plot the matrix A, where X and Y-axis run from 1 to 8. Place a dot for every combination of X and Y which are present in A.
I want a plot like this:
Isn't this just a scatter plot?
If your m x 2 matrix is saved in a text file then this is trivial.
Here are the contents of an example data file "input.dat":
4 3
3 4
5 3
3 5
8 2
2 8
All you need to do is open the data file in xmgrace using xmgrace input.dat.
Initially this will be a line plot. Go to 'Plot' > 'Set Appearance' and, with the only set already selected, set 'Symbol Properties' 'Type:' to Diamond and 'Line Properties' 'Type:' to None. Setting the symbol fill to solid red, tweaking the axis ranges and showing major tick grid lines will then give a plot like the one in your example.
You can save a parameters file and in future load the parameters at the beginning using
xmgrace -param template.par input2.dat.
But, having said all this, why not just plot it in MATLAB?
I have a problem handling data with gnuplot.
The number of columns in my data differs from line to line.
I want to plot the first column on the X-axis and the last column on the Y-axis.
The position of the last column is different on every line.
For example, my data looks like this (my.dat):
1 2
2 1 3
3 4 4
4 5
5 2 1 3 6
I am looking for something like this:
plot 'my.dat' us 1:(lastcolumn) w l
I could preprocess the data before reading it into gnuplot, but I am using the Windows version of gnuplot and cannot use awk or any other parsing program.
So I hope this can be handled within gnuplot alone.
Is that possible?
Thanks
Yes, you can do that with gnuplot. The idea is as follows:
You analyze your data with stats, and inside the using expression you check recursively with valid which column is the last valid one. If an invalid column is reached, the number of the previous column is returned; otherwise the next column is checked. The largest such column number over all lines is then contained in the variable STATS_max:
check_valid_column(c) = valid(c) ? check_valid_column(c + 1) : c - 1
stats 'my.dat' using (check_valid_column(1)) nooutput
last_column = int(STATS_max)
plot 'my.dat' using 1:last_column
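Note that the snippet above picks one column for the whole file, namely the largest column count found on any line. If the goal is the last column of each individual line, the same recursive helper can probably be evaluated per row inside using. A sketch under that assumption, not tested on the Windows build:

```gnuplot
# Index of the last valid column in the current row (same helper as above).
check_valid_column(c) = valid(c) ? check_valid_column(c + 1) : c - 1

# Evaluate the helper for every row and read the y value from that column.
plot 'my.dat' using 1:(column(check_valid_column(1))) with lines
```

For the sample my.dat this is expected to use the values 2, 3, 4, 5 and 6 as y for the five rows.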
Just for the record, here is an alternative suggestion. Christoph's solution is certainly more elegant and probably faster.
However, with the recursive approach you will get the error "recursion depth limit exceeded" if you have more than 250 columns (admittedly, this is probably very rare).
The solution below reads each line as one string and counts the columns with words(). This, however, only works if the separator is whitespace; it will not work with commas. I am not sure what the string length limit would be.
Code: (edit: no need to plot to a dummy table, stats can be used instead)
### find the maximum number of columns
reset session
# create some random test data
set print $Data
rows = int(rand(0)*5+5) # random 5 to 9 lines
do for [r=1:rows] {
    minCols = 251 # if minCols >250, the recursive approach will fail
    cols = int(rand(0)*10+minCols)
    line = ''
    do for [c=1:cols] { line = sprintf("%s %d",line,rand(0)*10) }
    print line
}
set print
# alternative approach with words(). Works only for separator whitespace.
set datafile separator "\n"
maxCol=0
stats $Data u (cols=words(strcol(1)), cols>maxCol?maxCol=cols:0) nooutput
set datafile separator whitespace
print "words() approach: ", maxCol
# Recursive approach for comparison
print "Recursive approach: "
check_valid_column(c) = valid(c) ? check_valid_column(c + 1) : c - 1
stats $Data u (check_valid_column(1)) nooutput
last_column = int(STATS_max)
print last_column
### end of code
Result: (if number of max columns>250)
words() approach: 259
Recursive approach:
"SO41032862.gp" line 28: recursion depth limit exceeded
My data file is a single sorted column of values:
1
1
2
2
2
3
...
999
1000
1000
I am able to plot the CDF successfully with a command like this (assuming 10000 lines in the file):
plot "file" using 1:(1/10000.) smooth cumulative title "CDF"
I am also able to use a logarithmic x axis with:
set logscale x
My problem is: how can I plot a CCDF with Gnuplot?
In addition, the CDF with log-log scale (set logscale xy) does not give me any output. What if I would like to plot a log-log CCDF?
Many thanks!
I found a workaround for this problem, because I do not think you can plot a CCDF only using gnuplot.
Briefly, I just parsed my data using bash to create a dataset where the cumulative data is explicit; then gnuplot may simply plot the new dataset. As an example, assuming that your file contains the (numerical) values you want to cumulate, I would do in a bash environment:
cat data | sort -n | uniq --count | awk '{sum += $1; print $2, $1, sum}' > parsed.dat
This command reads the dataset (cat data), sorts the numerical data using their value (sort -n), counts the occurrences of each sample (uniq --count) and creates a new dataset, calculating as well the cumulative sum of each data value (the awk command).
This new dataset contains 3 columns: the first column ($1 in gnuplot) contains the unique values of your dataset, the second column ($2) contains the number of occurrences of each value, and the third column ($3) contains the cumulative sum of the occurrences.
Finally, in gnuplot, you can do this:
stats "parsed.dat" using 3;
plot "parsed.dat" using 1:($3/STATS_max) with lines title "CDF",\
"" using 1:(1-$3/STATS_max) with lines title "CCDF",\
"" using 1:($2/STATS_max) with boxes title "PDF"
The stats command of gnuplot analyzes the third column (the one with the cumulative sum) and stores the values to some variables. STATS_max is the max value of this column (so it is the final cumulative sum). Now you have all the data you need to plot not only the CDF, but also the CCDF (which is 1 - CDF) and also the PDF (or the normalized histogram, for discrete values).
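For the log-log case asked about in the question, one thing to watch is that 1 - CDF reaches zero at the largest value, and zero cannot be drawn on a logarithmic axis. A possible workaround is to plot P(X >= x) instead of P(X > x). The sketch below uses a small hypothetical inline dataset in the same three-column layout and assumes the third column is the cumulative count including the current value:

```gnuplot
# Hypothetical sample: value, count, cumulative count (current value included).
$Parsed << EOD
1 2 2
2 3 5
3 1 6
EOD

stats $Parsed using 3 nooutput   # STATS_max is then the total number of samples
set logscale xy

# P(X >= x) = 1 - (count of samples strictly below x) / total; never zero.
plot $Parsed using 1:(1 - ($3 - $2)/STATS_max) with linespoints title "CCDF"
```

With this sample the plotted values are 1, 2/3 and 1/6, so every point stays visible on the log-log axes.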
I have a data file containing 30 columns and N rows. Each row corresponds to 30 values of the function f(x) for x={1,...,30}. The data file has the following pattern:
#<index> f(1) f(2) ... f(30)
1 7.221 5.302 ... -1.031
2 4.527 3.193 ... 0.410
...
N 6.386 1.321 ... -0.386
gnuplot interprets the first column as X and the second one as Y. What I want instead is to plot each line to a separate output file, without transposing the data file. For example, for the first line, the desired output would be what gnuplot produces with this input file:
# X Y
1 7.221
2 5.302
...
30 -1.031
I found a solution:
plot "data.dat" matrix every 1::1 with linespoint
matrix tells gnuplot to interpret the input file as a matrix.
every 1::1 skips the first column.
UPDATED based on @Christoph's comment:
plot for [i=2:30] 'data.dat' using (i-1):(column(i)) with linespoint