gnuplot: access a value from a file and plot that value against a column

I have 2 questions to do with gnuplot:
How can I access a specific value from a data file?
Can I plot that value against a column?
I have the following script:
plot for [iter=0:3:1] path_to_data using 1:(column(iter))
I would like to plot additional lines on this graph, based on values stored in path_to_expt.
The problem is that I want to plot an x value from one file (path_to_data: column(1)) against a constant y value from another file (path_to_expt). Is this possible?
I'm not sure of the syntax to access a cell from a file but the y value is stored at: path_to_expt[row(iter), column(1)].
Can I set a variable (expt_value) inside the above loop, equal to the y value, and plot it against path_to_sim[column(1)] as follows:
plot for [iter=0:3:1] path_to_data using 1:(column(iter)) , \
expt_value = path_to_expt[row(iter), column(1)] , \
path_to_sim using 1:expt_value
I tried this but my syntax is wrong and I can't find how to access a single value from a file. I don't know if I'll be able to plot a single value against a column but, if not, perhaps I could make a column of constant values to do this. Thank you for your help.
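One possible approach, offered as a minimal sketch rather than a definitive answer: stats combined with every can read a single cell into a variable, and a constant in parentheses inside using then plots that value against a column. The file names are placeholders. The sketch also assumes that the curves sit in columns 2 to 4 of path_to_data, that the matching constants sit in rows 0 to 2 of column 1 in path_to_expt, and that arrays are available (gnuplot 5.2 or newer).

```gnuplot
# Hypothetical file names; substitute the real paths.
path_to_data = "data.dat"
path_to_expt = "expt.dat"

# Assumption: three curves, with one constant per curve in the expt file.
n = 3
array expt_value[3]          # arrays require gnuplot 5.2 or newer

do for [iter=1:n] {
    row = iter - 1           # 'every' counts rows from 0
    # Restrict stats to a single row; the minimum of one value is that value.
    stats path_to_expt every ::row::row using 1 nooutput
    expt_value[iter] = STATS_min
}

# Curves from the data file, then one horizontal line per stored constant,
# drawn over the same x values (column 1 of the data file).
plot for [iter=1:n] path_to_data using 1:(column(iter + 1)) with lines title sprintf("data %d", iter), \
     for [iter=1:n] path_to_data using 1:(expt_value[iter]) with lines title sprintf("expt %d", iter)
```

Wrapping the constant in parentheses is what tells gnuplot to treat it as an expression rather than a column number, so no extra column of constants has to be written to disk.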

Related

matlab plot table - two cell arrays for labels and one vector of doubles

I have a 3 column table named 'A' from which I want to plot a heatmap or scatter plot where I can see a colour for the coordinates indicated by the first two columns. For example, at row 'A91552' and column 's_4_AAGCTA' I want to see a colour corresponding to 0.47619.
Example data:
'A91552' 's_4_AAGCTA' 0.476190000000000
'A91554' 's_4_CCTATT' 0.476190000000000
's_4_AAGCTA' 'A91552' 0.476190000000000
's_4_CCTATT' 'A91554' 0.476190000000000
Is there a way to do this directly using the strings as indices, or will I need to make a double matrix and change the axis labels on something like imagesc?
Found it:
I just needed to convert my lists of strings to categorical variables:
scatter(categorical(A.Var1), categorical(A.Var2), 125, A.Var3, 'filled')

Gnuplot - plotting series based on label in third column

I have data in the format:
1 1 A
2 3 ab
1 2 A
3 3 x
4 1 x
2 3 A
and so on. The third column indicates the series; in the example above there are three distinct data series, designated A, ab, and x. Is there a way to plot the three data series from such a data structure in gnuplot without using e.g. awk? The difficulty is that the number of categories (here A, ab, x) is quite large, so it is not feasible to write them out by hand.
I was thinking along the lines:
plot data u 1:2:3 w dots
but that does not work: I get the warning "Skipping data file with no valid points" (I tried both quoted and unquoted versions of the third column). The answer to a similar question requires defining the palette manually, which is undesirable.
With a little bit of work you can make a list of unique categories from within gnuplot without using external tools. The following code snippet first assembles a list of the entire third column of the data file, and then loops over it to generate a list of unique category names. If memory use or processing time become an issue then one could probably combine these steps and avoid forming a single string with the entire third column.
delimiter = "#" # some character that does not appear in category name
categories = ""
stats "test.dat" using (categories = categories." ".delimiter.strcol(3).delimiter) nooutput
unique_categories = ""
do for [cat in categories] {
    if (strstrt(unique_categories, cat) == 0) {
        unique_categories = unique_categories." ".cat
    }
}
set xrange[0:5]
set yrange [0:4]
plot for [cat in unique_categories] "test.dat" using 1:(delimiter.strcol(3).delimiter eq cat ? $2 : NaN) title cat[2:strlen(cat)-1]
Take a look at the contents of the string variables categories and unique_categories to get a better idea of what this code does.
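The combined variant mentioned above, which avoids building one string from the entire third column, might look like the following sketch. It relies on gnuplot's serial-evaluation (comma) operator and on assignment inside a using expression, so each row's category is appended only if it has not been seen yet. The file name test.dat and the "#" delimiter are carried over from the answer; the helper variable c is introduced here purely for illustration.

```gnuplot
delimiter = "#"            # must not occur in any category name
unique_categories = ""

# For every row: wrap the third column in delimiters, then append it to the
# list only if it is not there yet. The expression returns 1 or 0 so that
# stats has a number to work with; the statistics themselves are discarded.
stats "test.dat" using (c = delimiter.strcol(3).delimiter, \
    strstrt(unique_categories, c) == 0 ? (unique_categories = unique_categories." ".c, 1) : 0) nooutput

print unique_categories    # inspect the resulting list
```

The resulting unique_categories string has the same form as in the two-step version, so the plot command from the answer can be used with it unchanged.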

GNUPLOT: boxplots variable line style/colors

I have multiple data files from which I want to draw a single figure. Each file contains a column that takes one of two values: true or false. I would like to draw a boxplot for each of these values so that they can be compared. A sample of a data file is given below:
0.6,true
0.7,true
0.5,false
0.4,true
..
I came up with the following code:
plot inputFile1 using (1):($4):(0.3):3 title 'A' , \
inputFile2 using (3):($4):(0.3):3 title 'B'
This generated the following figure:
However, I would like to customize it so that all boxplots for "true" have one specific line style/color and all boxplots for "false" have another.
Furthermore, I would like the key to show the style used for true and for false, while the x-axis is labelled File A and File B for each true/false pair.
Any help in this regard would be highly appreciated.
Thanks in anticipation.
With your current datafile, you would need to detect whether the second column contains true or false and act accordingly. However, I am not sure gnuplot can process strings from a datafile.
If you process your file and replace the true or false by 1 or 0, then you can adapt the following line:
plot [0:6] "+" using 0:($0/2.):(0.3):0:xtic((int($0)%2)==0?"true":"false") w errorb lc variable
Here the 4th number in the using list defines the colour, with $0 the colour changes for each line of the file, but if the colour number is in one column of your file then use that column. Replace the "+" by your file and the first two numbers in using by the parameters needed by your plotting style. The xtic command processes some column in the file (here the line number $0) and labels the x tic depending on the value (see help ternary).
Note that your MWE does not work as is, please amend it if you want a more precise answer.
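gnuplot can read strings from a data file with strcol() and compare them with the eq operator, so the true/false column may not need preprocessing. The following is a rough sketch under assumptions: the files are comma-separated with the value in column 1 and true/false in column 2 (as in the sample, not as in the question's script). The file names, box positions, colours and the helper function pick() are placeholders introduced here.

```gnuplot
set datafile separator ","
set style data boxplot
set style fill solid 0.4 border -1

# Hypothetical file names.
inputFile1 = "fileA.csv"
inputFile2 = "fileB.csv"

# Keep a row only if its second column matches the wanted string;
# rows mapped to NaN are ignored when the boxplot is computed.
pick(want) = (strcol(2) eq want ? column(1) : NaN)

# One tic per file, centred between its true/false pair.
set xtics ("File A" 1.5, "File B" 4.5)
set xrange [0:6]

plot inputFile1 using (1):(pick("true")):(0.3)  lc rgb "blue" title "true", \
     inputFile1 using (2):(pick("false")):(0.3) lc rgb "red"  title "false", \
     inputFile2 using (4):(pick("true")):(0.3)  lc rgb "blue" notitle, \
     inputFile2 using (5):(pick("false")):(0.3) lc rgb "red"  notitle
```

Only the first pair carries titles, so the key shows one entry for true and one for false while the x-axis names the files.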

Gnuplot CCDF plotting and log-log scale

My data file is a single sorted column of values:
1
1
2
2
2
3
...
999
1000
1000
I am able to plot the CDF successfully using a command like the following (assuming 10000 lines in the file):
plot "file" using 1:(1/10000.) smooth cumulative title "CDF"
I am also able to plot with a logarithmic x axis using:
set logscale x
My question is: how can I plot a CCDF with gnuplot?
In addition, the CDF with a log-log scale (set logscale xy) does not give me any output. How can I get a log-log CCDF plot?
Many thanks!
I found a workaround for this problem, because I do not think you can plot a CCDF only using gnuplot.
Briefly, I just parsed my data using bash to create a dataset where the cumulative data is explicit; then gnuplot may simply plot the new dataset. As an example, assuming that your file contains the (numerical) values you want to cumulate, I would do in a bash environment:
cat data | sort -n | uniq --count | awk 'BEGIN{sum=0}{sum=sum+$1; print $2,$1,sum}' > parsed.dat
This command reads the dataset (cat data), sorts the numerical data using their value (sort -n), counts the occurrences of each sample (uniq --count) and creates a new dataset, calculating as well the cumulative sum of each data value (the awk command).
This new dataset contains 3 columns: the first column ($1 in gnuplot) contains the unique values of your dataset, the second ($2) contains the number of occurrences of each value, and the third ($3) contains the cumulative sum.
Finally, in gnuplot, you can do this:
stats "parsed.dat" using 3;
plot "parsed.dat" using 1:($3/STATS_max) with lines title "CDF",\
"" using 1:(1-$3/STATS_max) with lines title "CCDF",\
"" using 1:($2/STATS_max) with boxes title "PDF"
The stats command of gnuplot analyzes the third column (the one with the cumulative sum) and stores the values to some variables. STATS_max is the max value of this column (so it is the final cumulative sum). Now you have all the data you need to plot not only the CDF, but also the CCDF (which is 1 - CDF) and also the PDF (or the normalized histogram, for discrete values).
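For what it is worth, a gnuplot-only route also seems possible: write the smooth cumulative curve to a file with set table, then plot one minus its second column. This is a sketch under assumptions. "file" is the single-column data file from the question, cdf.dat is a scratch file name chosen here, and the ccdf() helper with its guard against non-positive values is an addition so that the last point (where the CCDF reaches zero) does not upset the log scale.

```gnuplot
# Count the samples instead of hard-coding the 10000 from the question.
stats "file" using 1 nooutput
n = STATS_records

# Redirect the plot to a text table: column 1 = x, column 2 = cumulative sum.
set table "cdf.dat"
plot "file" using 1:(1.0/n) smooth cumulative
unset table

# CCDF = 1 - CDF; values that are zero up to rounding become NaN and are
# skipped, because the logarithm of zero is undefined.
ccdf(y) = (1 - y > 0.5/n ? 1 - y : NaN)

set logscale xy
plot "cdf.dat" using 1:(ccdf($2)) with lines title "CCDF"
```

The threshold 0.5/n sits below the smallest non-zero CCDF value (1/n), so only the final, effectively zero point is dropped.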

specifying points in space associated with a particular value in gnuplot

I have a grid of 10x10x10 points. Some of these points are associated with the value 1 and the others with the value -1. I want to mark (give a color to) only those points which have the value 1. Can anyone please tell me how this can be achieved in gnuplot?
Thanks in advance.
If you want to filter out completely all points with value -1, you can do it as follows:
splot 'file' using 1:2:($4 == 1 ? $3 : 1/0) with points
This assumes that your data file has four columns, with the x, y, z values in columns 1, 2, 3 and the value 1 or -1 in the fourth column.
With the using statement you can specify which columns are used for plotting: using 1:2:3 uses the first column as x, the second as y and the third column as z value.
You can also do calculations inside the using statement. In that case you must put the respective expression in parentheses, and refer to the values of a column with e.g. $3 or column(3): using 1:2:($3/10) would divide the values in the third column by 10 and use the result as z value.
The expression I used above, using 1:2:($4 == 1 ? $3 : 1/0) does the following: If the value in the fourth column is equal to 1, use the value in the third column, otherwise use 1/0. The 'special' value 1/0 makes gnuplot ignore a point.

Resources