GNUPLOT: boxplots variable line style/colors - gnuplot

I have multiple data files that I want to draw in a single figure. Each file contains a column with two possible values: true and false. I would like to draw a boxplot for each of these values so that they can be compared. A sample of the data file is given below:
0.6,true
0.7,true
0.5,false
0.4,true
..
I came up with the following code:
plot inputFile1 using (1):($4):(0.3):3 title 'A' , \
inputFile2 using (3):($4):(0.3):3 title 'B'
This generated the following figure:
However, I would like to customize it so that all boxplots for the "true" value have one specific line style/color and all boxplots for the "false" value have another.
Furthermore, I would like the key (title) to show the style used for true and for false, while the x-axis shows File A and File B for each true/false pair.
Any help in this regard would be highly appreciated.
Thanks in anticipation.

With your current datafile, you would need to detect whether the second column contains true or false and act accordingly. However, I am not sure gnuplot can process strings from a datafile.
If you process your file and replace the true or false by 1 or 0, then you can adapt the following line:
plot [0:6] "+" using 0:($0/2.):(0.3):0:xtic((int($0)%2)==0?"true":"false") w errorb lc variable
Here the 4th entry in the using list defines the colour. With $0 the colour changes with every line of the file; if the colour number is stored in a column of your file, use that column instead. Replace "+" with your file name, and replace the first two entries in using with the parameters your plotting style needs. The xtic() function processes a column of the file (here the line number $0) and labels the x tic depending on its value (see help ternary).
Note that your MWE does not work as is, please amend it if you want a more precise answer.
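The preprocessing step suggested in this answer (replacing true/false with 1/0) could be done with a short script. What follows is only a minimal sketch, assuming a comma-separated file with the number in the first column and the flag in the second, as in the question's sample. The function name flags_to_numbers is hypothetical and not part of any library.

```python
import csv
import io


def flags_to_numbers(src, dst):
    """Copy CSV rows from src to dst, replacing a true/false flag in the
    second column with 1/0 so gnuplot can read it as a number."""
    writer = csv.writer(dst, lineterminator="\n")
    for row in csv.reader(src):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        flag = row[1].strip().lower()
        if flag not in ("true", "false"):
            continue  # skip anything that is not a recognised flag
        writer.writerow([row[0].strip(), 1 if flag == "true" else 0])


# Small in-memory demo using the sample rows from the question.
sample = io.StringIO("0.6,true\n0.7,true\n0.5,false\n0.4,true\n")
converted = io.StringIO()
flags_to_numbers(sample, converted)
print(converted.getvalue(), end="")
```

For real files, the same function can be called with two file objects opened with newline="". On the gnuplot side, the converted file would presumably need set datafile separator "," and the second column used as the colour column together with lc variable.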

Related

matlab plot table - two cell arrays for labels and one vector of doubles

I have a 3 column table named 'A' from which I want to plot a heatmap or scatter plot where I can see a colour for the coordinates indicated by the first two columns. For example, at row 'A91552' and column 's_4_AAGCTA' I want to see a colour corresponding to 0.47619.
Example data:
'A91552' 's_4_AAGCTA' 0.476190000000000
'A91554' 's_4_CCTATT' 0.476190000000000
's_4_AAGCTA' 'A91552' 0.476190000000000
's_4_CCTATT' 'A91554' 0.476190000000000
Is there a way to do this directly using the strings as indices, or will I need to make a double matrix and change the axis labels on something like imagesc?
Found it:
I just needed to convert my lists of strings to categorical variables:
scatter(categorical(A.Var1), categorical(A.Var2), 125, A.Var3, 'filled')

Gnuplot - plotting series based on label in third column

I have data in the format:
1 1 A
2 3 ab
1 2 A
3 3 x
4 1 x
2 3 A
and so on. The third column indicates the series. That is, in the case above there are 3 distinct data series: one designated A, another designated ab, and the last designated x. Is there a way to plot the three data series from such a data structure in gnuplot without using e.g. awk? The difficulty is that the number of categories (here denoted A, ab, x) is quite large, so it is not feasible to write them out by hand.
I was thinking along the lines:
plot data u 1:2:3 w dots
but that does not work, and I get warning: Skipping data file with no valid points (I tried both quoted and unquoted versions of the third column). The answer to a similar question requires defining the palette manually, which is undesirable.
With a little bit of work you can make a list of unique categories from within gnuplot without using external tools. The following code snippet first assembles a list of the entire third column of the data file, and then loops over it to generate a list of unique category names. If memory use or processing time become an issue then one could probably combine these steps and avoid forming a single string with the entire third column.
delimiter = "#" # some character that does not appear in category name
categories = ""
stats "test.dat" using (categories = categories." ".delimiter.strcol(3).delimiter) nooutput
unique_categories = ""
do for [cat in categories] {
    if (strstrt(unique_categories, cat) == 0) {
        unique_categories = unique_categories." ".cat
    }
}
set xrange[0:5]
set yrange [0:4]
plot for [cat in unique_categories] "test.dat" using 1:(delimiter.strcol(3).delimiter eq cat ? $2 : NaN) title cat[2:strlen(cat)-1]
Take a look at the contents of the string variables categories and unique_categories to get a better idea of what this code does.

Gnuplot: Defining an x axis based on the order of my values

I have some data
20,10.00
21,10.00
22,10.00
23,09.00
00,10.00
01,10.00
...
I want to graph the first value on the x axis and the second value on the y axis. I want the y axis to be set automatically, but I want the x axis to follow the order of my data, e.g. 20, 21, ..., 0, 1, ... instead of 0, 1, ..., 23.
I thought I would do this with xticlabels, stating plot "filename" using xticlabels(1):2 or, as inspired by this, 1:2:xticlabels(1). Neither has the desired effect. What am I to do?
Yes, you must use xticlabels to add individual labels, but you must still specify some value for the x-axis. If you know that the rows all have the same spacing, then use the zeroth column (the row number) as the x-value:
plot "filename" using 0:2:xticlabels(1)
For my specific case, setting xrange [23:0] will suffice. However, this is not dynamic, as it does not work for unordered values, so I am still curious how the problem could otherwise be solved.

Prevent backward lines in gnuplot

I have some values given by clock time, where the first column is the time. However, the values up to 2 o'clock still belong to the current day. Given
3 1
12 4
18 1
21 2
1 3
2 0
named as test.data, I'd like to print this in gnuplot:
set xrange [0:24]
plot 'test.data' with lines
However, the plot contains a backward line. It's striking through the whole diagram.
Is there a way to tell gnuplot to explicitly not print such backward lines, or even better, print them wrapping around the x axis (e.g. in my example drawing the line as a forward line up to 24, and then continuing it at 0)?
Note: The x axis of the plot should still start at 0 and end at 24.
As for wrapping over the edge of the graph (a Pac-Man-like effect), gnuplot can't do that on its own. Even doing it manually, you would have to calculate the right point at which to re-enter the graph based on the slope of the connecting line, and insert new points into the data to control where the exiting line leaves and where the re-entering line comes back in. This would require external processing.
If you can do some outside preprocessing, adding a blank line before the 1 3 line will insert a discontinuity into the plot and prevent gnuplot from connecting those points (see help datafile for how gnuplot handles blank lines). Of course, you could always sort the data too.
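The blank-line preprocessing described above could be automated. The following is a rough sketch, not a tested tool: it inserts an empty line whenever the x value (first column) decreases compared with the previous point. The function name break_backward_steps is made up for this illustration.

```python
def break_backward_steps(lines):
    """Return the data lines with an empty line inserted wherever the
    x value (first column) is smaller than the previous x value."""
    out = []
    previous_x = None
    for line in lines:
        fields = line.split()
        if not fields:
            # Keep an existing blank line and start a new segment.
            out.append("")
            previous_x = None
            continue
        x = float(fields[0])
        if previous_x is not None and x < previous_x:
            out.append("")  # a blank line stops gnuplot from joining the points
        out.append(line.strip())
        previous_x = x
    return out


# Demo with the data from the question (test.data).
sample = ["3 1", "12 4", "18 1", "21 2", "1 3", "2 0"]
result = break_backward_steps(sample)
print("\n".join(result))
```

With the question's data this puts the blank line between 21 2 and 1 3, so plotting the output with lines draws two separate segments instead of one backward line.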
I would recommend sorting the data before plotting, but if you do want the wrapping effect, the following Python program (wrapper.py) will set up the data for it:
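# Read "x y" pairs from data.txt, one pair per line.
with open("data.txt") as f:
    data = [tuple(map(float, line.split())) for line in f if line.strip()]
data2 = sorted(data)
# After sorting, the line leaves the graph at the right-most point
# and re-enters towards the left-most point.
back_in_to = data2[0]
out_from = data2[-1]
xdelta = back_in_to[0] + 24 - out_from[0]
ydelta = back_in_to[1] - out_from[1]
slope = ydelta / xdelta
# y value where the wrapped line crosses x = 24 (and x = 0).
outy = out_from[1] + (24 - out_from[0]) * slope
print(0, outy)
for x in data2:
    print(*x)
    # A blank line after the last point of the original file breaks the line.
    if x[0] == data[-1][0]:
        print("")
print(24, outy)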
It reads in the data (assumed to be in data.txt) and calculates the points where the line should leave the graph and where it should re-enter, adding these points to the sorted data. It also prints a blank line after the last point of the original file, which causes the break in the line. We can then plot with:
plot "< python3 wrapper.py" with lines
If we look at your original plot
we see the backward line that you referred to which reaches from the furthest right point to the next left point. The plot that the python program pre-processed reaches through the right of the graph to move back to this point.

gnuplot: access a value from a file and plot that value against a column

I have 2 questions to do with gnuplot:
How can I access a specific value from a data file?
Can I plot that value against a column?
I have the following script:
plot for [iter=0:3:1] path_to_data using 1:(column(iter))
I would like to plot additional lines on this graph, based on values stored in path_to_expt.
The problem is that I want to plot an x value from one file (path_to_data: column(1)) against a constant y value from another file (path_to_expt). Is this possible?
I'm not sure of the syntax to access a cell from a file but the y value is stored at: path_to_expt[row(iter), column(1)].
Can I set a variable (expt_value) inside the above loop, equal to the y value, and plot it against path_to_sim[column(1)] as follows:
plot for [iter=0:3:1] path_to_data using 1:(column(iter)) , \
expt_value = path_to_expt[row(iter), column(1)] , \
path_to_sim using 1:expt_value
I tried this but my syntax is wrong and I can't find how to access a single value from a file. I don't know if I'll be able to plot a single value against a column but, if not, perhaps I could make a column of constant values to do this. Thank you for your help.
