logscale histogram of a matrix - gnuplot

I have data organized like this:
XPos Data1 Data2 Data3
100 2 3 4
1000 20 30 40
10000 200 300 400
And I would like to draw a bar chart where the first column is used as X, and each data row is used as a cluster.
The problem is that I need a log scale on Y, while the X columns should be drawn with equal width and equal spacing between them.
Something like this:
Is it possible in gnuplot? When I use logscale, I get this message:
Log scale on X is incompatible with histogram plots
Or is it possible using Octave?

I may be misunderstanding what you need.
However, using the following script:
set ytics auto
set logscale y
set style data histogram
set style fill solid border -1
plot 'data.dat' u 2:xtic(1) t col, '' u 3 t col, '' u 4 t col
gives me the following plot:
I guess set logscale y is the key.

Related

How to sum an entire column in gnuplot?

I have multiple CSV files like this (just one column):
3
4
2.3
0.1
Now I want to create a gnuplot bar chart that has <filename>:<sum of the column>.
But currently I struggle with summing up a single column:
plot 'data1.txt' using 0:(sum [col = 0:MAXCOL] (col)) with linespoint;
The command you show is summing each row rather than each column.
(1) If you can transpose rows/columns in your CSV file before feeding it to gnuplot, this command would produce a plot close to what you ask for. Note that MAXCOL is really the number of rows (not columns) in the original data file.
set boxwidth 0.5
set style fill solid
plot 'transpose_of_original' using 0:(sum [col=0:MAXCOL] col) with boxes
(2) Alternatively, you can do the summing in gnuplot by first accumulating the sums and then plotting them afterward:
# get number of columns
stats 'data1.txt' nooutput
NCOL = STATS_columns
array SUM[NCOL]
# get sum for each column
do for [col=1:NCOL] {
stats 'data1.txt' using col nooutput
SUM[col] = STATS_sum
}
# Now we plot the sums in a bar chart
set style fill solid
set boxwidth 0.5
set xlabel "Column"
set ylabel "Sum"
plot SUM using 1:2 with boxes
With help from @Ethan, I was able to solve my problem:
array files = ['data1.txt', 'data2.txt']
array SUM[|files|]
do for [i=1:|files|] {
stats files[i] using 1 nooutput
SUM[i] = STATS_sum
}
set style fill solid
set boxwidth 0.5
set xlabel 'File'
set ylabel 'Sum'
set yrange [0:]
plot SUM using 1:2:xticlabels(files[column(0)+1]) with boxes
data1.txt:
11
22
33
44
data2.txt:
11
2
33
4
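As a cross-check outside gnuplot, the per-file sums that the script above plots can be reproduced with a short Python sketch. This is only an illustration: the in-memory strings stand in for data1.txt and data2.txt, and the helper name column_sum is hypothetical.

```python
import io

# Hypothetical in-memory stand-ins for data1.txt and data2.txt from the post.
FILES = {
    "data1.txt": "11\n22\n33\n44\n",
    "data2.txt": "11\n2\n33\n4\n",
}


def column_sum(stream, column=0):
    """Sum one whitespace-separated column, skipping blank lines."""
    total = 0.0
    for line in stream:
        fields = line.split()
        if fields:
            total += float(fields[column])
    return total


# One sum per file, which is what each bar in the chart represents.
sums = {name: column_sum(io.StringIO(text)) for name, text in FILES.items()}
print(sums)
```

The bar heights in the gnuplot chart should match these values, one bar per file.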

How to remove line between "jumping" values, in gnuplot?

I would like to draw a line through data points that contain "jumping" values.
Here is an example: if the data holds sin(x) for several cycles and we plot it with lines, an unrealistic line appears that runs back across the plot from right to left (as shown in the following figure).
One way to avoid this might be to plot with linespoints, but I want to draw it without revising the original data file.
Is there a simple and robust solution to this problem?
Assuming that you are plotting a function, that is, for each x value there exists one and only one corresponding y value, the easiest way to achieve what you want is to use the smooth unique option. This smoothing routine will make the data monotonic in x, then plot it. When several y values exist for the same x value, the average will be used.
Example:
Data file:
0.5 0.5
1.0 1.5
1.5 0.5
0.5 0.5
Plotting without smoothing:
set xrange [0:2]
set yrange [0:2]
plot "data" w l
With smoothing:
plot "data" smooth unique
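To make the effect of smooth unique concrete, here is a rough Python sketch of the behaviour described above (sort by x, average the y values that share an x). It is an illustration of the idea under that description, not gnuplot's actual implementation, and the function name smooth_unique is chosen only for this example.

```python
from collections import defaultdict


def smooth_unique(points):
    """Group y values by x, average each group, and return points sorted by x."""
    groups = defaultdict(list)
    for x, y in points:
        groups[x].append(y)
    return [(x, sum(ys) / len(ys)) for x, ys in sorted(groups.items())]


# The four-point data file from the example; x = 0.5 appears twice.
data = [(0.5, 0.5), (1.0, 1.5), (1.5, 0.5), (0.5, 0.5)]
result = smooth_unique(data)
print(result)
```

The repeated point at x = 0.5 collapses into one, so the line no longer runs back to the left.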
Edit: points are lost if this solution is used, so I would like to improve my answer.
"Conditional plotting" can be applied here. Suppose we have a file like this:
1 2
2 5
3 3
1 2
2 5
3 3
i.e. the x value jumps backward between the 3rd and 4th points.
plot "tmp.dat" u 1:2
Find minimum x value:
stats "tmp.dat" u 1:2
prev=STATS_min_x
Or find first x value:
prev=system("awk 'FNR == 1 {print $1}' tmp.dat")
Plot the line if current x value is greater than previous, or don't plot if it's less:
plot "tmp.dat" u ($0==0? prev:($1>prev? $1:1/0), prev=$1):2 w l
OK, it's not impossible, but the following is a ghastly hack. I really advise you add an empty line in your dataset at the breaks.
$dat << EOD
1 1
2 2
3 3
1 5
2 6
3 7
1 8
2 9
3 10
EOD
plot for [i=0:3] $dat us \
($0==0?j=0:j=j,llx=lx,lx=$1,llx>lx?j=j+1:j=j,i==j?$1:NaN):2 w lp notit
This plots your dataset three times (actually four; there is a small error in there, and I guess I would have to initialise all variables), counts how often the abscissa values "jump", and only plots data points if this counter j is equal to the plot counter i.
Check the help on the serial evaluation operator "a, b" and the ternary operator "a?b:c".
If you have data in a repetitive x-range where the corresponding y-values do not change, then @Miguel's smooth unique solution is certainly the easiest.
In a more general case, what if the x-range is repetitive but y-values are changing, e.g. like a noisy sin(x)?
Then compare two consecutive x-values x0 and x1, if x0>x1 then you have a "jump" and make the linecolor fully transparent, i.e. invisible, e.g. 0xff123456 (scheme 0xaarrggbb, check help colorspec). The same "trick" can be used when you want to interrupt a dataline which has a certain forward "jump" (see https://stackoverflow.com/a/72535613/7295599).
Minimal solution:
plot x1=NaN $Data u 1:2:(x0=x1,x1=$1,x0>x1?0xff123456:0x0000ff) w l lc rgb var
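The test that drives this trick (compare each x with the previous one and treat x0 > x1 as a "jump") can be sketched in Python as a check of the logic. Instead of making a segment transparent, this sketch cuts the series into separate runs at each backward jump. The function name split_at_jumps is hypothetical, and the sample points are borrowed from the nine-point dataset in the earlier answer.

```python
def split_at_jumps(points):
    """Cut a list of (x, y) points into runs wherever x moves backward."""
    segments = []
    current = []
    prev_x = None
    for x, y in points:
        # Same condition as x0 > x1 in the gnuplot expression.
        if prev_x is not None and prev_x > x:
            segments.append(current)
            current = []
        current.append((x, y))
        prev_x = x
    if current:
        segments.append(current)
    return segments


data = [(1, 1), (2, 2), (3, 3), (1, 5), (2, 6), (3, 7), (1, 8), (2, 9), (3, 10)]
segments = split_at_jumps(data)
for segment in segments:
    print(segment)
```

Each run corresponds to one stretch of visible line in the gnuplot plot; the connecting segment between runs is the one that gets the transparent colour.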
Script:
### plot "folded" data without connecting lines
reset session
# create some test data
set table $Data
plot [0:2*pi] for [i=1:4] '+' u 1:(sin(x)+rand(0)*0.5) w table
unset table
set xrange[0:2*pi]
set key noautotitle
set multiplot layout 1,2
plot $Data u 1:2 w l lc "red" ti "data as is"
plot x1=NaN $Data u 1:2:(x0=x1,x1=$1,x0>x1?0xff123456:0x0000ff) \
w l lc rgb var ti "\n\n\"Jumps\" removed\nwithout changing\ninput data"
unset multiplot
### end of script

gnuplot: min and max values for arbitrary number of columns

I'm trying to plot an arbitrary number of lines in the same plot. My data file looks like the following:
1 10 15 20
2 20 25 30
3 30 35 40
4 40 45 50
5 50 55 60
I'm using multiplot to do this:
set multiplot
do for [i=1:ny] {
plot 'data.dat' u 1:i+1 with lines lc i title word(names,i)
}
unset multiplot
where ny=3 in this example. As expected, the yrange of each plot is different, so the graph looks very messy. I'm trying to add
set yrange [ymin:ymax]
where ymin=min(col2,col3,col4,...,coln) is the minimum value among all the columns 2-n and ymax is the maximum value. However, I still don't know how to get ymin and ymax. The stats command allows me to get minimums and maximums for one or two columns at a time, but no more. Even if I do this column by column, I still don't know how to get the maximum among n scalars.
Any idea?
You can use an if statement inside a loop. Here is the code:
ymin = 1e300   # start ymin at a very large value
ymax = -1e300  # start ymax at a very large negative value, so negative data also works
do for [i=1:ny] {
stats "data.dat" u i+1
if (STATS_min < ymin) {ymin=STATS_min}
if (STATS_max > ymax) {ymax=STATS_max}
}
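The running-minimum and running-maximum idea in that loop can be checked against the sample data with a small Python sketch. This is only an illustration of the logic: the function name global_range and the inline ROWS string are assumptions made for the example, and the accumulators start at plus and minus infinity so that any data range works.

```python
# The five-row sample from the question; column 1 is x, columns 2-4 are y.
ROWS = """\
1 10 15 20
2 20 25 30
3 30 35 40
4 40 45 50
5 50 55 60
"""


def global_range(text, first_col=1):
    """Return (ymin, ymax) over every column from first_col onward."""
    ymin = float("inf")
    ymax = float("-inf")
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        values = [float(v) for v in fields[first_col:]]
        ymin = min([ymin] + values)
        ymax = max([ymax] + values)
    return ymin, ymax


ymin, ymax = global_range(ROWS)
print(ymin, ymax)
```

These two numbers are what one would pass to set yrange [ymin:ymax] in the multiplot version.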
Usually, multiplot isn't for drawing multiple plots in one graph, but to draw several beneath each other. I guess you want to iterate inside the plot command:
plot for [i=1:ny] 'data.dat' u 1:i+1 with lines lc i title word(names, i)
This uses ranges which cover the values of all sub-plots. And it gets the key right.

Plotting horizontal lines in gnuplot on an existing graph, using the same coloured lines

The starting point is that I have a graph with 4 lines on it. They are the results of my simulation, plotted over an x-axis of iteration, at 4 different locations. I also have experimental values at each of those locations. I want to plot those 4 experimental values as horizontal lines on the same graph. I would also like the line colours of the simulation and experiment results at each location to be the same.
With @Tom's help, below, I have got the following script to do this:
unset bars
max = 1e6
set xrange[7000:24000]
set yrange[-0.5:1.5]
plot for [i=2:5] 'sim' using 1:(column(i)) ls i, \
for [i=1:4] 'expt' using (1):1:(max) every ::(i-1)::(i-1) with xerror ls i ps 0
The problem is that I want the values in xrange[x_min:x_max] and yrange[y_min:y_max] to be taken from sim and expt as follows:
x_min = min(sim[:1]) # where min(sim[:1]) means "min value in file 'sim' col 1"
x_max = max(sim[:1])
y_min = min(sim[:2],sim[:3],sim[:4],sim[:5],expt[:1])
y_max = max(sim[:2],sim[:3],sim[:4],sim[:5],expt[:1])
My OS is Scientific Linux: Release 6.3, Kernel Linux 2.6.32-358.2.1.el6.x86_64, GNOME 2.28.2
sim and expt are .txt files
A representative sample of sim is:
7520 0.282511 0.0756715 -0.222863 -0.0898819
7521 0.315944 0.201687 -0.321723 -0.106345
7522 0.230956 0.102217 -0.34196 -0.061009
7523 1.460043 -0.00118292 -0.045077 0.673926
A representative sample of expt is:
1.112
0.123
-0.45
0.862
Thank you for your help.
I think that this is a way to solve your problem:
unset bars
max = 1e6
set xrange[0:8]
plot for [i=1:4] 2*i+sin(x) ls i, \
for [i=1:4] 'expt' using (1):1:(max) every ::(i-1)::(i-1) with xerror ls i ps 0
Based on some information I found on Gnuplot tricks, I have (ab)used error bars to produce horizontal lines based on the points in this data file:
2
4
6
8
The (1):1:(max) specifies that a point should be plotted at the coordinate (1, y), where y is read from the data file. The max is the value of xdelta, which determines the size of the x error bar. This is one way of achieving a horizontal line in your plot, as a suitably large value of max will result in an error bar across the entire xrange of your plot.
Here's what the output looks like:
Suppose you have a data file with five columns: one with the x-values and four with y-values. You also have an additional file from which a number path_to_expt comes. To plot the columns and one horizontal line at the y-value path_to_expt, you can use
plot for [i=2:5] path_to_file using 1:(column(i))
This plots column 2 against 1, 3 vs 1, 4 vs 1 and 5 vs 1. To get different styles, just use set linetype to redefine the automatically assigned line types:
set linetype 1 lc rgb 'orange'
# ... other lt definitions
plot for [i=2:5] path_to_file using 1:(column(i))
If you don't want to overwrite the existing linetypes 1..4, use e.g. 11..14:
set linetype 11 lc rgb 'orange'
# ...
plot for [i=2:5] path_to_file using 1:(column(i)) lt (9 + i)
Finally, in order to plot a horizontal line, using the same x-values as in the data file, use
mynumber = 27
plot path_to_file using 1:(mynumber)
If you don't put a number in parentheses, it is interpreted as column number (like the 1 here), whereas put inside parentheses, it is treated as number.
Another option would be to set arrows:
set arrow from graph 0, first mynumber to graph 1, first mynumber lt 1
plot for [i=2:5] path_to_file using 1:(column(i))

Gnuplot histogram x logscale

I'm using gnuplot in a bash script to draw several things.
For this particular graph, I need to plot the number of matrices (y-axis) against the matrix size (x-axis).
As the distribution can be quite sparse, I want to use a log scale for both x and y. It works well with y, but gnuplot tells me I can't have a log scale on the x-axis when I'm using the histogram style.
Any ideas to debug this? or on how to present the results using a similar way?
set style data histogram
set style histogram cluster gap 1
set style fill solid border -1
set logscale xy
plot '$res/histo-$ld-$lr-$e-$r' using 2:xtic(1) title 'Run'
The error is :
line 0: Log scale on X is incompatible with histogram plots
Thanks in advance.
Edit : btw, I was using gnuplot 4.4 patchlevel 4 and just updated to the newest version (i.e. 4.6 patchlevel 5)
Gnuplot histograms work a bit differently from what you might think. The x-axis isn't numeric. In your case the value in the first row, second column is placed at an x-value of 0 with the y-value taken from the second column and a manual label taken from the first column, first row. The values of the second row are placed at x=1 etc.
You can try using the boxes plotting style, which is used with a 'conventional' x-axis and supports a logscale in x:
set logscale xy
set offset 0,0,1,1
set boxwidth 0.9 relative
set style fill solid noborder
plot 'data.dat' with boxes
With the data file data.dat
1 1000
2 300
5 150
20 10
135 3
this gives the result (with 4.6.5):
In order to have a fixed box width and a varying box distance, you can use a third column to specify the box width as a fraction of the x-value (here 50%):
set logscale xy
set offset 0,0,1,1
set style fill solid noborder
plot 'data.dat' using 1:2:($1*0.5) with boxes
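Why a width proportional to x looks constant on a logarithmic axis can be verified with a little arithmetic. The Python sketch below assumes the box is centred on x and spans x - w/2 to x + w/2 with w = 0.5*x, and measures that span in decades; the helper name log_width is hypothetical.

```python
import math

# x values from the sample data.dat in this answer.
xs = [1, 2, 5, 20, 135]


def log_width(x, fraction=0.5):
    """Width in decades of a box centred on x whose linear width is fraction * x."""
    half = fraction * x / 2
    return math.log10(x + half) - math.log10(x - half)


widths = [log_width(x) for x in xs]
for x, w in zip(xs, widths):
    print(x, round(w, 4))
```

Every box spans log10(1.25/0.75) decades regardless of x, which is why the boxes appear equally wide while the gaps between them vary with the data.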
Putting the actual values on the x-axis works as follows:
set logscale xy
set offset 0,0,1,1
set style fill solid noborder
plot 'data.dat' using 1:2:($1*0.5):xtic(1) with boxes
