I am currently using a script to generate histogram plots, e.g., by doing:
set style histogram cluster gap 4
plot for [COL=2:10] 'example.dat' u COL:xticlabels(1) title columnheader(COL)
Now I wish to add the y-values (numbers) above the bars in the histogram but adding w labels gives the 'Not enough columns for this style' error.
plot for [COL=2:10] 'example.dat' u COL:xticlabels(1) title columnheader(COL), \
for [COL=2:10] 'example.dat' u COL title '' w labels
Is it possible to add y-labels using the histogram style?
Note: I know that there are examples for plotting with boxes. I wish to make this work with the histogram style if possible.
Here's a test datafile I came up with:
example.dat
hi world foo bar baz qux
1 2 3 4 5 6
4 5 7 3 6 5
Here's the script I used to plot it:
set yrange [0:*]
GAPSIZE=4
set style histogram cluster gap 4
STARTCOL=2 #Start plotting data in this column (2 for your example)
ENDCOL=6 #Last column of data to plot (10 for your example)
NCOL=ENDCOL-STARTCOL+1 #Number of columns we're plotting
BOXWIDTH=1./(GAPSIZE+NCOL) #Width of each box.
plot for [COL=STARTCOL:ENDCOL] 'example.dat' u COL:xtic(1) w histogram title columnheader(COL), \
for [COL=STARTCOL:ENDCOL] 'example.dat' u (column(0)-1+BOXWIDTH*(COL-STARTCOL+GAPSIZE/2+1)-0.5):COL:COL notitle w labels
Each cluster of histograms takes a total width of 1 unit on the x axis. We know how many widths we need (the number of boxes +4 since that is the gapsize). We can calculate the width of each box (1/(N+4)). We then plot the histograms as normal. (Note that I added with histogram to the plot command).
According to the built-in help, labels require 3 columns of data (x y label). In this case, the y position and the label are the same and can be read directly from the column COL. The x position of the first block is centered at 0 (and has a total width of 1). So, the first block is going to be located at x=-0.5+2*BOXWIDTH. The 2 here is because the gap is 4 box widths: 2 on the left and 2 on the right. The next block is going to be located at -0.5+3*BOXWIDTH, etc. In general (as a function of COL), we can write this as
-0.5+BOXWIDTH*(COL-STARTCOL+1+GAPSIZE/2)
We need to shift this to the right by 1 unit for each additional block we read. Since each block corresponds to 1 line in the data file, we can use pseudo-column 0 (i.e. column(0) or $0) for this since it gets incremented for each "record/line" gnuplot reads. The 0th record holds the titles, the first record holds the first block. Since we want a function which returns 0 for the first record, we use column(0)-1. Putting it all together, we find that the x-position is:
(column(0)-1-0.5+BOXWIDTH*(COL-STARTCOL+1+GAPSIZE/2))
which is equivalent to what I have above.
Related
I would like to draw a line for data that contains "jumping" values.
Here is an example: when we have data for sin(x) over several cycles and plot it, an unrealistic line appears that goes across from right to left (as shown in the following figure).
One idea to avoid this might be using with linespoints, but I want to draw it without revising the original data file.
Is there a simple and robust solution for this problem?
Assuming that you are plotting a function, that is, for each x value there exists one and only one corresponding y value, the easiest way to achieve what you want is to use the smooth unique option. This smoothing routine will make the data monotonic in x, then plot it. When several y values exist for the same x value, the average will be used.
Example:
Data file:
0.5 0.5
1.0 1.5
1.5 0.5
0.5 0.5
Plotting without smoothing:
set xrange [0:2]
set yrange [0:2]
plot "data" w l
With smoothing:
plot "data" smooth unique
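A quick way to see the averaging is to tabulate the smoothed data. This is a minimal sketch, assuming gnuplot 5 or newer (for inline datablocks and set table $...). The data is hypothetical: the last y value of the example above is changed so that x=0.5 carries two different y values. $Data and $Smoothed are names made up for the illustration.

```gnuplot
# Hypothetical variant of the example data: x=0.5 appears twice with different y
$Data << EOD
0.5 0.5
1.0 1.5
1.5 0.5
0.5 1.5
EOD

# Write the smoothed points into a datablock instead of drawing them
set table $Smoothed
plot $Data u 1:2 smooth unique
unset table

# Inspect what smooth unique produced
print $Smoothed
```

The tabulated output should be sorted by x and contain a single entry for x=0.5, carrying the mean of the two y values.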
Edit: points are lost if this solution is used, so here is an improved answer.
"Conditional plotting" can be applied here. Suppose we have a file like this:
1 2
2 5
3 3
1 2
2 5
3 3
i.e. the x value jumps back between the 3rd and 4th points.
plot "tmp.dat" u 1:2
Find minimum x value:
stats "tmp.dat" u 1:2
prev=STATS_min_x
Or find first x value:
prev=system("awk 'FNR == 1 {print $1}' tmp.dat")
Plot the line if current x value is greater than previous, or don't plot if it's less:
plot "tmp.dat" u ($0==0? prev:($1>prev? $1:1/0), prev=$1):2 w l
OK, it's not impossible, but the following is a ghastly hack. I really advise you add an empty line in your dataset at the breaks.
$dat << EOD
1 1
2 2
3 3
1 5
2 6
3 7
1 8
2 9
3 10
EOD
plot for [i=0:3] $dat us \
($0==0?j=0:j=j,llx=lx,lx=$1,llx>lx?j=j+1:j=j,i==j?$1:NaN):2 w lp notit
This plots your dataset three times (actually four; there is a small error in there, I guess I have to initialise all variables), counts how often the abscissa values "jump", and only plots data points if this counter j is equal to the plot counter i.
Check the help on the serial evaluation operator "a, b" and the ternary operator "a?b:c"
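For comparison, here is the blank-line approach recommended above, as a sketch assuming the data may be edited after all. It uses gnuplot 5 datablock syntax; a file with the same blank lines should behave the same way. $dat_split is a name made up for the illustration.

```gnuplot
# Same values as above, with one blank line at each "jump" of the abscissa
$dat_split << EOD
1 1
2 2
3 3

1 5
2 6
3 7

1 8
2 9
3 10
EOD

# A single blank line ends the current line segment, so no line is drawn back
plot $dat_split us 1:2 w lp notit
```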
If you have data in a repetitive x-range where the corresponding y-values do not change, then @Miguel's smooth unique solution is certainly the easiest.
In a more general case, what if the x-range is repetitive but y-values are changing, e.g. like a noisy sin(x)?
Then compare two consecutive x-values x0 and x1: if x0>x1, you have a "jump", and you make the linecolor fully transparent, i.e. invisible, e.g. 0xff123456 (scheme 0xaarrggbb, check help colorspec). The same "trick" can be used when you want to interrupt a data line at a certain forward "jump" (see https://stackoverflow.com/a/72535613/7295599).
Minimal solution:
plot x1=NaN $Data u 1:2:(x0=x1,x1=$1,x0>x1?0xff123456:0x0000ff) w l lc rgb var
Script:
### plot "folded" data without connecting lines
reset session
# create some test data
set table $Data
plot [0:2*pi] for [i=1:4] '+' u 1:(sin(x)+rand(0)*0.5) w table
unset table
set xrange[0:2*pi]
set key noautotitle
set multiplot layout 1,2
plot $Data u 1:2 w l lc "red" ti "data as is"
plot x1=NaN $Data u 1:2:(x0=x1,x1=$1,x0>x1?0xff123456:0x0000ff) \
w l lc rgb var ti "\n\n\"Jumps\" removed\nwithout changing\ninput data"
unset multiplot
### end of script
Result:
So I have some data files in format
x y ymin ymax
that I'm plotting with yerrorbars.
Now how would I best add a median of the y values to the plot running over the whole range of x?
UPDATE
I'm also plotting the x axis in logscale which seems to prevent using STATS.
Suppose that your data looks like this:
1 8 6 9
2 6 5 7
3 5 4 8
4 6 5 8
We can use the stats command to find the median. The use is similar to the plot command. Here, we only need to do analysis of the second column, so we will only specify the second column:
stats datafile u 2 nooutput
The nooutput option tells the command not to print the results. If we wish to see the full analysis, we simply omit that specification. By default, the stats command stores its results in variables of the form STATS_*. We can use a different prefix if desired. See help stats for more details.
At this point, we have a variable STATS_median that stores the median of the y values (which is 6 for the sample data). We can now add the median to the graph in one of two ways. First we can simply add a plot specification to the existing plot command:
plot datafile u 1:2:3:4 with yerrorbars, STATS_median
or we can add a line with the set arrow command and then plot just the yerrorbars:
set arrow 1 from graph 0, first STATS_median to graph 1, first STATS_median nohead
plot datafile u 1:2:3:4 with yerrorbars
Here we give the x coordinate in graph units which range from 0 (the left side) to 1 (the right side) and the y coordinate in the first coordinate system which corresponds to the y1 axis. Specifying nohead says to not draw an arrow head. The 1 immediately following set arrow tags this arrow as arrow 1 so that we can change or remove it easily later.
Other options are available. See help arrow for more details.
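Putting the pieces together on the sample data gives the sketch below. It assumes gnuplot 5 or newer for the inline datablock $Data (a hypothetical name standing in for the data file). Regarding the logscale update in the question: one workaround that may help on versions where stats refuses to run with a log axis is to call stats before set logscale, as done here.

```gnuplot
# Sample data from above: x y ymin ymax
$Data << EOD
1 8 6 9
2 6 5 7
3 5 4 8
4 6 5 8
EOD

# Compute the statistics while the axes are still linear
stats $Data u 2 nooutput
median_y = STATS_median

# Only now switch the x axis to log scale
set logscale x

# Error bars plus a horizontal line at the median
plot $Data u 1:2:3:4 with yerrorbars title "data", \
     median_y with lines title sprintf("median = %g", median_y)
```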
The starting point is that I have a graph with 4 lines on it. They are the results of my simulation, plotted over an x-axis of iteration, at 4 different locations. I also have experimental values at each of those locations. I want to plot those 4 experimental values as horizontal lines on the same graph. I would also like the line colours of the simulation and experiment results at each location to be the same.
With @Tom's help, below, I have got the following script to do this:
unset bars
max = 1e6
set xrange[7000:24000]
set yrange[-0.5:1.5]
plot for [i=2:5] 'sim' using 1:(column(i)) ls i, \
for [i=1:4] 'expt' using (1):1:(max) every ::(i-1)::(i-1) with xerror ls i ps 0
The problem is that I want the values in xrange[x_min:x_max] and yrange[y_min:y_max] to be taken from sim and expt as follows:
x_min = min(sim[:1]) # where min(sim[:1]) means "min value in file 'sim' col 1"
x_max = max(sim[:1])
y_min = min(sim[:2],sim[:3],sim[:4],sim[:5],expt[:1])
y_max = max(sim[:2],sim[:3],sim[:4],sim[:5],expt[:1])
My OS is Scientific Linux: Release 6.3, Kernel Linux 2.6.32-358.2.1.el6.x86_64, GNOME 2.28.2
sim and expt are .txt files
A representative sample of sim is:
7520 0.282511 0.0756715 -0.222863 -0.0898819
7521 0.315944 0.201687 -0.321723 -0.106345
7522 0.230956 0.102217 -0.34196 -0.061009
7523 1.460043 -0.00118292 -0.045077 0.673926
A representative sample of expt is:
1.112
0.123
-0.45
0.862
Thank you for your help.
I think that this is a way to solve your problem:
unset bars
max = 1e6
set xrange[0:8]
plot for [i=1:4] 2*i+sin(x) ls i, \
for [i=1:4] 'expt' using (1):1:(max) every ::(i-1)::(i-1) with xerror ls i ps 0
Based on some information I found on Gnuplot tricks, I have (ab)used error bars to produce horizontal lines based on the points in this data file:
2
4
6
8
The (1):1:(max) specifies that a point should be plotted at the coordinate (1, y), where y is read from the data file. The max is the value of xdelta, which determines the size of the x error bar. This is one way of achieving a horizontal line in your plot, as a suitably large value of max will result in an error bar across the entire xrange of your plot.
Here's what the output looks like:
Consider that you have a data file with five columns: one with the x-values and four with y-values. You also have an additional file from which a number path_to_expt comes. In order to plot the columns and one horizontal line having the y-value path_to_expt you can use
plot for [i=2:5] path_to_file using 1:(column(i))
This plots column 2 against column 1, 3 vs 1, 4 vs 1 and 5 vs 1. To get different styles, just use set linetype to redefine the automatically assigned line types:
set linetype 1 lc rgb 'orange'
# ... other lt definitions
plot for [i=2:5] path_to_file using 1:(column(i))
If you don't want to overwrite the existing linetypes 1..4, use e.g. 11..14:
set linetype 11 lc rgb 'orange'
# ...
plot for [i=2:5] path_to_file using 1:(column(i)) lt (9 + i)
Finally, in order to plot a horizontal line, using the same x-values as in the data file, use
mynumber = 27
plot path_to_file using 1:(mynumber)
If you don't put a number in parentheses, it is interpreted as a column number (like the 1 here), whereas inside parentheses it is treated as a literal number.
Another option would be to set arrows:
set arrow from graph 0, first mynumber to graph 1, first mynumber lt 1
plot for [i=2:5] path_to_file using 1:(column(i))
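The range part of the question (taking x_min, x_max, y_min and y_max from the files) could be handled with the stats command. This is a sketch only, assuming gnuplot 5 or newer. The datablocks $sim and $expt are hypothetical stand-ins holding the sample rows from the question; with the real files, replace them with 'sim' and 'expt'.

```gnuplot
# Stand-ins for the files 'sim' and 'expt' (sample rows from the question)
$sim << EOD
7520 0.282511 0.0756715 -0.222863 -0.0898819
7521 0.315944 0.201687 -0.321723 -0.106345
7522 0.230956 0.102217 -0.34196 -0.061009
7523 1.460043 -0.00118292 -0.045077 0.673926
EOD
$expt << EOD
1.112
0.123
-0.45
0.862
EOD

# x range: extremes of column 1 of the simulation data
stats $sim using 1 nooutput
x_min = STATS_min
x_max = STATS_max

# y range: start from the experimental values ...
stats $expt using 1 nooutput
y_min = STATS_min
y_max = STATS_max

# ... then widen it with every simulation column
do for [i=2:5] {
    stats $sim using (column(i)) nooutput
    y_min = (STATS_min < y_min) ? STATS_min : y_min
    y_max = (STATS_max > y_max) ? STATS_max : y_max
}

set xrange [x_min:x_max]
set yrange [y_min:y_max]
print x_min, x_max, y_min, y_max
```

Collecting the extremes into plain variables keeps the later plot command unchanged. Only the two set ... range lines differ from the hard-coded version in the question.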
I have a text file with 2 columns of numbers corresponding to (x,y) coords.
4 1
4 5
1 1
1 5
2.5 3
How do I tell gnuplot to plot these points and label each point with its corresponding row #? (Please keep in mind I'm going to apply this to a much larger file with 100 points, so I'm looking for a way to do it automagically, rather than have to create a 3rd column of data corresponding to row numbers).
You can use the with labels flag to the plot command. By default this places the label instead of the point at the place where the point would be. with labels takes the offset flag (and any flag you can pass to set label) so you can have the label next to the point. Here is an example script:
#!/usr/bin/env gnuplot
reset
set terminal pngcairo
set output 'test.png'
set xr [0:5]
set yr [0:6]
plot 'data.dat' pt 7, \
'data.dat' using 1:2:($0+1) with labels offset 1 notitle
which produces this output:
Can I get gnuplot to display the exact y-value or height of a data point (plotted using "with boxes") over its bar? I would like the plot to be easy to read so nobody has to line up the top of a bar with the y-axis and guess what the value is.
You can use the labels style and combine it into the plot command with the boxes style. The labels style expects 3 columns of data - the x coordinate, the y coordinate, and the actual label text.
For example, with the following data
1 4
2 6
3 2
4 8
the command (we set the yrange to 0 - 10 and boxwidth to 0.9 and set a solid fill style)
plot datafile u 1:2 with boxes, "" u 1:2:2 with labels offset char 0,1
produces
Normally, the labels would be centered on the specified point (the top edge of the box). By specifying an offset, we can move them up to just above the box. Here we used no offset in the x direction, but a unit of 1 in the y direction. We used the character coordinate system, so this corresponds to moving up by one character unit.
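For reference, a self-contained version of the above, including the settings mentioned in passing (yrange, boxwidth, fill style). This is a sketch assuming gnuplot 5 or newer for the inline datablock; $Data is a hypothetical name standing in for datafile.

```gnuplot
# Sample data from above: x y
$Data << EOD
1 4
2 6
3 2
4 8
EOD

set yrange [0:10]        # leave headroom for the labels above the tallest bar
set boxwidth 0.9         # slightly narrower than the spacing between x values
set style fill solid

# Boxes first, then the y value as a label one character height above each bar
plot $Data u 1:2 with boxes notitle, \
     $Data u 1:2:2 with labels offset char 0,1 notitle
```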
I can only think of putting the values where you want them "manually" like this:
set label "value" at 12,34
The numbers are coordinates according to your x and y ranges.
An automatic way would use "with labels", see e.g.
http://gnuplot.sourceforge.net/demo_4.4/stringvar.html