gnuplot histogram with boxes (smooth frequency)

I have searched through many similar questions without finding one that matches my problem.
I want to use gnuplot to make a histogram from a data file of scattered values, with bars/boxes of equal width at equal intervals. So I need to count the values that fall within the width (=1) of each bar. That's why I wanted to use the 'smooth frequency' option:
#gnuplot
bin(x)=floor(x+0.5)
set boxwidth 0.8 relative
set style fill pattern
set grid
set xrange [0:11]
set yrange [0:3]
set xtics in 0,2,10
set mxtics 2
set ytics 0,1,3
set mytics 1
p 'data.dat' u (bin($1)):(1) smooth freq w boxes
#data.dat
2.489
7.5
9.128
9.567
I tried it, and the result was the same as with my handmade file plotted with boxes:
#gnuplot2
[...]
p 'data2.dat' w boxes
#data2.dat
2 1
8 1
9 1
10 1
Smooth frequency seems to do its job properly, but the resulting plot is not what I intended.
Then I figured out what the problem is. It goes away when I use my handmade data3.dat:
#gnuplot
p 'data3.dat' w boxes
#data3.dat
1 0
2 1
3 0
4 0
5 0
6 0
7 0
8 1
9 1
10 1
So the problem is the holes in my data range, which are not counted as '0'. With these holes, gnuplot seems to widen the boxes by itself to fill the empty space. How can I prevent this and get my desired result?

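I think the problem is the relative keyword. With relative, the width is a fraction of the default width, which gnuplot derives from the spacing to the neighbouring boxes, so boxes next to a gap become wider. With absolute, the width is given in x-axis units. Try: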
set boxwidth 0.8 absolute
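
To see where the holes come from, the binning can be reproduced outside gnuplot. The following is a minimal Python sketch, not part of the original answer; `bin_counts` is a hypothetical helper that applies the same rule as bin(x)=floor(x+0.5) from the question and optionally fills empty bins with zeros, which is what the handmade data3.dat does.

```python
import math
from collections import Counter

def bin_counts(values, lo=None, hi=None):
    """Count values per unit-width bin using floor(x + 0.5), like bin(x) in the question.

    If both lo and hi are given, every integer bin in [lo, hi] is reported,
    with 0 for bins that received no sample.
    """
    counts = Counter(math.floor(v + 0.5) for v in values)
    if lo is None or hi is None:
        return dict(sorted(counts.items()))
    return {b: counts.get(b, 0) for b in range(lo, hi + 1)}

data = [2.489, 7.5, 9.128, 9.567]
print(bin_counts(data))         # occupied bins only, as in data2.dat
print(bin_counts(data, 1, 10))  # zero-filled bins, as in data3.dat
```

The first call gives the sparse table that 'smooth frequency' effectively produces; the second gives the dense table in which every box has neighbours at distance 1.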

Related

Gnuplot histogram with boxes and a color per value

I would like to create a histogram with boxes using three pieces of data, first the number of iterations as the x-axis, then the execution time as the y-axis and finally the number of processes used.
I would like to see a bar for each number of processes used, and with a color specific to the value of the number of processes. How can I do this?
My test data is defined as:
"iterations" "processes" "time_execution"
1000 1 14
1000 2 10
1000 4 9
4000 1 60
4000 2 42
4000 4 45
7000 1 80
7000 2 70
7000 4 50
And here is my script so far, but I can't get it to place the three bars side by side:
set term svg
set output 'out.svg'
set boxwidth 1
set style fill solid 1.00 border 0
set style histogram
set size ratio 0.8
set xlabel 'Number of iterations'
set ylabel offset 2 'Time execution in seconds'
set key left Right
set key samplen 2 spacing .8 height 3 font ',10'
set title 'Time execution per iterations and processes used'
plot 'test.data' u 1:3:2 w boxes
Thanks!

I guess your data format doesn't fit the format that the histogram style expects. Check the examples on the gnuplot homepage. I think those examples are rather crowded, though, which might be confusing and may be the reason why there are so many histogram questions on SO.
If you modify your data format (see below) it will be easy to plot the histogram.
You can probably use any format, but the effort to prepare the data will be higher (see for example here: Gnuplot: How to plot a bar graph from flattened tables).
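
The reshaping itself can be done with any tool. As a minimal sketch in Python (outside gnuplot, and not part of the original answer; `to_wide` is a hypothetical helper), the rows of the question's test data can be pivoted into one line per iteration count with one column per process count:

```python
def to_wide(rows):
    """Pivot (iterations, processes, time) rows into one line per iteration count."""
    procs = sorted({p for _, p, _ in rows})
    table = {}
    for iterations, p, t in rows:
        table.setdefault(iterations, {})[p] = t
    # Header line: a placeholder for the x column, then one label per process count
    lines = ["xxx " + " ".join(str(p) for p in procs)]
    for iterations in sorted(table):
        # "NaN" marks a missing combination (an assumption; the sample data has none)
        cells = [str(table[iterations].get(p, "NaN")) for p in procs]
        lines.append(" ".join([str(iterations)] + cells))
    return lines

rows = [
    (1000, 1, 14), (1000, 2, 10), (1000, 4, 9),
    (4000, 1, 60), (4000, 2, 42), (4000, 4, 45),
    (7000, 1, 80), (7000, 2, 70), (7000, 4, 50),
]
print("\n".join(to_wide(rows)))
```

For the sample data this prints the same four lines that appear in the $Data block of the script.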
Script:
### plotting histogram requires suitable input data format
reset session
$Data <<EOD
xxx 1 2 4
1000 14 10 9
4000 60 42 45
7000 80 70 50
EOD
set style histogram clustered gap 1
set style data histogram
set boxwidth 0.8 relative
set style fill solid 0.3
set xlabel 'Number of iterations'
set xtics out
set ylabel 'Time execution in seconds'
set grid x,y
set key top center title "Processors"
set offset 0,0,0.5,0
plot for [col=2:4] $Data u col:xtic(1) ti col
### end of script
Result:

You can use lc variable
plot 'test.data' u 1:3:2 w boxes lc variable notitle
EDIT
notitle is not necessary, but it makes the plot look cleaner.

Is there any way to visualize the field on adaptive mesh with gnuplot?

I am a beginner in gnuplot. Recently I tried to visualize a pressure field on an adaptive mesh.
First I obtained the coordinates of the nodes and of each cell center, and the pressure value at each cell center.
Then I ran into a difficulty: the coordinates in the x and y directions are not regular, which makes it hard to prepare the source data in a suitable format. For a regular grid of equal rectangles I can simply use an x-y-z format. But is there a working approach for an adaptive mesh?

I understand that you have some x,y,z data which is not on a regular grid (well, your adaptive mesh).
I'm not fully sure whether this is what you are looking for, but
gnuplot can grid the data for you, i.e. interpolate/extrapolate your data onto a regular grid and then plot it.
Check help dgrid3d.
Code:
### grid data
reset session
# create some test data
set print $Data
do for [i=1:200] {
x = rand(0)*100-50
y = rand(0)*100-50
z = sin(x/15)*sin(y/15)
print sprintf("%g %g %g",x,y,z)
}
set print
set view equal xyz
set view map
set multiplot layout 1,2
set title "Original data with no regular grid"
unset dgrid3d
splot $Data u 1:2:3 w p pt 7 lc palette notitle
set title "Gridded data"
set dgrid3d 100,100 qnorm 2
splot $Data u 1:2:3 w pm3d
unset multiplot
### end of code
Result:

If you have the size of each cell, you can use the "boxxyerror" plotting style. Let xdelta and ydelta be half the size of a cell along the x-axis and y-axis.
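
If the mesh output lists the corner coordinates of each cell rather than its size, the five columns can be derived first. This is a minimal Python sketch under that assumption; `cell_to_boxxyerror` and the corner-based input layout are hypothetical and not part of the original answer.

```python
def cell_to_boxxyerror(xmin, ymin, xmax, ymax, pressure):
    """Convert a rectangular cell given by two opposite corners into
    (x, y, xdelta, ydelta, pressure), where x, y is the cell center and
    xdelta, ydelta are half the cell size along each axis."""
    x = 0.5 * (xmin + xmax)
    y = 0.5 * (ymin + ymax)
    xdelta = 0.5 * (xmax - xmin)
    ydelta = 0.5 * (ymax - ymin)
    return (x, y, xdelta, ydelta, pressure)

# Two cells of different size: (xmin, ymin, xmax, ymax, pressure)
cells = [(0, 0, 2, 2, 0), (0, 4, 4, 8, 4)]
for cell in cells:
    print("%g %g %g %g %g" % cell_to_boxxyerror(*cell))
```

The two example cells correspond to the rows "1 1 1 1 0" and "2 6 2 2 4" of the data block in the script.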
Script:
$datablock <<EOD
# x y xdelta ydelta pressure
1 1 1 1 0
3 1 1 1 1
1 3 1 1 1
3 3 1 1 3
2 6 2 2 4
6 2 2 2 4
6 6 2 2 5
4 12 4 4 6
12 4 4 4 6
12 12 4 4 7
EOD
set xrange [-2:18]
set yrange [-2:18]
set palette maxcolors 14
set style fill solid 1 border lc black
plot $datablock using 1:2:3:4:5 with boxxyerror fc palette title "mesh", \
$datablock using 1:2 with points pt 7 lc rgb "gray30" title "point"
pause -1
In this script, 5-column data (x, y, xdelta, ydelta, pressure) is given for "boxxyerror" plot. To colorize the cells, the option "fc palette" is required.
Result:
I hope this figure is what you are looking for.
Thanks.

Plotting Condition Lines

Suppose I have the following data:
"1,5"
"2,10"
""
"3,4"
"4,2"
""
"5,6"
"6,10"
I want to graph this using gnuplot with a line between each condition, similar to this display:
How might this be accomplished? I have looked into gridlines, but that does not seem to suit my need. I am also looking for a solution that will automatically draw condition / phase lines between each break in the data set.

As mentioned in the comments and explained in the linked question and its answers, you can draw arbitrary lines manually via set arrow ... (check help arrow).
However, if possible I don't want to adjust the lines manually every time I change the data or if I have many different plots.
But, hey, you are using gnuplot, so, make it automated!
To be honest, within the time figuring out how it can be done I could have changed a "few" lines and labels manually ;-). But now, this might be helpful for others.
The script below is written in such a way that it doesn't matter whether you have zero, one or two or more empty lines between the different blocks.
Comments:
the function valid(1) returns 1 if column(1) contains a valid number and 0 otherwise (check help valid).
the vertical lines are plotted with vectors (check help vectors). The x-position is taken as the average of the x-value before the label line and the x-value after the label line. The y-value LevelY is determined beforehand via stats (check help stats).
the labels are plotted with labels (check help labels) and positioned at the first x-value after each label line and at a y-value of LevelY with an offset.
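
The position rule for the vertical lines can be checked independently of gnuplot. The following Python sketch is an illustration only (`phase_boundaries` is a hypothetical helper, not part of the original answer): it treats every non-numeric line as a label and places a boundary halfway between the last x-value before the label and the first x-value after it. Empty lines and trailing comments are ignored, so the number of blank lines between blocks does not matter.

```python
def phase_boundaries(lines):
    """Return a list of (x_position, label) for every label line that is
    preceded and followed by at least one data row."""
    boundaries = []
    last_x = None    # x-value of the most recent data row
    pending = None   # label seen since the most recent data row
    for line in lines:
        text = line.split("#", 1)[0].strip()  # drop trailing comments
        if not text:
            continue                          # skip empty lines
        try:
            x = float(text.split()[0])
        except ValueError:
            pending = text                    # non-numeric line: a label
            continue
        if pending is not None and last_x is not None:
            boundaries.append(((last_x + x) / 2, pending))
        pending = None
        last_x = x
    return boundaries

sample = """Baseline
1 10.0
4 11.0 # zero empty lines follow
Treatment
5 45.0
8 31.0 # one empty line follows

Baseline
9 14.0
12 11.3 # two empty lines follow


Treatment
13 35.0
16 37.0""".splitlines()

print(phase_boundaries(sample))
```

The first label has no data row before it, so no boundary is produced for it. This matches the script, where the first line position evaluates to NaN.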
Script:
### automatic vertical lines and labels
reset session
$Data <<EOD
Baseline
1 10.0
2 12.0
3 10.5
4 11.0 # zero empty lines follow
Treatment
5 45.0
6 35.0
7 32.5
8 31.0 # one empty line follows

Baseline
9 14.0
10 12.8
11 12.0
12 11.3 # two empty lines follow


Treatment
13 35.0
14 45.0
15 45.0
16 37.0
EOD
set offset 1,1,1,1
set border 3
set title "Student Performance" font ",14"
set xlabel "Sessions"
set xtics 1 out nomirror
set ylabel "Number of Responses"
set yrange [0:]
set ytics out nomirror
set key noautotitle
set grid x,y
stats $Data u 2 nooutput
LevelY = STATS_max # get the max y-level
getLinePosX(col) = (v0=v1,(v1=valid(col))?(x0=x1,x1=column(1)):0, v0==0?(x0+x1)/2:NaN)
getLabel(col) = (v0=v1,(v1=valid(col))?0:(h1=strcol(1),h0=h1),column(1))
plot x1=NaN $Data u (y0=(valid(1)?$2:NaN),$1):(y0) w lp pt 13 ps 2 lw 2 lc "red", \
x1=v1=NaN '' u (getLinePosX(1)):(0):(0):(LevelY) w vec nohead lc "black" lw 1.5 dt 2, \
v1=NaN '' u (getLabel(1)):(LevelY):(sprintf("%s",v0==0?h0:'')) w labels left offset 0,1.5 font ",12"
### end of script
Result:

Gnuplot Histogram with overlapping bar pairs

I am trying to plot a bar chart/histogram that shows, for every point of the x axis, two bars, each of which is divided in three parts. Take the following dataset:
Min # Max # Avg # Min % Max % Avg %
6 12 6.67 13 100 35.25
0 6 3 0 90 43.25
235 1243 553 66.67 100 83.43
The idea is that for each row, there will be a pair of vertical bars, with the left one representing the three # values and the right one representing the three % values. The values are made-up, but the scale is more or less the real one.
So far, I have managed to get the following script, which is a frankenstein of several online scripts I found:
set ytics 10 nomirror tc lt 1
set y2tics 100 nomirror tc lt 2
set yrange [0:120]
set y2range [0:1500]
set style fill solid border -1
plot "table2.dat" using 5:xticlabels(1) with boxes lt rgb "#40FF00" t "Max \%",\
"" using 6 lt rgb "#406090" t "Avg \%",\
"" using 4 with boxes lt rgb "#403090" t "Min \%"
This will plot out the following chart:
What I cannot seem to figure out is how to put the second bar for the first three columns. Ideally, that "X" would also be replaced by a dotted line cutting the bar. The reason for the two Y axes is that each bar follows a different scale, so the second bar would have to be proportional to the right-side y axis. Finally, I had to add that little "hack" of making yrange higher than 100 so that the bars would not "hit the top". If there is another way to do that, that'd be great.
Thanks in advance for any help that can be given, I am a complete newbie at gnuplot but since trying to make this chart using spreadsheet tools was an even bigger pain, I am hopeful that someone'll be able to help with at least some of those problems.
Edit.: I will take suggestions for a better title for this question.

You can get a little further by setting the boxwidth to 0.4 of the default width and defining a function (I used f here) that converts your data in columns 1 to 3 into percentage values too. You also need to provide an explicit x coordinate with the syntax x:y, using $1 to refer to column 1 etc.; $0 is the zero-based row number.
set boxwidth 0.4 relative
f(y,max) = (y*100./max)
plot "table2.dat" \
using 5:xticlabels(1) with boxes lt rgb "#40FF00" t "Max \%",\
"" using 6 lt rgb "#406090" t "Avg \%",\
"" using 4 with boxes lt rgb "#403090" t "Min \%",\
"" using ($0-.5):(f($2,$2)) with boxes lt rgb "red" t "Max",\
"" using ($0-.5):(f($3,$2)) lt rgb "blue" t "Avg",\
"" using ($0-.5):(f($1,$2)) with boxes lt rgb "orange" t "Min"
I used some garish colours to show the new boxes:

Can You Calculate the Area of a Contour in Gnuplot?

I've been using gnuplot for a couple of weeks now. I have large data files with 23 variables, but I select specifically x-y co-ordinate data and fluorescence intensity data for my analysis.
One of the things I would like to do is a contour plot of my fluorescing particles. I should add that this contour plot is over time, so there will be several nearly overlapping spots that are in fact the same particle. I would like to draw contours around these spots, colour-code them according to intensity and have the area of each contour displayed on the graph.
I have achieved all but one of these goals for my contour plot. I cannot devise a way for gnuplot to calculate and display the area within the contour. If I could then I would have an estimate of the area of my particle. I recognise my goal may be beyond the capabilities of gnuplot, but if there were a solution then it would be very neat.
Here is my script for the contour plot which as I said gives everything I need bar the area within contours.
The co-ordinates are in nanometres and each point on the dataset is the centre of a molecule. I have taken a small range of co-ordinates because there is so much data, it would not be possible to distinguish otherwise (there are over 80 000 data points). I have also set a threshold of intensity as I only want relatively bright fluorescent particles (done with set cntrparam levels incremental 8000,5000,100000). $23 and $24 are the x and y co-ordinates respectively. $12 is the intensity.
#Contour plot of Fluorescent Particle Location with Intensity
#Gnuplot script file for plotting data in file "1002 all.txt"
reset
set dgrid3d 100,1000,1
set pm3d
set isosample 30
set xlabel 'x (nm)'
set ylabel 'y (nm)'
set contour base
set cntrparam levels incremental 8000,5000,100000
unset key
unset surface
set view map
set xrange[20000:22000]
set yrange[7000:10000]
splot "1002 all.txt" using ($23<22000 && $23>20000 ? $23 : 1/0):($24<10000 && $24>7000 ? $24 : 1/0):12 with lines
set terminal push
set terminal png
set output "1002_all_fluorophores_section_contour.png" # set the output filename
set terminal png size 1280,760
replot
set output

As @Christoph says, gnuplot might not be a numerical tool. However, calculating a polygon area is not too complicated and can easily be done with gnuplot alone. The assumption is that you have closed polygons, i.e. last point == first point, and that the data of the individual polygons are separated by two empty lines.
edit: script changed to work with gnuplot 4.6.0 as well.
Data: SO28173844.dat
1 1
2 1
2 2
1 2
1 1


3 1
5 4
9 0
8 4
7 4
9 8
6 8
4 9
0 6
3 1


4 0
5 3
7 1
4 0
Script: (works for gnuplot>=4.6.0, March 2012)
### calculate areas of closed polygons
reset
FILE = "SO28173844.dat"
set size ratio -1
set style fill solid 0.3
set grid x,y front
set key noautotitle
stats FILE u 0 nooutput # get number of blocks, i.e. polygons
N = STATS_blocks
getArea(colX,colY) = ($0==0?(Area=0, x1=column(colX), y1=column(colY)) : 0, \
x0=x1, y0=y1, x1=column(colX), y1=column(colY), Area=Area+0.5*(y1+y0)*(x1-x0))
getMinMax(colX,colY) = (x2=column(colX), y2=column(colY), $0==0? (xMin=xMax=x2, yMin=yMax=y2) : \
(x2<xMin?xMin=x2:0, x2>xMax?xMax=x2:0, y2<yMin?yMin=y2:0, y2>yMax?yMax=y2:0))
Areas = Centers = ''
do for [i=1:N] {
stats FILE u (getArea(1,2),getMinMax(1,2)) index i-1 nooutput
Areas = Areas.sprintf(" %g",abs(Area))
Centers = Centers.sprintf(' %g %g',0.5*(xMin+xMax),0.5*(yMin+yMax))
}
CenterX(n) = real(word(Centers,int(column(n))*2+1))
CenterY(n) = real(word(Centers,int(column(n))*2+2))
Area(n) = real(word(Areas,int(column(n)+1)))
myColors = "0xff0000 0x00ff00 0x0000ff"
myColor(i) = sprintf("#%06x",int(word(myColors,(i-1)%words(myColors)+1)))
plot for [i=1:N] FILE u 1:2 index i-1 w filledcurves lc rgb myColor(i), \
'+' u (CenterX(0)):(CenterY(0)):(sprintf("A=%g",Area(0))) every ::0::N-1 w labels center
### end of script
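
As a cross-check outside gnuplot, the same trapezoid sum that getArea() accumulates, Area + 0.5*(y1+y0)*(x1-x0), can be written in a few lines of Python. This is a minimal sketch for verification only; `polygon_area` and `bbox_center` are hypothetical helper names, and the three polygons are the ones from the data file above.

```python
def polygon_area(points):
    """Area of a closed polygon (last point == first point) as the absolute
    value of the trapezoid sum 0.5*(y1+y0)*(x1-x0) over all edges."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += 0.5 * (y1 + y0) * (x1 - x0)
    return abs(area)

def bbox_center(points):
    """Center of the bounding box, used in the script as the label position."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (0.5 * (min(xs) + max(xs)), 0.5 * (min(ys) + max(ys)))

polygons = [
    [(1, 1), (2, 1), (2, 2), (1, 2), (1, 1)],
    [(3, 1), (5, 4), (9, 0), (8, 4), (7, 4), (9, 8), (6, 8), (4, 9), (0, 6), (3, 1)],
    [(4, 0), (5, 3), (7, 1), (4, 0)],
]
for poly in polygons:
    print(polygon_area(poly), bbox_center(poly))
```

The sign of the raw sum depends on the orientation of the polygon, which is why both this sketch and the script take the absolute value.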
Result:
