I have been struggling with a basic beeswarm plot from page 62 in this doc. I imagine they are skipping some details, and I'm not sure what actual data they used. I think in particular the problem is mapping a categorical/string variable to an X-axis value.
I used this data:
A 1
A 2
A 3
B 4
B 5
B 6
With this script:
set terminal png
set output "graph.png"
set jitter
plot "data.csv" using 1:2:1 with points lc variable
I get this error:
"graph_script" line 4: warning: Skipping data file with no valid points
plot "data.csv" using 1:2:1 with points lc variable
"graph_script" line 4: x range is invalid
In their demos gallery, I see something like set xtics ("A" -1, "B" 0) which could maybe help me to label already-numeric data better, but what if my data doesn't start off numeric to begin with?
Do I need something like (hash_string_to_large_int($1) % 2)? There must be an easier way!
As mentioned in the comments you have to "convert" your keys into numbers in order to plot them.
You can do this by creating a list with your unique keywords and defining a function to get the indices.
First, the following example creates some random test data.
The code after that knows nothing about the keywords, so it builds the unique list from scratch from the random data.
There may be a simpler gnuplot-only solution that I am not aware of.
### bee-swarm plot with string keys
reset session
# create some random test data
myExts = '.py .sh .html'
set print $Data
do for [i=1:100] {
print sprintf("%s %d",word(myExts,int(rand(0)*3)+1),int(rand(0)*10+1)*5)
}
set print
# create a unique list of strings from a data stringcolumn
Uniques = ''
addToList(list,col) = list.( strstrt(list,'"'.strcol(col).'"') > 0 ? '' : ' "'.strcol(col).'"')
stats $Data u (Uniques = addToList(Uniques,1),0) nooutput
getIdx(key) = (_idx=NaN, sum [_i=1:words(Uniques)] (word(Uniques,_i) eq key ? _idx=_i : 0), _idx)
set offsets 0.5,0.5,1,1
set key noautotitle
set multiplot layout 1,2
set title "No jitter"
plot $Data u (idx=getIdx(strcol(1))):2:(idx):xtic(word(Uniques,idx)) w points pt 7 lc var
set title "With jitter"
set jitter
replot
unset multiplot
### end of code
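As a rough sketch, the same approach can be applied to the six-row data from the question. The functions are copied from the script above; the only assumption is that the data is supplied as an inline datablock named $Data rather than read from data.csv.

```gnuplot
### sketch: the question's A/B data with string keys mapped to indices
reset session

$Data <<EOD
A 1
A 2
A 3
B 4
B 5
B 6
EOD

# build a space-separated list of quoted unique keys from column 1
Uniques = ''
addToList(list,col) = list.( strstrt(list,'"'.strcol(col).'"') > 0 ? '' : ' "'.strcol(col).'"')
stats $Data u (Uniques = addToList(Uniques,1),0) nooutput

# position of a key in the list: 1 for "A", 2 for "B"
getIdx(key) = (_idx=NaN, sum [_i=1:words(Uniques)] (word(Uniques,_i) eq key ? _idx=_i : 0), _idx)

set offsets 0.5,0.5,1,1
set key noautotitle
set jitter
plot $Data u (idx=getIdx(strcol(1))):2:(idx):xtic(word(Uniques,idx)) w points pt 7 lc var
### end of sketch
```

Since set jitter only displaces points that would otherwise overlap, these six well-separated values will probably stay where they are; the effect becomes visible with denser data such as the random test set.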
Consider the following file that I want to plot using gnuplot: Servos20211222_105253.csv
# Date/Time 2021/12/22, 10:52:53
# PonE=0,LsKp=200,LsKi=0,LsKd=250,HsKp=40,HsKi=0,HsKd=130,Sp=800,TDEC=1175137
# Rel. Time, currentPos, PosPID, currentSpeed, speedPID, Lag, ServoPos
I would like to:
set the plot title to the date/time from the first comment record.
display the record that starts "# PonE" as a caption.
extract the value for TDEC and plot a horizontal line with the name "Target"
I have some influence over the format of the header records, so if (for example) it would be better that they were not comments but provided in some other way, then that can be done.
It is a common problem to get text values from files using only gnuplot. If you can use OS- and shell-dependent solutions, I'd suggest removing the comment characters from the header lines and trying something like
set title "`head -1 Servos20211222_105253.csv`"
You can place text anywhere using set label <"label text">, where the label text can be the 2nd line from the file.
You can plot a straight line using plot:
p sin(x), 0.5 title "TDEC"
But instead of 0.5, you need to get the value using shell scripts again, e.g. the cut unix command.
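A minimal sketch of that shell-based route, calling sed through gnuplot's system() function instead of editing the file by hand. It assumes a Unix-like shell with sed available, the header layout exactly as shown in the question, and comma-separated numeric rows after the header; FILE and myTitle are names chosen here for illustration.

```gnuplot
# sketch only: relies on an external sed, so it is not portable to every system
FILE = "Servos20211222_105253.csv"

# line 1 of the file without the leading "# "
myTitle = system(sprintf("sed -n '1s/^# *//p' '%s'", FILE))

# line 2 of the file: drop everything up to and including "TDEC="
TDEC = real(system(sprintf("sed -n '2s/.*TDEC=//p' '%s'", FILE)))

set title myTitle
set datafile separator comma
plot FILE u 1:2 w l title "Data", \
     TDEC w l title "Target"
```

Because the header lines still start with #, gnuplot skips them as comments when it reads the numeric data.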
There are ways with gnuplot only, although sometimes a bit cumbersome compared with using tools which you have available on Linux (or comparable tools which you need to install on Windows).
Update: shorter and "simplified" script
One possible gnuplot-only way:
set commentschar to nothing, i.e. ''
assign the columns to variables and/or arrays, e.g. myDate, myTime, P[1..9].
Merge P[1..8] into a multi-line string Params by "mis"-using sum (check help sum)
Convert P[9] into a floating point number TDEC for plotting
Script: (modified the data a bit just for illustration)
### extract values from headers with gnuplot only
reset session
$Data <<EOD
# Date/Time 2021/12/22, 10:52:53
# PonE=0,LsKp=200,LsKi=0,LsKd=250,HsKp=40,HsKi=0,HsKd=130,Sp=800,TDEC=1175137
# Rel. Time, currentPos, PosPID, currentSpeed, speedPID, Lag, ServoPos
EOD
set datafile separator comma commentschar ''
array P[9] # array to store parameters
stats $Data u ($0==0 ? (myDate=strcol(1)[3:], myTime=strcol(2)) : \
sum [_i=1:9] (P[_i] = _i==1 ? strcol(_i)[3:] : strcol(_i) ,0 )) \
every ::0::1 nooutput
set datafile commentschar # set back to default
Params = P[1]
Params = (sum [_i=2:8] (Params=Params.sprintf("\n%s",P[_i]),0),Params)
set title sprintf("%s %s", myDate, myTime)
TDEC = real(P[9][6:]) # convert to real number
set label 1 at graph 0.02, first TDEC P[9] offset 0,-0.7
set label 2 at graph 0.02, graph 0.85 Params
plot $Data u 1:2 w lp pt 7 title "Data", \
TDEC w l lc "red" title "Target"
### end of script
I am new to gnuplot, and I am trying to plot this data (gnuplot receives this input from stdin):
Regular 5
Block 3
Symbolic 8
Char 3
Socket 7
with this gnuplot code:
set style data histograms
set style fill solid
set terminal png
set output "plot.png"
plot '-' using 2:xtic(1), \
'' using 0:($2 + .1) with labels notitle
I get the error Not enough columns for this style. What am I doing wrong? If I remove the last line with labels, I am able to plot the histogram. How can I modify it to get data labels on top of each histogram bar?
Three columns of information (x, y, text) are needed for with labels. You gave coordinates but no actual text. Try
$DATA << EOD
Regular 5
Block 3
Symbolic 8
Char 3
Socket 7
EOD
set style data histograms
set style fill solid
set yrange [0:*]
plot $DATA using 2:xtic(1), \
'' using 0:($2 + .1):2 with labels notitle
Try this:
plot 'input_file' using 2, '' using 0:2:1 with labels offset 0, char 1
Note that I have put the values in a file named input_file and have used set yrange [0:10] to make the plot nicer to look at.
I have developed a CGI in bash/HTML that allows me to generate a graph of my clusters.
Here is an example:
This is a graph that works well. The problem is that for some graphs, the percentages overlap or shift far too far from where they should be. Here is my gnuplot code:
f(w) = (strlen(w) > 10 ? word(w, 1) . "\n" . word(w, 2) : w)
set title "TITLE"
set terminal png truecolor size 960, 720 background rgb "#eff1f0"
set output "/var/www/html/CLUSTER_NAME.png"
set bmargin at screen 0.1
set key top center
set grid
set style data histograms
set style fill solid 1.00 border -1
set boxwidth 0.7 relative
set yrange [*:*]
set format y "%g%%"
set datafile separator ","
plot 'test1.txt' using 2:xtic(f(stringcolumn(1))) title " CPU consumption (%) ", \
'' using 3 title " RAM consumption (%)", \
'' using 0:($2+1):(sprintf(" %g%%",$2)) with labels notitle, \
'' using 0:($3+1):(sprintf(" %g%%",$3)) with labels notitle
Here is an example of a graph that does not work properly because the percentages are shifted too far:
I am able to change this by changing this line in my code:
'' using 0:($3+1):(sprintf(" %g%%",$3)) with labels notitle
To:
'' using 0:($3+1):(sprintf("      %g%%",$3)) with labels notitle
Adding spaces shifts the percentages:
But even if it works for this graph, it moves the percentages for the other graphs too:
I can't get "clean" graphics. Either the percentages overlap, or they fall outside the plot area because the values are too large, or they are completely shifted.
Another example:
Is there a way to make all this move by itself, automatically, according to the values and therefore the size of the bars etc?
You might try an alternative mechanism, using plot for [i=2:3] ... to loop through the 2 columns of values. Instead of guessing the number of spaces to indent, you estimate the x position of each bar using column(0)+(i-2)*.25 (for i = 2, then 3), a value I arrived at by trial and error.
For example, using a function mytitle to get the 2 titles (my gnuplot is too old for an array):
mytitle(x) = (x==2?"cpu":"ram")
plot for [i=2:3] 'data' using i:xtic(stringcolumn(1)) title mytitle(i), \
for [i=2:3] '' using (column(0)+(i-2)*.25):(column(i)+1):\
(sprintf("%g%%",column(i))) with labels notitle
I am trying to plot multiple data lines with their titles in the key based on the variable which I am using as the index:
plot for [i=0:10] 'filename' index i u 2:7 w lines lw 2 t ' = '/(0.5*i)
However, it cannot seem to do this for a fractional multiple of i. Is there a way around this other than to set the title for each line separately?
sprintf should provide all the functionality needed, e.g.,
plot for [i=0:10] .... t sprintf(" = %.1f", 0.5*i)
to print the value of 0.5*i with one decimal digit.
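A small self-contained sketch to check the formatting before using it on real data. The sin() curves are only stand-ins for the question's 'filename' index i u 2:7 data, which is not available here.

```gnuplot
# print the eleven key titles to the console: " = 0.0", " = 0.5", ... " = 5.0"
do for [i=0:10] {
    print sprintf(" = %.1f", 0.5*i)
}

# the same sprintf call used as the title inside a plot iteration
plot for [i=0:10] sin(x - 0.5*i) w lines lw 2 t sprintf(" = %.1f", 0.5*i)
```

The format %.1f fixes the number of decimals, so whole values such as 1.0 keep their trailing zero and the key entries line up.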