GNUPlot matrix plot with changing distance between lines - gnuplot

In GNUPlot you can make 3d plots based on a .dat file with a matrix notation:
#Y 0.1 0.2 0.3 0.4
0 1 4 9 #X = 1
1 2 5 10 #X = 2
4 5 8 13 #X = 3
9 10 13 18 #X = 5
16 17 20 25 #X = 7
25 26 29 34 #X = 10
However, the file I want to plot has uneven X spacing between the lines, as shown in the comments. One can use set xtics, but that only changes the labels on the plot, while the points should be positioned at their true X values on a linear axis.
Is there a way to do this?

No, not with this type of matrix notation. You would have to use a format as described here: http://t16web.lanl.gov/Kawano/gnuplot/datafile-e.html#3dim
The matrix format assumes an even spacing between x and y points, but the 3D data format allows arbitrary positioning of all the points.
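A minimal sketch, in Python, of converting the matrix file from the question into explicit x y z lines. It assumes the layout shown in the question: the Y values sit in the #Y header line and each row's X value sits in its trailing "#X = ..." comment. The name matrix_to_xyz is made up for this illustration.

```python
import re

# Matrix data copied from the question.
MATRIX_TEXT = """\
#Y 0.1 0.2 0.3 0.4
0 1 4 9 #X = 1
1 2 5 10 #X = 2
4 5 8 13 #X = 3
9 10 13 18 #X = 5
16 17 20 25 #X = 7
25 26 29 34 #X = 10
"""


def matrix_to_xyz(text):
    """Return 'x y z' lines, one block per matrix row."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # Header line: '#Y' followed by the y coordinate of each column.
    ys = [float(tok) for tok in lines[0].split()[1:]]
    blocks = []
    for ln in lines[1:]:
        values, comment = ln.split("#", 1)
        # The x coordinate is the number in the trailing '#X = ...' comment.
        x = float(re.search(r"X\s*=\s*([-+0-9.eE]+)", comment).group(1))
        zs = [float(tok) for tok in values.split()]
        blocks.append("\n".join(f"{x:g} {y:g} {z:g}" for y, z in zip(ys, zs)))
    # A blank line between blocks separates the scan lines of the grid.
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    print(matrix_to_xyz(MATRIX_TEXT), end="")
```

Written to a file, the output carries the true X positions in its first column, so the uneven spacing is kept when it is plotted as three-column data.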

Related

gnuplot: transform axis of matrix plot with "every"

I have a problem with plotting matrices in gnuplot. I am plotting one row of a matrix with the every option, like this:
plot inputfile matrix every 1:1:(4+N*M+1):100:(4+N*(M+1)):100 with linespoint
where 100 is the row number. It gave me this result:
nearly good result
I would like the x-range to run from 0 to 360, but when I use something like this
plot inputfile matrix using ($1*11.25):2 every 1:1:(4+N*M+1):100:(4+N*(M+1)):100 with linespoint
it doesn't work: wrong result
What can I do about it?
You don't provide data, so I created some for the following example.
As I understand it, you want to plot a certain row of a matrix and adjust the x-range.
Check help matrix every.
For example, in plot FILE u 1:2:3 matrix, 1 is the column, 2 the row and 3 is the (z)-value.
And in plot FILE u 1:3 matrix every ::c:r:c:r, c is the column and r is the row (counting starts from 0).
So the example below plots the 4th row and the x-range is adjusted from 0 to 360.
Code:
### plotting a certain row while adjusting the x-range
reset session
$Data <<EOD
1 2 3 4 5 6 7
2 3 4 5 6 7 8
3 4 5 6 7 8 9
4 5 6 7 8 9 10
5 6 7 8 9 10 11
EOD
set key top left
plot $Data u ($1*60):3 matrix every :::3::3 w lp pt 7
### end of code
Result:
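The points selected by this plot command can also be reproduced outside gnuplot. The following Python sketch (an illustration, not part of the original answer) lists the (column, row, value) triples of the same matrix, keeps row index 3 and scales the column index by 60. The helper names matrix_triples and row_points are hypothetical.

```python
# Same matrix as the $Data block in the gnuplot script.
DATA = [
    [1, 2, 3, 4, 5, 6, 7],
    [2, 3, 4, 5, 6, 7, 8],
    [3, 4, 5, 6, 7, 8, 9],
    [4, 5, 6, 7, 8, 9, 10],
    [5, 6, 7, 8, 9, 10, 11],
]


def matrix_triples(matrix):
    """Yield (column, row, value) for every cell; both indices start at 0."""
    for r, row in enumerate(matrix):
        for c, value in enumerate(row):
            yield c, r, value


def row_points(matrix, row, x_scale):
    """Points of one row, with the column index scaled to the wanted x-range."""
    return [(c * x_scale, v) for c, r, v in matrix_triples(matrix) if r == row]


if __name__ == "__main__":
    # Row index 3 is the 4th row; 7 columns times 60 span 0 to 360.
    for x, y in row_points(DATA, row=3, x_scale=60):
        print(x, y)
```

The seven columns have indices 0 to 6, so a factor of 60 maps them onto 0 to 360.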

Contour plots of noisy data - gridding and averaging

I am trying to make a contour plot from a dataframe in which the x and y coordinates are unevenly spaced and sometimes overlap and the z coordinate is noisy:
x y z
1 15.4707 174.6779 1592.811638
2 15.4707 171.3179 1304.953183
3 61.6107 108.2379 1687.233377
4 46.3707 151.6929 1688.368690
5 30.7107 124.5429 1339.451757
6 31.1307 202.8704 1616.756963
7 0.2307 141.5029 1620.288736
8 15.4707 141.9054 1167.798302
9 46.3707 72.0729 1687.546227
10 15.4707 212.6929 638.059709
What I'd like to do is to define a grid in x and y whose gridlines pass through the coordinates, say
x=[7.5, 22.5, 37.5, 52.5]
y=[60, 120, 180, 240]
In every grid section, I then take the average of the z values and make a new dataframe where the x and y columns are the centres of the grid sections and the z column is the aforementioned average. The dataframe should look something like
x y z
1 15 90 1621.1
2 30 150 1444.2
3 45 210 1651.7
From this stage it is easy to get a contour plot using matplotlib.pyplot.contourf or similar, but how can I do this type of gridding and averaging? Is there an elegant way to do it in Pandas or other Python packages?
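A minimal sketch of the gridding and averaging described above, kept to the Python standard library so it runs without extra packages. It assumes that each cell is the half-open interval between two neighbouring gridlines and that points outside the outermost gridlines are dropped; the helper names cell_centre and grid_average are made up for this illustration. In pandas, the same binning is commonly written with pandas.cut followed by a groupby and mean.

```python
from bisect import bisect_right
from collections import defaultdict
from statistics import mean

# (x, y, z) rows copied from the question.
POINTS = [
    (15.4707, 174.6779, 1592.811638),
    (15.4707, 171.3179, 1304.953183),
    (61.6107, 108.2379, 1687.233377),
    (46.3707, 151.6929, 1688.368690),
    (30.7107, 124.5429, 1339.451757),
    (31.1307, 202.8704, 1616.756963),
    (0.2307, 141.5029, 1620.288736),
    (15.4707, 141.9054, 1167.798302),
    (46.3707, 72.0729, 1687.546227),
    (15.4707, 212.6929, 638.059709),
]
X_EDGES = [7.5, 22.5, 37.5, 52.5]
Y_EDGES = [60, 120, 180, 240]


def cell_centre(value, edges):
    """Centre of the interval [edges[i], edges[i+1]) holding value, else None."""
    i = bisect_right(edges, value) - 1
    if i < 0 or i >= len(edges) - 1:
        return None  # outside the grid
    return (edges[i] + edges[i + 1]) / 2


def grid_average(points, x_edges, y_edges):
    """Average z per grid cell; returns sorted (x_centre, y_centre, z_mean)."""
    cells = defaultdict(list)
    for x, y, z in points:
        xc, yc = cell_centre(x, x_edges), cell_centre(y, y_edges)
        if xc is not None and yc is not None:
            cells[(xc, yc)].append(z)
    return [(xc, yc, mean(zs)) for (xc, yc), zs in sorted(cells.items())]


if __name__ == "__main__":
    for row in grid_average(POINTS, X_EDGES, Y_EDGES):
        print("%g %g %.1f" % row)
```

Only cells that contain at least one point appear in the result, so empty cells would still need a fill value or interpolation before contouring.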

Create a frequency diagram using a dataframe in Pandas (Python3)

I currently have a list of the number of items and their frequency stored in a data frame called transactioncount_freq.
Item Frequency
0 1 3474
1 2 2964
2 3 1532
3 4 937
4 5 360
5 6 168
6 7 57
7 8 25
8 9 5
9 10 5
10 11 3
11 12 1
How would I make a bar chart using the item values as the x axis and the frequency values as the y axis using pandas and matplotlib.pyplot?
You can plot it easily like this:
transactioncount_freq.plot(x='Item', y='Frequency', kind='bar')

gnuplot: fetching a variable value from different row/column for calculations

I want to get a specific value from another row and column to normalize my data. The tricky part is that this value changes for every data point in my data set.
Here is what my data set looks like:
64 22370 1 585 1 10
128 47547 1 4681 1 10
256 291761 1 37449 1 10
128 48446 1.019 4681 1 10
256 480937 1.648 37449 1 10
128 7765 0.163 777 0.166 10
256 7164 0.025 1393 0.037 10
128 37078 0.780 4681 1 10
256 334372 1.146 37449 1 10
128 45543 0.958 4681 1 10
128 5579 0.117 649 0.139 10
128 40121 0.844 4529 0.968 10
128 49494 1.041 4681 1 10
# --> here it starts to repeat
64 48788 1 585 1 20
128 110860 1 4681 1 20
256 717797 1 37449 1 20
128 101666 0.917 4681 1 20
......
......
This data file contains all points for a total of 13 different sets, so I plot it with something like this:
plot\
'../logs.dat' every 13::1 u 6:2 title '' with lines lt 3 lc 'black' lw 1,\
'../logs.dat' every 13::3 u 6:2 title '' with lines lt 3 lc 'black' lw 1,\
Now I am trying to normalize my data. The reference value is in row 1, column 2 (rows counted from 0, i.e. the $2 value of the second line), and the reference row moves down by 13 rows for every subsequent data point.
For example: The first data set I want to plot would be
(10:47547/47547)
(20:110860/110860)
...
The second plot should be
(10:48446/47547)
(20:101666/110860)
...
And so on.
In pseudocode it would read something like
plot\
'../logs.dat' every 13::1 u 6:($2 / take i:$2 for i = i + 13 ) title '' with lines lt 3 lc 'black' lw 1,\
'../logs.dat' every 13::3 u 6:($2 / take i:$2 for i = i + 13 ) title '' with lines lt 3 lc 'black' lw 1,\
I hope I have made clear what I am trying to achieve.
Thank you for any help!
If the value you want to use for normalisation is the very first to be plotted, then something like this is possible:
plot y0=-1e10, "data" using 1:(y0 == -1e10 ? (y0 = $2, 1) : $2/y0)
The normalisation value y0 is initialised to -1e10 on every replot. Check the help for ternary operator and serial evaluation.
But really you'd better pre-process your data.
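As a sketch of what such pre-processing might look like, the following Python snippet divides column 2 of every row by column 2 of the reference row in the same cycle. It assumes the file consists of repeating cycles of equal length with the reference in row index 1 of each cycle, as in the question; normalise_cycles and parse are hypothetical helper names, and the demo regroups a few rows from the question into cycles of 3 to stay short.

```python
def normalise_cycles(rows, cycle=13, ref=1, col=1):
    """Divide rows[i][col] by the reference row's value within each cycle.

    rows  : list of lists of floats, one per data line
    cycle : number of rows per repeating group (13 in the question)
    ref   : row index inside a cycle that holds the reference value
    col   : zero-based column to normalise (1 corresponds to gnuplot's $2)
    """
    out = []
    for start in range(0, len(rows), cycle):
        group = rows[start:start + cycle]
        if len(group) <= ref:
            break  # incomplete trailing cycle without a reference row
        y0 = group[ref][col]
        for row in group:
            new_row = list(row)
            new_row[col] = row[col] / y0
            out.append(new_row)
    return out


def parse(text):
    """Parse whitespace-separated numbers, skipping blank and '#' lines."""
    return [[float(t) for t in ln.split()]
            for ln in text.splitlines()
            if ln.strip() and not ln.lstrip().startswith("#")]


if __name__ == "__main__":
    # Rows picked from the question, regrouped into cycles of 3 for brevity.
    demo = parse("""
        64 22370 1 585 1 10
        128 47547 1 4681 1 10
        128 48446 1.019 4681 1 10
        # next cycle
        64 48788 1 585 1 20
        128 110860 1 4681 1 20
        128 101666 0.917 4681 1 20
    """)
    for row in normalise_cycles(demo, cycle=3):
        print(" ".join("%g" % v for v in row))
```

The normalised file can then be plotted with plain every 13::k selections, without any bookkeeping inside the using expression.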
If I understood your question correctly you want to normalize some of your data in a special way.
For the first plot you want to start from the second line (row index 1), divide the value in column 2 by itself, and continue for every 13th row.
So, this divides the values of the second column for the following row indices: 1/1, 14/14, 27/27, ..., (n*13+1)/(n*13+1). This is trivial because the result is always 1.
For the second plot you want to start with the value in column 2 of row index 3, divide it by the value in column 2 of row index 1, and repeat this for every 13th row,
i.e. the row indices involved are: 3/1, 16/14, 29/27, ..., (n*13+3)/(n*13+1).
For the second case, a construct with every 13 alone will not work, because you need every 13th value and also every 13th value shifted by 2 rows.
So, what you can do:
When you pass row index 1 (and every 13th row after it), remember the value in column 2; when you pass row index 3, divide its column-2 value by the remembered value and plot the result; otherwise plot NaN. Repeat this for all rows, cycled by 13. You can use the pseudocolumn 0 (check help pseudocolumns) and the modulo operator (check help operators binary).
If you want a continuous line with lines or linespoints, you need set datafile missing NaN, because NaN values would otherwise interrupt the lines (check help missing). However, this works only for gnuplot>=5.0.6. For gnuplot 5.0.0 (the version in the OP's question) you have to use a workaround.
Script:
### special normalization of data
reset session
$Data <<EOD
1 900 3 4 5 10
2 1000 3 4 5 10
3 1050 3 4 5 10
4 1100 3 4 5 10
5 1150 3 4 5 10
6 1200 3 4 5 10
7 1250 3 4 5 10
8 1300 3 4 5 10
9 1350 3 4 5 10
10 1400 3 4 5 10
11 1450 3 4 5 10
12 1500 3 4 5 10
13 1550 3 4 5 10
#
1 1900 3 4 5 20
2 2000 3 4 5 20
3 2050 3 4 5 20
4 2100 3 4 5 20
5 2150 3 4 5 20
6 2200 3 4 5 20
7 2250 3 4 5 20
8 2300 3 4 5 20
9 2350 3 4 5 20
10 2400 3 4 5 20
11 2450 3 4 5 20
12 2500 3 4 5 20
13 2550 3 4 5 20
#
1 2900 3 4 5 30
2 3000 3 4 5 30
3 3050 3 4 5 30
4 3100 3 4 5 30
5 3150 3 4 5 30
6 3200 3 4 5 30
7 3250 3 4 5 30
8 3300 3 4 5 30
9 3350 3 4 5 30
10 3400 3 4 5 30
11 3450 3 4 5 30
12 3500 3 4 5 30
13 3550 3 4 5 30
EOD
M = 13 # cycle of your data
set datafile missing NaN # only for gnuplot>=5.0.6
plot $Data u 6:(1) every M w lp pt 7 lc "red" ti "Normalized 1/1", \
'' u 6:(int($0)%M==1?y0=$2:0,int($0)%M==3?$2/y0:NaN) w lp pt 7 lc "blue" ti "Normalized 3/1"
### end of code
Result:

Heatmap with Gnuplot on a non-uniform grid

I would like to create a heatmap with gnuplot based on a non-uniform grid, meaning that my x-axis bins do not all have the same width. I can't figure out how to do that: when I plot my data with, for example, "with image", I get uniformly sized boxes which do not correspond to my coordinates at all (because "image" treats the data just as a matrix, I guess). So I would like to find a method to get non-uniform boxes which are also positioned in the right place on the Cartesian plane.
My data look something like this:
1 1 0.2
1 2 0.8
1 3 0.1
1 4 0.2
2 1 0.7
2 2 0.2
2 3 0.3
2 4 0.1
5 1 0.2
5 2 0.4
5 3 0.1
5 4 0.9
7 1 0.3
7 2 0.2
7 3 0.9
7 4 0.6
If I run this command on Gnuplot
set xrange [1:10]
p 'mydata.dat' with image
I get an image with 16 boxes that have the same width and height (apparently I don't have enough "reputation" on Stackoverflow to post an image, otherwise I would), but ideally I would like the boxes to have different widths and be in the right place on the plane. For example the first box should range from 1 to 2, the second one from 2 to 5, the third one from 5 to 7, and the last one from 7 to 10 (which is why I wrote set xrange [1:10]).
Could anyone help me please? Thank you very much!
The easiest (maybe only viable) way is to add some dummy data points and use splot ... with pm3d. This plotting style handles heatmaps with general quadrangles.
The image plotting style plots one box (one big pixel) for each data point, while pm3d takes each data point as a corner of one or more quadrangles. The color of each quadrangle is determined by the values at its corners and is adjustable with set pm3d corners2color.
So, in your case you need to expand the 4x4 matrix to a 5x5 matrix (extending to the right and to the top) and let the lower left corner determine the color with set pm3d corners2color c1.
The changed data file is then:
1 1 0.2
1 2 0.8
1 3 0.1
1 4 0.2
1 5 0.5

2 1 0.7
2 2 0.2
2 3 0.3
2 4 0.1
2 5 0.5

5 1 0.2
5 2 0.4
5 3 0.1
5 4 0.9
5 5 0.5

7 1 0.3
7 2 0.2
7 3 0.9
7 4 0.6
7 5 0.5

10 1 0.5
10 2 0.5
10 3 0.5
10 4 0.5
10 5 0.5
To plot it use
set pm3d map corners2color c1
set autoscale fix
set ytics 1
splot 'mydata.dat' using 1:($2-0.5):3 notitle
The result with 4.6.3 is:
In general, the z-value of the dummy data points doesn't matter, but in the above script it should lie somewhere between the minimum and maximum values to allow set autoscale fix to work properly on the color scale.
If you don't want to change the data file manually, you could do it with some script, but that's a different question.
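One possible shape for such a script, sketched in Python under stated assumptions: the input lies on a complete rectangular grid, the caller chooses where the extra column and row go (x_end and y_end are hypothetical parameter names), and the dummy z defaults to the middle of the data range so it stays inside the color scale. The function names pad_for_pm3d and to_text are invented for this illustration.

```python
# (x, y, z) points copied from the question's data file.
HEATMAP_POINTS = [
    (1, 1, 0.2), (1, 2, 0.8), (1, 3, 0.1), (1, 4, 0.2),
    (2, 1, 0.7), (2, 2, 0.2), (2, 3, 0.3), (2, 4, 0.1),
    (5, 1, 0.2), (5, 2, 0.4), (5, 3, 0.1), (5, 4, 0.9),
    (7, 1, 0.3), (7, 2, 0.2), (7, 3, 0.9), (7, 4, 0.6),
]


def pad_for_pm3d(points, x_end, y_end, z_dummy=None):
    """Add one dummy column (x = x_end) and one dummy row (y = y_end).

    points : list of (x, y, z) on a complete rectangular grid
    Returns a list of blocks (one per x value), each a list of (x, y, z).
    """
    xs = sorted({p[0] for p in points})
    ys = sorted({p[1] for p in points})
    z_of = {(x, y): z for x, y, z in points}
    if z_dummy is None:
        zs = list(z_of.values())
        z_dummy = (min(zs) + max(zs)) / 2  # stays inside the colour range
    blocks = []
    for x in xs + [x_end]:
        # Existing grid points keep their z; new corner points get z_dummy.
        blocks.append([(x, y, z_of.get((x, y), z_dummy)) for y in ys + [y_end]])
    return blocks


def to_text(blocks):
    """Format blocks as grid data, with a blank line between x-blocks."""
    return "\n\n".join(
        "\n".join("%g %g %g" % p for p in block) for block in blocks
    ) + "\n"


if __name__ == "__main__":
    print(to_text(pad_for_pm3d(HEATMAP_POINTS, x_end=10, y_end=5)), end="")
```

With x_end=10 and y_end=5 this yields a 5x5 grid whose added points carry z = 0.5, the midpoint of the data range 0.1 to 0.9.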
Here is an alternative solution without splot ... pm3d, but with boxxyerror.
If you plot data it should go as automatic as possible and there should be no need to "invent" and manually add data.
The following solution (a little more complex) derives the widths (+/-dx) and heights (+/-dy) of the boxes according to the following principle:
if it is an "inner" box, take half the distance to the adjacent datapoint on that side
if it is an "outer" box, take half the distance to the adjacent "inner" datapoint
Here, x-distances are irregular and y-distances are regular, but y-distances could also be irregular.
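The box-size rule can be sketched in Python as follows, purely to illustrate the arithmetic (the gnuplot script itself does the equivalent with its own helper functions). The function name box_edges is hypothetical, and the coordinates are assumed to be sorted in ascending order.

```python
def box_edges(coords):
    """Return (low, high) box edges for each coordinate in a sorted list.

    Inner sides reach halfway to the neighbour on that side; the outer side
    of the first and last coordinate reuses the half-distance to its only
    neighbour.
    """
    n = len(coords)
    if n < 2:
        raise ValueError("need at least two coordinates")
    edges = []
    for i, c in enumerate(coords):
        if i > 0:
            left = (c - coords[i - 1]) / 2
        else:
            left = (coords[1] - coords[0]) / 2      # outer side, first box
        if i < n - 1:
            right = (coords[i + 1] - c) / 2
        else:
            right = (coords[-1] - coords[-2]) / 2   # outer side, last box
        edges.append((c - left, c + right))
    return edges


if __name__ == "__main__":
    # x positions from the data file: 1, 2, 5, 7
    for low, high in box_edges([1, 2, 5, 7]):
        print(low, high)
```

For the x positions 1, 2, 5, 7 this gives boxes spanning 0.5 to 1.5, 1.5 to 3.5, 3.5 to 6 and 6 to 8, so neighbouring boxes touch without gaps or overlap.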
Data: SO19294342.dat
1 1 0.2
1 2 0.8
1 3 0.1
1 4 0.2

2 1 0.7
2 2 0.2
2 3 0.3
2 4 0.1

5 1 0.2
5 2 0.4
5 3 0.1
5 4 0.9

7 1 0.3
7 2 0.2
7 3 0.9
7 4 0.6
Script: (works with gnuplot>=4.6.0, March 2012)
### heatmap with boxxyerror and variable box-sizes
reset
FILE = "SO/SO19294342.dat"
set style fill solid 1.0
set tics out
set size ratio -1
# extract x-positions
Xs = Ys = ''
Nx = Ny = 0
b = -1
stats FILE u (column(-1)!=b ? (Nx=Nx+1, Xs=Xs.sprintf(" %g",$1), b=column(-1)) : 0, \
column(-1)==0 ? (Ny=Ny+1, Ys=Ys.sprintf(" %g",$2)) : 0) nooutput
d(vs,n0,n1) = abs(real(word(vs,n0))-real(word(vs,n1)))/2
dn(vs,n) = (n==1 ? (n0=1,n1=2) : (n0=n,n1=n-1), -d(vs,n0,n1))
dp(vs,n) = (Ns=words(vs), n==Ns ? (n0=Ns-1,n1=Ns) : (n0=n,n1=n+1), d(vs,n0,n1))
plot FILE u 1:2:($1+dn(Xs,column(-1)+1)):($1+dp(Xs,column(-1)+1)):\
($2+dn(Ys,int(column(0))%Ny+1)):($2+dp(Ys,int(column(0))%Ny+1)):3 w boxxy palette notitle
### end of script
For gnuplot>=4.6.5 you could add :xtic(1):ytic(2) to the plot command to show only your x- and y-coordinates as x- and y-tic labels.
plot FILE u 1:2:($1+dn(Xs,column(-1)+1)):($1+dp(Xs,column(-1)+1)):\
($2+dn(Ys,int(column(0))%Ny+1)):($2+dp(Ys,int(column(0))%Ny+1)):3:\
xtic(1):ytic(2) w boxxy palette notitle
And for gnuplot>=5.0.0 you could add noextend to the ranges to avoid white areas on the sides:
set xrange[:] noextend
set yrange[:] noextend
Result: (created with gnuplot 4.6.0)

Resources