Contour plots of noisy data - gridding and averaging - python-3.x

I am trying to make a contour plot from a dataframe in which the x and y coordinates are unevenly spaced and sometimes overlap and the z coordinate is noisy:
          x         y            z
1   15.4707  174.6779  1592.811638
2   15.4707  171.3179  1304.953183
3   61.6107  108.2379  1687.233377
4   46.3707  151.6929  1688.368690
5   30.7107  124.5429  1339.451757
6   31.1307  202.8704  1616.756963
7    0.2307  141.5029  1620.288736
8   15.4707  141.9054  1167.798302
9   46.3707   72.0729  1687.546227
10  15.4707  212.6929   638.059709
What I'd like to do is define a grid in x and y whose grid lines pass through given coordinates, say
x=[7.5, 22.5, 37.5, 52.5]
y=[60, 120, 180, 240]
In every grid section, I then take the average of the z values and make a new dataframe where the x and y columns are the centres of the grid sections and the z column is the aforementioned average. The dataframe should look something like
    x    y       z
1  15   90  1621.1
2  30  150  1444.2
3  45  210  1651.7
From this stage it is easy to get a contour plot using matplotlib's contourf or similar, but how can I do this type of gridding and averaging? Is there an elegant way to do it in pandas or another Python package?
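One possible approach, sketched here under a few assumptions: use `pd.cut` to assign each point to a grid cell, label each cell by its midpoint, and average z per cell with `groupby`. The names `x_edges`, `y_edges`, `binned`, and `gridded` are made up for this illustration, and points falling outside the outermost grid lines are simply dropped, which may or may not be what you want.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "x": [15.4707, 15.4707, 61.6107, 46.3707, 30.7107,
          31.1307, 0.2307, 15.4707, 46.3707, 15.4707],
    "y": [174.6779, 171.3179, 108.2379, 151.6929, 124.5429,
          202.8704, 141.5029, 141.9054, 72.0729, 212.6929],
    "z": [1592.811638, 1304.953183, 1687.233377, 1688.368690, 1339.451757,
          1616.756963, 1620.288736, 1167.798302, 1687.546227, 638.059709],
})

# Grid lines from the question (assumed to be the cell edges).
x_edges = np.array([7.5, 22.5, 37.5, 52.5])
y_edges = np.array([60.0, 120.0, 180.0, 240.0])

# Cell centres are the midpoints between consecutive edges.
x_centres = (x_edges[:-1] + x_edges[1:]) / 2   # 15, 30, 45
y_centres = (y_edges[:-1] + y_edges[1:]) / 2   # 90, 150, 210

# pd.cut assigns each value to a right-closed interval (a, b];
# values outside the edges become NaN.
binned = df.assign(
    x=pd.cut(df["x"], bins=x_edges, labels=x_centres),
    y=pd.cut(df["y"], bins=y_edges, labels=y_centres),
)

# Average z in every occupied cell; rows with NaN cells are dropped first.
gridded = (
    binned.dropna(subset=["x", "y"])
    .groupby(["x", "y"], observed=True)["z"]
    .mean()
    .reset_index()
)

# The cell labels are categorical; convert them back to plain floats.
gridded["x"] = gridded["x"].astype(float)
gridded["y"] = gridded["y"].astype(float)

print(gridded)
```

For contouring, `gridded.pivot(index="y", columns="x", values="z")` should give a 2D table of cell averages, with NaN in cells that contain no points; those gaps would need to be masked or filled before plotting.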

Related

Sum for each column per species

I have a dataframe like this:
Id          head  leg  wing
A melifera     2    5    25
A melifera     9    5    16
Bombus sp      2    5    19
X strenua      0   56    25
with more than 1000 observations like these.
Is it possible, with a dataset like this, to make a geom_bar plot with facet_wrap(Id) that looks like this:
[Image: facet_wrap for each species across sites]
The x axis should have head, leg, and wing as ticks, and the y axis should show the total sum for each body part across each subset, that is, each species.
Thanks!

Need to plot histogram in Pandas such that x axis is categorical and y axis is sum of some column

I have a data frame in Pandas (using Python 3.7) as shown below:
# actuals probability bucket
# 0 0.0 0.116375 2
# 1 0.0 0.239069 3
# 2 1.0 0.591988 6
# 3 0.0 0.273709 3
# 4 1.0 0.929855 10
where 'bucket' can take discrete values from 1 to 10 and 'actuals' can take only two values, 1 or 0.
I need to plot a histogram such that the x-axis is 'bucket' (i.e. 1 to 10) and the y-axis is the sum of 'actuals'. How can I do that?
Use groupby.sum with plot:
df.groupby('bucket')['actuals'].sum().plot(kind='bar')
If you need a histogram, use kind='hist' instead.
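As a small illustration of the groupby step, here is a sketch that rebuilds the five sample rows from the question and computes the per-bucket sums. The `reindex` call is an optional extra (not part of the answer above) that fills in buckets with no rows, so that all of 1 to 10 would appear on the x axis.

```python
import pandas as pd

# The five sample rows shown in the question.
df = pd.DataFrame({
    "actuals": [0.0, 0.0, 1.0, 0.0, 1.0],
    "probability": [0.116375, 0.239069, 0.591988, 0.273709, 0.929855],
    "bucket": [2, 3, 6, 3, 10],
})

# Sum of 'actuals' within each bucket that occurs in the data.
sums = df.groupby("bucket")["actuals"].sum()

# Optional: include every bucket from 1 to 10, with 0 for empty ones.
all_buckets = sums.reindex(range(1, 11), fill_value=0.0)

print(all_buckets)
```

Calling `all_buckets.plot(kind='bar')` would then draw one bar per bucket.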

How to plot some data that is in two scales in gnuplot

I want to plot some data that is in two scales:
for 1 < X < 20, Y is between 0 and 1, and
for 20 < X < 100, Y is between 1 and 20.
A normal plot gives this result:
As you can see, the curve between 0 and 20 is hidden!
How can I solve this?
Use a logarithmic scale
set logscale y

How to build a scatter graph in excel with average y value for each x value

I am not sure this is the best place to ask,
but I have summarized my program's performance data in an Excel file and I want to build a scatter graph.
For each x value I have 6 y values, and I want my graph to show the average of those 6 for each x.
Is there a way to do this in Excel?
For example: I have
X Y
1 0.2
1 0
1 0
1 0.8
1 1.4
1 0
2 0.2
2 1.2
2 1
2 2.2
2 0
2 2.2
3 0.8
3 1.6
3 0
3 3.6
3 1.2
3 0.6
For each x I want my graph to contain the average y.
Thanks
Not certain what you want but suggest inserting a column (assumed to be B) immediately between your two existing ones and populating it with:
=AVERAGEIF(A:A,A2,C:C)
then plotting X against those values.
Or maybe better, just subtotal for each change in X with average for Y and plot that.

GNUPlot matrix plot with changing distance between lines

In GNUPlot you can make 3d plots based on a .dat file with a matrix notation:
#Y 0.1 0.2 0.3 0.4
0 1 4 9 #X = 1
1 2 5 10 #X = 2
4 5 8 13 #X = 3
9 10 13 18 #X = 5
16 17 20 25 #X = 7
25 26 29 34 #X = 10
However, the file I want to plot has varying X distances between the lines, as shown in the comments. One can use set xtics, but that only changes the labels on the plot, whereas the points should be positioned on a linear axis.
Is there a way to do this?
No, not with this type of matrix notation. You would have to use a format as described here: http://t16web.lanl.gov/Kawano/gnuplot/datafile-e.html#3dim
The matrix format assumes an even spacing between x and y points, but the 3D data format allows arbitrary positioning of all the points.
