How to project certain values from the graph on the axis in Python? - python-3.x

I am trying to plot a normal distribution curve in Python using matplotlib. I followed the accepted answer in the post python pylab plot normal distribution in order to generate the graph.
I would like to know if there is a way of projecting the mu - 3*sigma, mu + 3*sigma and the mean values on both the x-axis and y-axis.
Thanks
EDIT 1
Image for explaining projection
example_image.
In the image, I am trying to project the mean value on x and y-axis. I would like to know if there is a way I can achieve this along with obtaining the values (the blue circles on x and y-axis) on x and y-axis.

The following script shows how to achieve what you want:
import matplotlib.pyplot as plt
import numpy as np
import scipy.stats as stats
mu = 2
variance = 9
sigma = np.sqrt(variance)
x = np.linspace(mu - 3*sigma, mu + 3*sigma, 500)
y = stats.norm.pdf(x, mu, sigma)
fig, ax = plt.subplots()
ax.plot(x, y)
ax.set_xlim([min(x), max(x)])
ax.set_ylim([min(y), max(y)+0.02])
ax.hlines(y=max(y), xmin=min(x), xmax=mu, color='r')
ax.vlines(x=mu, ymin=min(y), ymax=max(y), color='r')
plt.show()
The produced plot is
For a normal distribution the peak sits at the mean, so the vertical line meets the x-axis at mu. The horizontal line meets the y-axis at the peak density, which is max(y) in the code (analytically, 1/(sigma*sqrt(2*pi))).
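The same idea extends to the mu - 3*sigma and mu + 3*sigma points asked about in the question. The sketch below is one possible way to do it and is not part of the original answer: it evaluates the density at each point with stats.norm.pdf and draws dashed guide lines to both axes. The wider x range (4 sigma), the dashed style, and the blue markers are illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove for an interactive window
import matplotlib.pyplot as plt
import numpy as np
import scipy.stats as stats

mu = 2
sigma = 3
# Plot out to 4 sigma (an arbitrary choice) so the 3-sigma points are not on the border.
x = np.linspace(mu - 4*sigma, mu + 4*sigma, 500)
y = stats.norm.pdf(x, mu, sigma)

fig, ax = plt.subplots()
ax.plot(x, y)
ax.set_xlim(x[0], x[-1])
ax.set_ylim(0, 1.1 * y.max())

# Points to project, and the density evaluated at each of them.
points = np.array([mu - 3*sigma, mu, mu + 3*sigma])
heights = stats.norm.pdf(points, mu, sigma)

for px, py in zip(points, heights):
    # Vertical guide down to the x-axis, horizontal guide across to the y-axis.
    ax.vlines(x=px, ymin=0, ymax=py, color='r', linestyles='--')
    ax.hlines(y=py, xmin=x[0], xmax=px, color='r', linestyles='--')
    # Blue circles mark the projected values on the x-axis and y-axis.
    ax.plot([px, x[0]], [0, py], 'bo', clip_on=False)

# plt.show() displays the figure when an interactive backend is in use.
```

The projected values themselves are in points (x-axis) and heights (y-axis), so they can be printed or used as tick positions. Because the density is symmetric about mu, the two 3-sigma points share one horizontal guide.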

Related

Forcing colorbar ticks at min/max values

I am plotting with the contourf function from matplotlib and would like to add a colorbar. I've noticed that the ticks sometimes don't extend to the max/min values.
Is there a clean way to force it to set ticks at these values?
Note: Checking the max and min of z shows that the colorbar represents values from approximately -1 to 1. I would therefore expect the colorbar to reflect this, so that the full range is visible, with some ticks in between.
Plot and code demonstrating what I am talking about:
import matplotlib.pyplot as plt
import numpy as np
# Data to plot.
x, y = np.meshgrid(np.arange(7), np.arange(10))
z = np.sin(0.5 * x) * np.cos(0.52 * y)
fig, ax = plt.subplots()
cs = ax.contourf(x, y, z, levels=25)
ax.grid(c="k", ls="-", alpha=0.3)
fig.colorbar(cs, ax=ax)
fig.savefig("example.png", bbox_inches="tight")
The cleanest way seems to be to give explicit levels to contourf. If no explicit levels are given, contourf seems to choose its own, depending on the minimum and maximum value in the data, and also tries to find "nice looking" numbers. After that, ticks get set to a subset of these numbers, such that a tick always coincides with a real level. (If you use colorbar(..., ticks=...) those ticks will not necessarily coincide with the levels.)
Because the sine and cosine don't reach exactly -1 and 1 in the given example, those values are not part of the range.
The following code shows how the ticks depend on the chosen levels. With np.linspace(-1, 1, 24) the levels aren't nice round numbers, but matplotlib still chooses a subset to show.
import matplotlib.pyplot as plt
import numpy as np
x, y = np.meshgrid(np.arange(7), np.arange(10))
z = np.sin(0.5 * x) * np.cos(0.52 * y)
fig, (ax1, ax2) = plt.subplots(ncols=2, figsize=(12, 3))
for ax in (ax1, ax2):
    numcontours = 25 if ax == ax1 else 24
    cs = ax.contourf(x, y, z, levels=np.linspace(-1, 1, numcontours))
    ax.grid(c="k", ls="-", alpha=0.3)
    fig.colorbar(cs, ax=ax)
    ax.set_title(f'{numcontours} levels from -1 to 1')
plt.show()
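If the goal is ticks at the actual data minimum and maximum rather than at -1 and 1, one possible variation (a sketch, not from the original answer) is to build the levels from z.min() and z.max() and pass a subset of those levels as ticks. The choice of 25 levels and of every 6th level as a tick is an assumption made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove for an interactive window
import matplotlib.pyplot as plt
import numpy as np

x, y = np.meshgrid(np.arange(7), np.arange(10))
z = np.sin(0.5 * x) * np.cos(0.52 * y)

# Levels span exactly the data range, so the outermost levels are z.min() and z.max().
levels = np.linspace(z.min(), z.max(), 25)
# Every 6th level (indices 0, 6, 12, 18, 24) becomes a tick, so each tick coincides with a level.
ticks = levels[::6]

fig, ax = plt.subplots()
cs = ax.contourf(x, y, z, levels=levels)
cbar = fig.colorbar(cs, ax=ax, ticks=ticks)

# plt.show() displays the figure when an interactive backend is in use.
```

The trade-off is that the tick labels are no longer round numbers, since they follow the data range instead of "nice" values.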

What kind of plot from matplotlib should I use?

I am programming in Python 3 and I have data structured like this:
coordinates = [(0.15,0.25),(0.35,0.25),(0.55,0.45),(0.65,0.10),(0.15,0.25)]
These are coordinates. Within each pair, the first number is the x coordinate and the second one the y coordinate. Some of the coordinates repeat themselves. I want to plot these data like this:
The coordinates that are most frequently found should appear either as higher intensity (i.e., brighter) points or as points with a different color (for example, red for very frequent coordinates and blue for very infrequent coordinates). Don't worry about the circle and semicircle. That's irrelevant. Is there a matplotlib plot that can do this? Scatter plots do not work because they do not report on the frequency with which each coordinate is found. They just create a cloud.
One way is to draw a 2D kernel density estimate of the points with pcolormesh (random sample data stand in for the real coordinates here):
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde
import numpy as np
xvalues = np.random.normal(loc=0.5,scale=0.01,size=50000)
yvalues = np.random.normal(loc=0.25,scale=0.1,size=50000)
nbins=300
k = gaussian_kde([xvalues,yvalues])
xi, yi = np.mgrid[0:1:nbins*1j,0:1:nbins*1j]
zi = k(np.vstack([xi.flatten(),yi.flatten()]))
fig, ax = plt.subplots()
ax.pcolormesh(xi, yi, zi.reshape(xi.shape), shading='auto', cmap=plt.cm.hot)
x = np.arange(0.0,1.01,0.01,dtype=np.float64)
y = np.sqrt((0.5*0.5)-((x-0.5)*(x-0.5)))
ax.axis([0,1,0,0.55])
ax.set_ylabel('S', fontsize=16)
ax.set_xlabel('G', fontsize=16)
ax.tick_params(labelsize=12, width=3)
ax.plot(x,y,'w--')
plt.show()
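For a short list of exact, repeated coordinates like the one in the question, a lighter alternative is to count the duplicates and color a scatter plot by that count. This is a sketch under the assumption that repeated points are exactly equal as floats. The "coolwarm" colormap (blue for rare, red for frequent) and the marker size are illustrative choices.

```python
import collections
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove for an interactive window
import matplotlib.pyplot as plt

coordinates = [(0.15, 0.25), (0.35, 0.25), (0.55, 0.45), (0.65, 0.10), (0.15, 0.25)]

# Count how often each (x, y) pair occurs.
counts = collections.Counter(coordinates)
xs, ys = zip(*counts.keys())
freq = list(counts.values())

fig, ax = plt.subplots()
# One marker per distinct coordinate, colored by its number of occurrences.
sc = ax.scatter(xs, ys, c=freq, cmap="coolwarm", s=80)
fig.colorbar(sc, ax=ax, label="occurrences")

# plt.show() displays the figure when an interactive backend is in use.
```

With noisy measurements that never repeat exactly, the density estimate above (or a 2D histogram) is the better fit, because exact counting would give every point a frequency of 1.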

Python: how to create a smoothed version of a 2D binned "color map"?

I would like to create a version of this 2D binned "color map" with smoothed colors.
I am not even sure this would be the correct nomenclature for the plot, but, essentially, I want my figure to be color coded by the median values of a third variable for points that reside in each defined bin of my (X, Y) space.
Even though I am able to accomplish that to a certain degree (see example), I would like to find a way to create a version of the same plot with a smoothed color gradient. That would allow me to visualize the overall behavior of my distribution.
I tried ideas described here: Smoothing 2D map in python
and here: Python: binned_statistic_2d mean calculation ignoring NaNs in data
as well as links therein, but could not find a clear solution to the problem.
This is what I have so far:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
from scipy.stats import binned_statistic_2d
np.random.seed(999)  # seed NumPy's generator, which draws the data below
x = np.random.normal (0,10,5000)
y = np.random.normal (0,10,5000)
z = np.random.uniform(0,10,5000)
fig = plt.figure(figsize=(20, 20))
plt.rcParams.update({'font.size': 10})
ax = fig.add_subplot(3,3,1)
ax.set_axisbelow(True)
plt.grid(True, lw=0.5, zorder=-1)
x_bins = np.arange(-50., 50.5, 1.)
y_bins = np.arange(-50., 50.5, 1.)
cmap = plt.get_cmap('jet_r',1000) #just a colormap
ret = binned_statistic_2d(x, y, z, statistic=np.median, bins=[x_bins, y_bins]) # Bin (X, Y) and create a map of the medians of "Colors"
plt.imshow(ret.statistic.T, origin='lower', extent=(-50, 50, -50, 50), cmap=cmap)
plt.xlim(-40,40)
plt.ylim(-40,40)
plt.xlabel("X", fontsize=15)
plt.ylabel("Y", fontsize=15)
ax.set_yticks([-40,-30,-20,-10,0,10,20,30,40])
bounds = np.arange(2.0, 20.0, 1.0)
plt.colorbar(ticks=bounds, label="Color", fraction=0.046, pad=0.04)
# save plots
plt.savefig("Whatever_name.png", bbox_inches='tight')
Which produces the following image (from random data):
Therefore, the simple question would be: how to smooth these colors?
Thanks in advance!
PS: sorry for excessive coding, but I believe a clear visualization is crucial for this particular problem.
Thanks to everyone who viewed this issue and tried to help!
I ended up solving my own problem. In the end, it was all about image smoothing with a Gaussian kernel.
This link: Gaussian filtering a image with Nan in Python gave me the insight for the solution.
I implemented essentially the same code, then restored the NaN pixels of the original 2D array in the smoothed result. Unlike the linked solution, my version does NOT leave NaN pixels filled with values derived from their neighbours: the filter fills them in, and I then reset them to NaN.
Here is the final figure produced for the example I provided:
The final code, for anyone who might need it in the future:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
from scipy.stats import binned_statistic_2d
import scipy.stats as st
import scipy.ndimage
import scipy as sp
np.random.seed(999)  # seed NumPy's generator, which draws the data below
x = np.random.normal (0,10,5000)
y = np.random.normal (0,10,5000)
z = np.random.uniform(0,10,5000)
fig = plt.figure(figsize=(20, 20))
plt.rcParams.update({'font.size': 10})
ax = fig.add_subplot(3,3,1)
ax.set_axisbelow(True)
plt.grid(True, lw=0.5, zorder=-1)
x_bins = np.arange(-50., 50.5, 1.)
y_bins = np.arange(-50., 50.5, 1.)
cmap = plt.get_cmap('jet_r',1000) #just a colormap
ret = binned_statistic_2d(x, y, z, statistic=np.median, bins=[x_bins, y_bins]) # Bin (X, Y) and create a map of the medians of "Colors"
sigma=1 # standard deviation for Gaussian kernel
truncate=5.0 # truncate filter at this many sigmas
U = ret.statistic.T.copy()
V=U.copy()
V[np.isnan(U)]=0
VV=sp.ndimage.gaussian_filter(V,sigma=sigma,truncate=truncate)
W=0*U.copy()+1
W[np.isnan(U)]=0
WW=sp.ndimage.gaussian_filter(W,sigma=sigma,truncate=truncate)
np.seterr(divide='ignore', invalid='ignore')
Z=VV/WW
Z[np.isnan(U)] = np.nan  # restore the NaN pixels of the original map
plt.imshow(Z, origin='lower', extent=(-50, 50, -50, 50), cmap=cmap)
plt.xlim(-40,40)
plt.ylim(-40,40)
plt.xlabel("X", fontsize=15)
plt.ylabel("Y", fontsize=15)
ax.set_yticks([-40,-30,-20,-10,0,10,20,30,40])
bounds = np.arange(2.0, 20.0, 1.0)
plt.colorbar(ticks=bounds, label="Color", fraction=0.046, pad=0.04)
# save plots
plt.savefig("Whatever_name.png", bbox_inches='tight')
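The NaN-aware smoothing step above can be condensed into a small helper. This is a minimal sketch of the same technique (filter the zero-filled data and a validity mask separately, divide, then restore the NaNs). The function name nan_gaussian_filter and the 5x5 test array are made up for illustration.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def nan_gaussian_filter(U, sigma=1.0, truncate=4.0):
    """Gaussian-smooth U while ignoring NaN pixels; NaN pixels stay NaN."""
    mask = np.isnan(U)
    V = np.where(mask, 0.0, U)      # data with NaNs replaced by 0
    W = (~mask).astype(float)       # 1 where data is valid, 0 where it was NaN
    VV = gaussian_filter(V, sigma=sigma, truncate=truncate)
    WW = gaussian_filter(W, sigma=sigma, truncate=truncate)
    # Dividing by the smoothed mask renormalizes the kernel over valid pixels only.
    with np.errstate(divide="ignore", invalid="ignore"):
        Z = VV / WW
    Z[mask] = np.nan                # restore the originally missing pixels
    return Z

# Hypothetical check: a constant map with one missing pixel stays constant.
U = np.full((5, 5), 3.0)
U[2, 2] = np.nan
Z = nan_gaussian_filter(U)
```

Using np.errstate as a context manager keeps the warning suppression local, whereas np.seterr changes it for the rest of the program.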

Matplotlib: How to copy a contour plot to another figure?

I have a figure with many different plots (contour plots and lots of other stuff). I want to extract the contour plot to a separate figure to see more detail, but I can't work out how to do so.
Have a look at this code:
import numpy as np
from matplotlib import gridspec as gs, pyplot as plt
# Figure 1 with many different plots.
fig1 = plt.figure()
gridSpec = gs.GridSpec(2, 3)
for i in range(6):
    fig1.add_subplot(gridSpec[i])
# Create contour plot
x = np.arange(-3.0, 3.0, 0.02)
y = np.arange(-2.0, 2.0, 0.01)
X, Y = np.meshgrid(x, y)
Z1 = np.exp(-X**2 - Y**2)
Z2 = np.exp(-(X - 1)**2 - (Y - 1)**2)
Z = (Z1 - Z2) ** 4
# Plot it to a particular axes.
ax1 = fig1.axes[2]
contour = ax1.contour(X, Y, Z)
# Try to copy the contour plot to another figure (with only 1 subplot).
fig2, ax2 = plt.subplots()
# How to copy the content of ax1 to ax2?
plt.show()
This will give me the following:
I want to create a second figure with only 1 subplot and its content should be the same as you can see in top right corner of the first figure with 6 subplots.
First thing I tried was
ax2.add_collection(contour.collections[1])
but I got the error message
RuntimeError: Can not put single artist in more than one figure
This is because the content is already plotted to figure 1, so it is not possible to plot it to figure 2 as well. So I tried to make a copy of the contour plot:
from copy import deepcopy
ax2.add_collection(deepcopy(contour.collections[1]))
But this gives me a new error saying that copying is not possible ...
NotImplementedError: TransformNode instances can not be copied. Consider using frozen() instead.
So .. what can I do? Any ideas for that problem? :)
Thanks a lot!
(Python 3.7.4, Matplotlib 3.1.1)
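No answer is recorded here. Since artists cannot be shared between figures or deep-copied, one possible workaround (a sketch, not a confirmed solution from the thread) is to call contour again on the new axes with the same data, reusing the levels of the first plot so both figures show the same contour lines. It assumes X, Y and Z are still available when the second figure is made.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove for an interactive window
import matplotlib.pyplot as plt
import numpy as np

# Same data as in the question.
x = np.arange(-3.0, 3.0, 0.02)
y = np.arange(-2.0, 2.0, 0.01)
X, Y = np.meshgrid(x, y)
Z = (np.exp(-X**2 - Y**2) - np.exp(-(X - 1)**2 - (Y - 1)**2)) ** 4

# Figure 1: a 2x3 grid with the contour plot in the top right corner.
fig1, axes = plt.subplots(2, 3)
ax1 = axes.flat[2]
contour = ax1.contour(X, Y, Z)

# Figure 2: redraw from the data instead of copying artists.
fig2, ax2 = plt.subplots()
contour2 = ax2.contour(X, Y, Z, levels=contour.levels)
ax2.set_xlim(ax1.get_xlim())
ax2.set_ylim(ax1.get_ylim())

# plt.show() displays both figures when an interactive backend is in use.
```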

Using python and networkx to find the probability density function

I'm struggling to draw a power law graph for Facebook Data that I found online. I'm using Networkx and I've found how to draw a Degree Histogram and a degree rank. The problem that I'm having is I want the y axis to be a probability so I'm assuming I need to sum up each y value and divide by the total number of nodes? Can anyone please help me do this? Once I've got this I'd like to draw a log-log graph to see if I can obtain a straight line. I'd really appreciate it if anyone could help! Here's my code:
import collections
import networkx as nx
import matplotlib.pyplot as plt
from networkx.algorithms import community
import math
import pylab as plt
g = nx.read_edgelist("/Users/Michael/Desktop/anaconda3/facebook_combined.txt","r")
nx.info(g)
degree_sequence = sorted([d for n, d in g.degree()], reverse=True)
degreeCount = collections.Counter(degree_sequence)
deg, cnt = zip(*degreeCount.items())
fig, ax = plt.subplots()
plt.bar(deg, cnt, width=0.80, color='b')
plt.title("Degree Histogram for Facebook Data")
plt.ylabel("Count")
plt.xlabel("Degree")
ax.set_xticks([d + 0.4 for d in deg])
ax.set_xticklabels(deg)
plt.show()
plt.loglog(degree_sequence, 'b-', marker='o')
plt.title("Degree rank plot")
plt.ylabel("Degree")
plt.xlabel("Rank")
plt.show()
You seem to be on the right track, but some simplifications will likely help you. The code below uses only 2 libraries.
Without access to your graph, we can use some graph generators instead. I've chosen 2 qualitatively different types here, and deliberately chosen different sizes so that the normalization of the histogram is needed.
import networkx as nx
import matplotlib.pyplot as plt
g1 = nx.scale_free_graph(1000)
g2 = nx.watts_strogatz_graph(2000, 6, p=0.8)
# we don't need to sort the values since the histogram will handle it for us
# (g.degree() yields (node, degree) pairs in networkx 2.x and later)
deg_g1 = [d for n, d in g1.degree()]
deg_g2 = [d for n, d in g2.degree()]
# there are smarter ways to choose bin locations, but since
# degrees must be discrete, we can be lazy...
max_degree = max(deg_g1 + deg_g2)
# bin edges run to max_degree + 1 so the largest degree gets its own bin
bins = range(0, max_degree + 2)
# plot different styles to see both
fig = plt.figure()
ax = fig.add_subplot(111)
ax.hist(deg_g1, bins=bins, density=True, histtype='bar', rwidth=0.8, label='g1')
ax.hist(deg_g2, bins=bins, density=True, histtype='step', lw=3, label='g2')
# setup the axes to be log/log scaled
ax.set_yscale('log')
ax.set_xscale('log')
ax.set_xlabel('degree')
ax.set_ylabel('relative density')
ax.legend()
plt.show()
This produces an output plot like this (both g1,g2 are randomised so won't be identical):
Here we can see that g1 has an approximately straight line decay in the degree distribution -- as expected for scale-free distributions on log-log axes. Conversely, g2 does not have a scale-free degree distribution.
To say anything more formal, you could look at the toolboxes from Aaron Clauset: http://tuvalu.santafe.edu/~aaronc/powerlaws/ which implement model fitting and statistical testing of power-law distributions.
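The normalization the question asks about (count per degree divided by the total number of nodes) can also be done by hand. The sketch below uses a tiny made-up degree sequence so the numbers are easy to check. For a real networkx graph g, the sequence would come from [d for n, d in g.degree()], as in the question's code.

```python
import collections
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove for an interactive window
import matplotlib.pyplot as plt

# Stand-in degree sequence (hypothetical data, one entry per node).
degree_sequence = [1, 1, 1, 1, 2, 2, 3, 5]

# Count nodes per degree, then divide by the number of nodes to get P(k).
degree_count = collections.Counter(degree_sequence)
n_nodes = len(degree_sequence)
deg = sorted(degree_count)
prob = [degree_count[k] / n_nodes for k in deg]

# Log-log plot of the empirical degree distribution.
fig, ax = plt.subplots()
ax.loglog(deg, prob, 'bo')
ax.set_xlabel("Degree k")
ax.set_ylabel("P(k)")

# plt.show() displays the figure when an interactive backend is in use.
```

Here prob sums to 1 by construction. A roughly straight line on the log-log axes is only suggestive of a power law, which is why the formal fitting tools mentioned above are worth a look.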
