Adjust hspace one-sided for matplotlib subplots - python-3.x

My question is based on this question:
Adjust hspace for some of the subplots
That question adjusts the top plot of a stack of subplots and increases its hspace. I want to increase the hspace between two plots within the stack (in my case, between plot 3 and plot 4 from the top).
Here is my example:
import numpy as np
import matplotlib.pyplot as plt
noise = np.random.rand(300)
gs_top = plt.GridSpec(9, 1, hspace=0.5)
gs_base = plt.GridSpec(9, 1, hspace=0)
fig = plt.figure()
fig.patch.set_facecolor('white')
ax0 = fig.add_subplot(gs_base[0,:])
ax1 = fig.add_subplot(gs_base[1,:])
ax2 = fig.add_subplot(gs_top[2,:])
ax3 = fig.add_subplot(gs_base[3,:])
ax4 = fig.add_subplot(gs_base[4,:])
ax5 = fig.add_subplot(gs_base[5,:])
ax0.plot(noise)
ax1.plot(noise)
ax2.plot(noise)
ax3.plot(noise)
ax4.plot(noise)
ax5.plot(noise)
In the example it is shown that the hspace increases between plot 3 and 4. However, I don't want to increase the space between plot 2 and plot 3.
How can I adjust the hspace variable only on one side?

I found the answer after searching Google with various word combinations: Stackoverflow answer
In short (dirty way):
Add a separate axis and make it invisible.
Example:
import numpy as np
import matplotlib.pyplot as plt
noise = np.random.rand(300)
gs_base = plt.GridSpec(7, 1, hspace=0, height_ratios=[1, 1, 1, 0.8, 1,1,1])
fig = plt.figure()
fig.patch.set_facecolor('white')
ax0 = fig.add_subplot(gs_base[0,:])
ax1 = fig.add_subplot(gs_base[1,:])
ax2 = fig.add_subplot(gs_base[2,:])
ax3 = fig.add_subplot(gs_base[3,:])
ax3.set_visible(False)
ax4 = fig.add_subplot(gs_base[4,:])
ax5 = fig.add_subplot(gs_base[5,:])
ax6 = fig.add_subplot(gs_base[6,:])
ax0.plot(noise)
ax1.plot(noise)
ax2.plot(noise)
ax4.plot(noise)
ax5.plot(noise)
ax6.plot(noise)
In long (correct way):
Couldn't figure it out for the moment.
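A possibly cleaner alternative, offered as a sketch rather than a definitive answer: nest two gap-free grids inside an outer two-row grid, so that only the outer hspace produces a gap. This assumes Matplotlib 3.1 or newer (for fig.add_gridspec and SubplotSpec.subgridspec). The 3+3 split, the value hspace=0.2, and the helper vertical_gaps are illustrative choices and do not come from the linked answer.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for interactive use
import numpy as np
import matplotlib.pyplot as plt

noise = np.random.rand(300)

fig = plt.figure()
fig.patch.set_facecolor('white')

# Outer grid: two blocks separated by a single gap (hspace applies only here).
outer = fig.add_gridspec(2, 1, hspace=0.2)

# Inner grids: three rows each, with no space between the rows.
top = outer[0].subgridspec(3, 1, hspace=0)
bottom = outer[1].subgridspec(3, 1, hspace=0)

axes = [fig.add_subplot(top[i]) for i in range(3)]
axes += [fig.add_subplot(bottom[i]) for i in range(3)]
for ax in axes:
    ax.plot(noise)


def vertical_gaps(axes):
    """Return the vertical gap (figure fraction) between consecutive axes, top to bottom."""
    boxes = [ax.get_position() for ax in axes]
    return [upper.y0 - lower.y1 for upper, lower in zip(boxes, boxes[1:])]


gaps = vertical_gaps(axes)
```

Only the gap between the third and fourth plot should be non-zero here. In an interactive session, remove the matplotlib.use("Agg") line and call plt.show() at the end.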

Related

Is there a library that will help me fit data easily? I found fitter and I will provide the code, but it shows some errors

So, here is my code:
import pandas as pd
import scipy.stats as st
import matplotlib.pyplot as plt
from matplotlib.ticker import AutoMinorLocator
from fitter import Fitter, get_common_distributions
df = pd.read_csv("project3.csv")
bins = [282.33, 594.33, 906.33, 1281.33, 15030.33, 1842.33, 2154.33, 2466.33, 2778.33, 3090.33, 3402.33]
#declaring
facecolor = '#EAEAEA'
color_bars = '#3475D0'
txt_color1 = '#252525'
txt_color2 = '#004C74'
fig, ax = plt.subplots(1, figsize=(16, 6), facecolor=facecolor)
ax.set_facecolor(facecolor)
n, bins, patches = plt.hist(df.City1, color=color_bars, bins=10)
#grid
minor_locator = AutoMinorLocator(2)
plt.gca().xaxis.set_minor_locator(minor_locator)
plt.grid(which='minor', color=facecolor, lw = 0.5)
xticks = [(bins[idx+1] + value)/2 for idx, value in enumerate(bins[:-1])]
xticks_labels = [ "{:.0f}-{:.0f}".format(value, bins[idx+1]) for idx, value in enumerate(bins[:-1])]
plt.xticks(xticks, labels=xticks_labels, c=txt_color1, fontsize=13)
#beautify
ax.tick_params(axis='x', which='both',length=0)
plt.yticks([])
ax.spines['bottom'].set_visible(False)
ax.spines['left'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.spines['top'].set_visible(False)
for idx, value in enumerate(n):
    if value > 0:
        plt.text(xticks[idx], value+5, int(value), ha='center', fontsize=16, c=txt_color1)
plt.title('Histogram of rainfall in City1\n', loc = 'right', fontsize = 20, c=txt_color1)
plt.xlabel('\nCentimeters of rainfall', c=txt_color2, fontsize=14)
plt.ylabel('Frequency of occurrence', c=txt_color2, fontsize=14)
plt.tight_layout()
#plt.savefig('City1_Raw.png', facecolor=facecolor)
plt.show()
city1 = df['City1'].values
f = Fitter(city1, distributions=get_common_distributions())
f.fit()
fig = f.plot_pdf(names=None, Nbest=4, lw=1, method='sumsquare_error')
plt.show()
print(f.get_best(method = 'sumsquare_error'))
The issue is with the plots it shows. The first histogram it generates is [image not shown].
Next I get another graph with the best fitted distributions [image not shown].
Then an output statement:
{'chi2': {'df': 10.692966790090342, 'loc': 16.690849400411103, 'scale': 118.71595997157786}}
Process finished with exit code 0
I have a couple of questions. Why is chi2, the best fitted distribution, not plotted on the graph?
How do I plot these distributions on top of the histogram and not separately? The hist() function in the fitter library can do that, but there I don't get to control the bins, so I end up with about 100 bins and some flat-looking data.
How do I solve this issue? I need to plot the best fit curve on a histogram that looks like the first image. Can I use any other module/package to get the work done in a similar way? This uses a least squares fit, but I am OK with maximum likelihood or log likelihood too.
Simple way of plotting things on top of each other (using some properties of the Fitter class)
import scipy.stats as st
import matplotlib.pyplot as plt
from fitter import Fitter, get_common_distributions
from scipy import stats
numberofpoints=50000
df = stats.norm.rvs( loc=1090, scale=500, size=numberofpoints)
fig, ax = plt.subplots(1, figsize=(16, 6))
n, bins, patches = ax.hist( df, bins=30, density=True)
f = Fitter(df, distributions=get_common_distributions())
f.fit()
errorlist = sorted(
    [
        [f._fitted_errors[dist], dist]
        for dist in get_common_distributions()
    ]
)[:4]
for err, dist in errorlist:
    ax.plot( f.x, f.fitted_pdf[dist] )
plt.show()
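Because the histogram is normalized (density=True), the fitted PDFs can be drawn on it directly. To generalize to a histogram of raw counts, one would need to rescale the PDFs (by the number of samples times the bin width).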
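To make the scaling concrete, here is a minimal sketch that overlays a fitted curve on a histogram of raw counts. It uses scipy.stats.norm.fit directly instead of fitter, and the synthetic data, the seed, and the names bin_width and scaled_pdf are assumptions made for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for interactive use
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

# Synthetic stand-in for the real data (assumption: roughly normal values).
rng = np.random.default_rng(0)
data = rng.normal(loc=1090, scale=500, size=5000)

fig, ax = plt.subplots(figsize=(16, 6))

# Histogram of raw counts (no density=True), with full control over the bins.
counts, edges, patches = ax.hist(data, bins=30)
bin_width = edges[1] - edges[0]

# Maximum likelihood fit of a normal distribution.
loc, scale = stats.norm.fit(data)

# A PDF integrates to 1, so multiply by (number of samples * bin width)
# to put it on the same scale as the counts.
x = np.linspace(edges[0], edges[-1], 200)
scaled_pdf = stats.norm.pdf(x, loc, scale) * data.size * bin_width
ax.plot(x, scaled_pdf)
```

The same factor should apply to any fitted PDF, including the ones stored in f.fitted_pdf, provided the bins all have equal width.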

Matplotlib - maintain plot size of uneven subplots

I've been creating uneven subplots in matplotlib based on this question. The gridspec solution (third answer) worked a little better for me as it gives a bit more flexibility for the exact sizes of the subplots.
When I add a plot of a 2D array with imshow() the affected subplot is resized to the shape of the array. Is there any way to avoid that and keep the subplot-sizes (or rather aspect-ratio) fixed?
Here's the example code and the resulting image with the subplot-sizes I'm happy with:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import gridspec
# generate data
x = np.arange(0, 10, 0.2)
y = np.sin(x)
# plot
fig = plt.figure(figsize=(12, 9))
gs = gridspec.GridSpec(20, 20)
ax1 = fig.add_subplot(gs[0:5,0:11])
ax1.plot(x, y)
ax2 = fig.add_subplot(gs[6:11,0:11])
ax2.plot(y, x)
ax3 = fig.add_subplot(gs[12:20,0:11])
ax3.plot(y, x)
ax4 = fig.add_subplot(gs[0:9,13:20])
ax4.plot(x, y)
ax5 = fig.add_subplot(gs[11:20,13:20])
ax5.plot(y, x)
plt.show()
This is what happens if I additionally plot data from a 2D array with the following lines (insert before plt.show):
data2d = np.arange(0, 10, 0.1).reshape(10, 10)
im = ax3.imshow(data2d, cmap='rainbow')
How can I restore the original size of the subplot from ax3 (lower left corner)?
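Including the line ax3.set_aspect('auto') seems to have solved the issue. By default imshow() sets the aspect to 'equal', which shrinks the axes so that the array's pixels stay square; 'auto' lets the image stretch to fill the original subplot instead.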
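As a quick check of that fix, the following sketch passes aspect='auto' directly to imshow() and compares the subplot position before and after drawing. Only the ax3 cell from the question is reproduced; the names before and after are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for interactive use
import numpy as np
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(12, 9))
gs = fig.add_gridspec(20, 20)
ax3 = fig.add_subplot(gs[12:20, 0:11])

# Position assigned by the gridspec, as (x0, y0, width, height) in figure fractions.
before = ax3.get_position().bounds

data2d = np.arange(0, 10, 0.1).reshape(10, 10)
# aspect='auto' has the same effect as calling ax3.set_aspect('auto') afterwards.
im = ax3.imshow(data2d, cmap='rainbow', aspect='auto')

# The aspect is applied at draw time, so draw before reading the position again.
fig.canvas.draw()
after = ax3.get_position().bounds
```

With aspect='auto' the two positions should match; with the default 'equal' the drawn axes would be narrower than the gridspec cell.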

How to have a fast crosshair mouse cursor for subplots in matplotlib?

In this video of backtrader's matplotlib implementation https://youtu.be/m6b4Ti4P2HA?t=2008 I can see that a default and very fast and CPU saving crosshair mouse cursor seems to exist in matplotlib.
I would like to have the same kind of mouse cursor for a simple multi subplot plot in matplotlib like this:
import numpy as np
import matplotlib
matplotlib.use('QT5Agg')
matplotlib.rcParams['figure.figsize'] = (20.0, 22.0)
import matplotlib.pyplot as plt
fig = plt.figure()
ax1 = plt.subplot(2, 1, 1)
ax2 = plt.subplot(2, 1, 2, sharex=ax1)
ax1.plot(np.array(np.random.rand(100)))
ax2.plot(np.array(np.random.rand(100)))
plt.show()
So, if I am with my mouse in the lower subplot, I want to see directly and very precisely, which value of x/y in the lower plot corresponds to which value pair in the upper plot.
I have found other solutions to do this but they seem to be very slow compared to the implementation in the video.
You can create a crosshair cursor via mplcursors. sel.extras.append() takes care that the old cursor is removed when a new one is drawn. With sel.annotation.set_text you can adapt the popup annotation shown. To leave out the annotation, use sel.annotation.set_visible(False). To find the corresponding y-value in the other subplot, np.interp can be used with the data extracted from the curve.
import numpy as np
import matplotlib.pyplot as plt
import mplcursors
def crosshair(sel):
    x, y2 = sel.target
    y1 = np.interp( sel.target[0], plot1.get_xdata(), plot1.get_ydata() )
    sel.annotation.set_text(f'x: {x:.2f}\ny1: {y1:.2f}\ny2: {y2:.2f}')
    # sel.annotation.set_visible(False)
    hline1 = ax1.axhline(y1, color='k', ls=':')
    vline1 = ax1.axvline(x, color='k', ls=':')
    vline2 = ax2.axvline(x, color='k', ls=':')
    hline2 = ax2.axhline(y2, color='k', ls=':')
    sel.extras.append(hline1)
    sel.extras.append(vline1)
    sel.extras.append(hline2)
    sel.extras.append(vline2)
fig = plt.figure(figsize=(15, 10))
ax1 = plt.subplot(2, 1, 1)
ax2 = plt.subplot(2, 1, 2, sharex=ax1)
plot1, = ax1.plot(np.array(np.random.uniform(-1, 1, 100).cumsum()))
plot2, = ax2.plot(np.array(np.random.uniform(-1, 1, 100).cumsum()))
cursor = mplcursors.cursor(plot2, hover=True)
cursor.connect('add', crosshair)
plt.show()
Here is an alternative implementation that stores the data in global variables and moves the lines (instead of deleting and recreating them):
import numpy as np
import matplotlib.pyplot as plt
import mplcursors
def crosshair(sel):
    x = sel.target[0]
    y1 = np.interp(x, plot1x, plot1y)
    y2 = np.interp(x, plot2x, plot2y)
    sel.annotation.set_visible(False)
    hline1.set_ydata([y1])
    vline1.set_xdata([x])
    hline2.set_ydata([y2])
    vline2.set_xdata([x])
    hline1.set_visible(True)
    vline1.set_visible(True)
    hline2.set_visible(True)
    vline2.set_visible(True)
fig = plt.figure(figsize=(15, 10))
ax1 = plt.subplot(2, 1, 1)
ax2 = plt.subplot(2, 1, 2, sharex=ax1)
plot1, = ax1.plot(np.array(np.random.uniform(-1, 1, 100).cumsum()))
plot2, = ax2.plot(np.array(np.random.uniform(-1, 1, 100).cumsum()))
plot1x = plot1.get_xdata()
plot1y = plot1.get_ydata()
plot2x = plot2.get_xdata()
plot2y = plot2.get_ydata()
hline1 = ax1.axhline(plot1y[0], color='k', ls=':', visible=False)
vline1 = ax1.axvline(plot1x[0], color='k', ls=':', visible=False)
hline2 = ax2.axhline(plot2y[0], color='k', ls=':', visible=False)
vline2 = ax2.axvline(plot2x[0], color='k', ls=':', visible=False)
cursor = mplcursors.cursor([plot1, plot2], hover=True)
cursor.connect('add', crosshair)
plt.show()
Sorry for the late answer, but I was horrified by how much code was suggested above, when there is this one-liner in matplotlib to do a simple crosshair across different axes. It won't show your labels, but it's CPU-light.
from matplotlib.widgets import MultiCursor
cursor = MultiCursor(fig.canvas, (ax[0], ax[1]), color='r',lw=0.5, horizOn=True, vertOn=True)
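The one-liner assumes that fig and ax already exist. A self-contained sketch, assuming the two stacked subplots from the question are created with plt.subplots, might look like this:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for interactive use
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.widgets import MultiCursor

# Two stacked subplots sharing the x axis, as in the question.
fig, ax = plt.subplots(2, 1, sharex=True)
ax[0].plot(np.random.rand(100))
ax[1].plot(np.random.rand(100))

# Keep a reference to the widget; without one it can be garbage-collected
# and the crosshair stops responding.
cursor = MultiCursor(fig.canvas, (ax[0], ax[1]), color='r', lw=0.5,
                     horizOn=True, vertOn=True)
```

In an interactive session, remove the matplotlib.use("Agg") line and call plt.show() at the end to see the crosshair follow the mouse in both subplots.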

How can I add a normal distribution curve to multiple histograms?

With the following code I create four histograms:
import numpy as np
import pandas as pd
data = pd.DataFrame(np.random.normal((1, 2, 3 , 4), size=(100, 4)))
data.hist(bins=10)
I want the histograms to look like this:
I know how to make it one graph at a time, see here
But how can I do it for multiple histograms without specifying each single one? Ideally I could use 'pd.scatter_matrix'.
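Plot each histogram separately and do the fit to each histogram as in the example you linked, or take a look at the hist API example here. Essentially what should be done is
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats import norm  # stands in for mlab.normpdf, which newer Matplotlib releases no longer provide
# same sample data as in the question
data = pd.DataFrame(np.random.normal((1, 2, 3 , 4), size=(100, 4)))
fig = plt.figure()
ax1 = fig.add_subplot(221)
ax2 = fig.add_subplot(222)
ax3 = fig.add_subplot(223)
ax4 = fig.add_subplot(224)
for ax, col in zip([ax1, ax2, ax3, ax4], data.columns):
    values = data[col]
    # parameters of the normal curve, estimated from this column
    mu, sigma = values.mean(), values.std()
    # density=True normalizes the histogram (older code used normed=1)
    n, bins, patches = ax.hist(values, 50, density=True, facecolor='green', alpha=0.75)
    bincenters = 0.5*(bins[1:]+bins[:-1])
    y = norm.pdf(bincenters, mu, sigma)
    l = ax.plot(bincenters, y, 'r--', linewidth=1)
plt.show()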

MatPlotLib + GeoPandas: Plot Multiple Layers, Control Figsize

Given the shape file available here: I can produce the basic map that I need with county labels and even some points on the map (see below). The issue I'm having is that I cannot seem to control the size of the figure with figsize.
Here's what I have:
import geopandas as gpd
import matplotlib.pyplot as plt
%matplotlib inline
figsize=5,5
fig = plt.figure(figsize=(figsize),dpi=300)
shpfile=r'Y:\HQ\TH\Groups\NR\PSPD\Input\US_Counties\cb_2015_us_county_20m.shp'
c=gpd.read_file(shpfile)
c=c.loc[c['GEOID'].isin(['26161','26093','26049','26091','26075','26125','26163','26099','26115','26065'])]
c['coords'] = c['geometry'].apply(lambda x: x.representative_point().coords[:])
c['coords'] = [coords[0] for coords in c['coords']]
ax=c.plot()
#Control some attributes regarding the axis (for the plot above)
ax.spines['top'].set_visible(False);ax.spines['bottom'].set_visible(False);ax.spines['left'].set_visible(False);ax.spines['right'].set_visible(False)
ax.tick_params(axis='y',which='both',left='off',right='off',color='none',labelcolor='none')
ax.tick_params(axis='x',which='both',top='off',bottom='off',color='none',labelcolor='none')
for idx, row in c.iterrows():
    ax.annotate(s=row['NAME'], xy=row['coords'],
                horizontalalignment='center')
lat2=[42.5,42.3]
lon2=[-84,-83.5]
#Add another plot...
ax.plot(lon2,lat2,alpha=1,marker='o',linestyle='none',markeredgecolor='none',markersize=15,color='white')
plt.show()
As you can see, I opted to call the plots by the axis name because I need to control attributes of the axis, such as tick_params. I'm not sure if there is a better approach. This seems like a "no-brainer" but I can't seem to figure out why I can't control the figure size.
Thanks in advance!
I just had to do the following:
1. Use fig, ax = plt.subplots(1, 1, figsize = (figsize))
2. Use the ax=ax argument in c.plot()
import geopandas as gpd
import matplotlib.pyplot as plt
%matplotlib inline
figsize=5,5
#fig = plt.figure(figsize=(figsize),dpi=300)
#ax = fig.add_subplot(111)
fig, ax = plt.subplots(1, 1, figsize = (figsize))
shpfile=r'Y:\HQ\TH\Groups\NR\PSPD\Input\US_Counties\cb_2015_us_county_20m.shp'
c=gpd.read_file(shpfile)
c=c.loc[c['GEOID'].isin(['26161','26093','26049','26091','26075','26125','26163','26099','26115','26065'])]
c['coords'] = c['geometry'].apply(lambda x: x.representative_point().coords[:])
c['coords'] = [coords[0] for coords in c['coords']]
c.plot(ax=ax)
ax.spines['top'].set_visible(False);ax.spines['bottom'].set_visible(False);ax.spines['left'].set_visible(False);ax.spines['right'].set_visible(False)
ax.tick_params(axis='y',which='both',left=False,right=False,color='none',labelcolor='none')
ax.tick_params(axis='x',which='both',top=False,bottom=False,color='none',labelcolor='none')
for idx, row in c.iterrows():
    # the label text is passed positionally, which works across Matplotlib versions
    ax.annotate(row['NAME'], xy=row['coords'],
                horizontalalignment='center')
lat2=[42.5,42.3]
lon2=[-84,-83.5]
ax.plot(lon2,lat2,alpha=1,marker='o',linestyle='none',markeredgecolor='none',markersize=15,color='white')

Resources