How to align heights and widths of subplot axes with gridspec and matplotlib?

I am trying to use matplotlib with gridspec to create a subplot such that the axes are arranged to look similar to the figure below; the figure was taken from this unrelated question.
My attempt at recreating this axes arrangement is below. Specifically, my problem is that the axes are not properly aligned. For example, the axis object for the blue histogram is taller than the axis object for the image with various shades of green; the orange histogram seems to properly align in terms of width, but I attribute this to luck. How can I properly align these axes? Unlike the original figure, I would like to add/pad extra empty space between axes such that their borders do not intersect; the slice notation in the code below does this by adding a blank row/column. (In the interest of not making this post longer than it has to be, I did not make the figures "pretty" by playing with axis ticks and the like.)
Unlike the original picture, the axes are not perfectly aligned. Is there a way to do this without using constrained layout? By this, I mean some derivative of fig, ax = plt.subplots(constrained_layout=True)?
The MWE code to recreate my figure is below; note that there was no difference between ax.imshow(...) and ax.matshow(...).
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize
## initialize figure and axes
fig = plt.figure()
gs = fig.add_gridspec(6, 6, hspace=0.2, wspace=0.2)
ax_bottom = fig.add_subplot(gs[4:, 2:])
ax_left = fig.add_subplot(gs[:4, :2])
ax_big = fig.add_subplot(gs[:4, 2:])
## generate data
x = np.random.normal(loc=50, scale=10, size=100)
y = np.random.normal(loc=500, scale=50, size=100)
## get singular histograms
x_counts, x_edges = np.histogram(x, bins=np.arange(0, 101, 5))
y_counts, y_edges = np.histogram(y, bins=np.arange(0, 1001, 25))
x_mids = (x_edges[1:] + x_edges[:-1]) / 2
y_mids = (y_edges[1:] + y_edges[:-1]) / 2
## get meshed histogram
sample = np.array([x, y]).T
xy_counts, xy_edges = np.histogramdd(sample, bins=(x_edges, y_edges))
## subplot histogram of x
ax_bottom.bar(x_mids, x_counts,
width=np.diff(x_edges),
color='darkorange')
ax_bottom.set_xlim([x_edges[0], x_edges[-1]])
ax_bottom.set_ylim([0, np.max(x_counts)])
## subplot histogram of y
ax_left.bar(y_mids, y_counts,
width=np.diff(y_edges),
color='steelblue')
ax_left.set_xlim([y_edges[0], y_edges[-1]])
ax_left.set_ylim([0, np.max(y_counts)])
## subplot histogram of xy-mesh
ax_big.imshow(xy_counts,
cmap='Greens',
norm=Normalize(vmin=np.min(xy_counts), vmax=np.max(xy_counts)),
interpolation='nearest',
origin='upper')
plt.show()
plt.close(fig)
EDIT:
One can initialize the axes by explicitly setting width_ratios and height_ratios per row/column; this is shown below. This doesn't affect the output, but maybe I'm using it incorrectly?
## initialize figure and axes
from matplotlib import gridspec
fig = plt.figure()
gs = gridspec.GridSpec(ncols=6, nrows=6, figure=fig, width_ratios=[1]*6, height_ratios=[1]*6)
ax_bottom = fig.add_subplot(gs[4:, 2:])
ax_left = fig.add_subplot(gs[:4, :2])
ax_big = fig.add_subplot(gs[:4, 2:])

The problem is with imshow, which resizes the axes automatically to maintain a square pixel aspect.
You can prevent this by calling:
ax_big.imshow(..., aspect='auto')
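To illustrate the fix, here is a minimal sketch that reuses the gridspec layout from the question with a synthetic count array (the `counts` array and the random seed are assumptions, standing in for `xy_counts`). With `aspect='auto'`, the image axes should keep the box assigned by the gridspec, so its position can be compared with its neighbours:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
counts = rng.integers(0, 10, size=(20, 40))  # synthetic stand-in for xy_counts

fig = plt.figure()
gs = fig.add_gridspec(6, 6, hspace=0.2, wspace=0.2)
ax_bottom = fig.add_subplot(gs[4:, 2:])
ax_left = fig.add_subplot(gs[:4, :2])
ax_big = fig.add_subplot(gs[:4, 2:])

# aspect='auto' lets the image stretch to fill the axes instead of shrinking the axes
ax_big.imshow(counts, cmap="Greens", interpolation="nearest",
              origin="upper", aspect="auto")
fig.canvas.draw()

# positions in figure coordinates after layout
big, left, bottom = (ax.get_position() for ax in (ax_big, ax_left, ax_bottom))
print(big.height, left.height, big.width, bottom.width)
```

The trade-off is that pixels are no longer square, which is usually acceptable for a 2D histogram.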

Related

How to combine two geometries into one plot in Python

Question background: I am trying to draw two geometries in one plot in Python. I have made one geometry, a meshed object, as shown in the figure below. The code for it is given here.
df_1_new = pd.DataFrame()
df_1_new['X_coordinate']=pd.Series(x_new)
df_1_new['Y_coordinate']=pd.Series(y_new)
df_1_new['node_number'] = df_1_new.index
df_1_new = df_1_new[['node_number','X_coordinate','Y_coordinate']]
plt.scatter(x_new, y_new)
plt.show()
The second geometry is a circle, which I made by running the code below.
from matplotlib import pyplot as plt, patches
plt.rcParams["figure.figsize"] = [9.00, 6.50]
plt.rcParams["figure.autolayout"] = True
fig = plt.figure()
ax = fig.add_subplot()
circle1 = plt.Circle((2, 2), radius=5, fill = False)
ax.add_patch(circle1)
ax.axis('equal')
plt.show()
My question: How can I combine both geometries mentioned above in one plot? I would like to place my circle around my geometry (object). The geometry has its centroid at (2, 2), and I want to place the circle's center exactly on that centroid so that the circle surrounds the geometry. What code should I write? Kindly help me with this.
For your reference: I want my plot to look like the picture below.
You need to do all the plotting after creating the subplot and before you issue the plt.show() command, as any command after it will create a new figure.
from matplotlib import pyplot as plt, patches
plt.rcParams["figure.figsize"] = [9.00, 6.50]
plt.rcParams["figure.autolayout"] = True
fig = plt.figure()
ax = fig.add_subplot()
# other plt.scatter or plt.plot here
plt.scatter([3,4,5,6,4],[5,4,2,3,2]) # example
circle1 = plt.Circle((2, 2), radius=5, fill = False)
ax.add_patch(circle1)
ax.axis('equal')
plt.show()
To get the points inside the circle, you need to play with the circle radius and center until you get it right.
One thing you can do is place the circle at the np.median of your x and y values, so you are sure about the center position.
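The trial-and-error step can also be replaced by computing the circle from the data. The sketch below is one possible approach, not part of the original answer: it uses the median as the center, as suggested above, and takes the radius as the largest distance from that center plus a 5% margin (the margin and the sample coordinates are assumptions).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import numpy as np
import matplotlib.pyplot as plt

# hypothetical node coordinates standing in for x_new and y_new
x_new = np.array([3, 4, 5, 6, 4], dtype=float)
y_new = np.array([5, 4, 2, 3, 2], dtype=float)

# center at the median of the coordinates (np.mean would give the centroid instead)
center = (np.median(x_new), np.median(y_new))

# radius large enough to enclose every point, with an assumed 5% margin
distances = np.hypot(x_new - center[0], y_new - center[1])
radius = 1.05 * distances.max()

fig, ax = plt.subplots()
ax.scatter(x_new, y_new)
ax.add_patch(plt.Circle(center, radius=radius, fill=False))
ax.set_aspect("equal")   # keep the circle round
ax.autoscale_view()      # make sure the limits include the patch
```

If the true centroid is known, as with (2, 2) in the question, it can be passed as `center` directly and only the radius needs to be computed.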

Control marker properties in seaborn pairwise boxplot

I'm trying to plot a boxplot for two different datasets on the same plot. The x axis shows the hours in a day, while the y axis goes from 0 to 1 (let's call it Efficiency). I would like to have different markers for the means of each dataset's boxes. I use 'meanprops' in seaborn, but that changes the marker style for both datasets at the same time. I've added 2000 lines of data in the Excel file that can be downloaded here. The values might not coincide with the ones in the picture but should be enough.
Basically I want the red squares to be blue on the orange boxplot, and red on the blue boxplot. Here is what I managed to do so far:
I tried changing the meanprops by using a dictionary with the labels as keys, but it seems to enter a loop (PyCharm says Evaluating...)
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
#make sure you have your path sorted out
group1 = pd.read_excel('group1.xls')
fig, ax = plt.subplots(figsize = (20,10))
#does not work
#ax = sns.boxplot(data=group1, x='hour', y='M1_eff', hue='labels',showfliers=False, showmeans=True,\
# meanprops={"marker":{'7':"s",'8':'s'},"markerfacecolor":{'7':"white",'8':'white'},
#"markeredgecolor":{'7':"blue",'8':'red'})
#works but produces similar markers
ax = sns.boxplot(data=group1, x='hour', y='M1_eff', hue='labels',showfliers=False, showmeans=True,\
meanprops={"marker":"s","markerfacecolor":"white", "markeredgecolor":"blue"})
plt.legend(title='Groups', loc=2, bbox_to_anchor=(1, 1),borderaxespad=0.5)
# Add transparency to colors
for patch in ax.artists:
    r, g, b, a = patch.get_facecolor()
    patch.set_facecolor((r, g, b, .4))
ax.set_xlabel("Hours",fontsize=14)
ax.set_ylabel("M1 Efficiency",fontsize=14)
ax.tick_params(labelsize=10)
plt.show()
I also tried the FacetGrid but to no avail (Stops at 'Evaluating...'):
g = sns.FacetGrid(group1, col="M1_eff", hue="labels",hue_kws=dict(marker=["^", "v"]))
g = (g.map(plt.boxplot, "hour", "M1_eff")
.add_legend())
g.show()
Any help is appreciated!
I don't think you can do this using sns.boxplot() directly. I think you'll have to draw the means "by hand":
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
N=100
df = pd.DataFrame({'hour':np.random.randint(0,3,size=(N,)),
'M1_eff': np.random.random(size=(N,)),
'labels':np.random.choice([7,8],size=(N,))})
x_col = 'hour'
y_col = 'M1_eff'
hue_col = 'labels'
width = 0.8
hue_order=[7,8]
marker_colors = ['red','blue']
# get the offsets used by boxplot when hue-nesting is used
# https://github.com/mwaskom/seaborn/blob/c73055b2a9d9830c6fbbace07127c370389d04dd/seaborn/categorical.py#L367
n_levels = len(hue_order)
each_width = width / n_levels
offsets = np.linspace(0, width - each_width, n_levels)
offsets -= offsets.mean()
fig, ax = plt.subplots()
ax = sns.boxplot(data=df, x=x_col, y=y_col, hue=hue_col, hue_order=hue_order, showfliers=False, showmeans=False)
means = df.groupby([hue_col,x_col])[y_col].mean()
for (gr,temp),o,c in zip(means.groupby(level=0),offsets,marker_colors):
    ax.plot(np.arange(temp.values.size)+o, temp.values, 's', c=c)
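To make the offset arithmetic concrete, here is a small sketch that isolates it in a helper (the name `hue_offsets` is hypothetical). It assumes the same formula as the snippet above; other seaborn versions may position nested boxes differently.

```python
import numpy as np


def hue_offsets(n_levels, width=0.8):
    """Horizontal offsets of nested boxes relative to the category position."""
    each_width = width / n_levels
    offsets = np.linspace(0, width - each_width, n_levels)
    return offsets - offsets.mean()  # center the group on the tick


two_levels = hue_offsets(2)
three_levels = hue_offsets(3)
print(two_levels, three_levels)
```

With two hue levels and the default width of 0.8, the boxes sit 0.2 to the left and 0.2 to the right of each tick, which is where the mean markers are drawn.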

How to plot fill_betweenx to fill the area between y1 and y2 with different scales using matplotlib.pyplot?

I am trying to fill the area between two vertical curves (RHOB and NPHI) using matplotlib.pyplot. RHOB and NPHI have different x-axis scales.
But when I try to plot, I notice that fill_betweenx fills the area between RHOB and NPHI as if they were on the same scale.
#well_data is the data frame i am reading to get my data
#creating my subplot
fig, ax=plt.subplots(1,2,figsize=(8,6),sharey=True)
ax[0].get_xaxis().set_visible(False)
ax[0].invert_yaxis()
#subplot 1:
#ax01 to house the NPHI curve (NPHI curve are having values between 0-45)
ax01=ax[0].twiny()
ax01.set_xlim(-15,45)
ax01.invert_xaxis()
ax01.set_xlabel('NPHI',color='blue')
ax01.spines['top'].set_position(('outward',0))
ax01.tick_params(axis='x',colors='blue')
ax01.plot(well_data.NPHI,well_data.index,color='blue')
#ax02 to house the RHOB curve (RHOB curve having values between 1.95,2.95)
ax02=ax[0].twiny()
ax02.set_xlim(1.95,2.95)
ax02.set_xlabel('RHOB',color='red')
ax02.spines['top'].set_position(('outward',40))
ax02.tick_params(axis='x',colors='red')
ax02.plot(well_data.RHOB,well_data.index,color='red')
# ax03=ax[0].twiny()
# ax03.set_xlim(0,50)
# ax03.spines['top'].set_position(('outward',80))
# ax03.fill_betweenx(well_data.index,well_data.RHOB,well_data.NPHI,alpha=0.5)
plt.show()
ax03=ax[0].twiny()
ax03.set_xlim(0,50)
ax03.spines['top'].set_position(('outward',80))
ax03.fill_betweenx(well_data.index,well_data.RHOB,well_data.NPHI,alpha=0.5)
Above is the code that I tried, but the end result is not what I expected.
It fills the area between RHOB and NPHI assuming RHOB and NPHI are on the same scale.
How can I fill the area between the blue and the red curve?
Since the data are on two different axes, but each artist needs to be on one axes alone, this is hard. What would need to be done here is to calculate all data in a single unit system. You might opt to transform both datasets to display-space first (meaning pixels), then plot those transformed data via fill_betweenx without transforming again (transform=None).
import numpy as np
import matplotlib.pyplot as plt
y = np.linspace(0, 22, 101)
x1 = np.sin(y)/2
x2 = np.cos(y/2)+20
fig, ax1 = plt.subplots()
ax2 = ax1.twiny()
ax1.tick_params(axis="x", colors="C0", labelcolor="C0")
ax2.tick_params(axis="x", colors="C1", labelcolor="C1")
ax1.set_xlim(-1,3)
ax2.set_xlim(15,22)
ax1.plot(x1,y, color="C0")
ax2.plot(x2,y, color="C1")
x1p, yp = ax1.transData.transform(np.c_[x1,y]).T
x2p, _ = ax2.transData.transform(np.c_[x2,y]).T
ax1.autoscale(False)
ax1.fill_betweenx(yp, x1p, x2p, color="C9", alpha=0.4, transform=None)
plt.show()
We might equally opt to transform the data from the second axes to the first. This has the advantage that it's not defined in pixel space and hence circumvents a problem that occurs when the figure size is changed after the figure is created.
x2p, _ = (ax2.transData + ax1.transData.inverted()).transform(np.c_[x2,y]).T
ax1.autoscale(False)
ax1.fill_betweenx(y, x1, x2p, color="grey", alpha=0.4)
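For convenience, the second variant can be assembled into a self-contained sketch. The data are the same synthetic curves as above; the `fig.canvas.draw()` call is an added assumption, intended to make sure any pending autoscaling is resolved before the transforms are read, and the names `to_ax1` and `x2_in_ax1` are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import numpy as np
import matplotlib.pyplot as plt

y = np.linspace(0, 22, 101)
x1 = np.sin(y) / 2
x2 = np.cos(y / 2) + 20

fig, ax1 = plt.subplots()
ax2 = ax1.twiny()
ax1.set_xlim(-1, 3)
ax2.set_xlim(15, 22)
ax1.plot(x1, y, color="C0")
ax2.plot(x2, y, color="C1")

# assumption: draw once so the axes limits are final before using transData
fig.canvas.draw()

# map x2 from ax2 data coordinates to ax1 data coordinates via display space
to_ax1 = ax2.transData + ax1.transData.inverted()
x2_in_ax1, _ = to_ax1.transform(np.c_[x2, y]).T

ax1.autoscale(False)  # keep the limits fixed so the mapping stays valid
ax1.fill_betweenx(y, x1, x2_in_ax1, color="grey", alpha=0.4)
```

Because both x axes are linear, the mapping is equivalent to rescaling the interval 15..22 onto -1..3; if either x limit is changed afterwards, the fill has to be recomputed.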

Using "hue" for a Seaborn visual: how to get legend in one graph?

I created a scatter plot in seaborn using seaborn.relplot, but am having trouble putting the legend all in one graph.
When I do this simple way, everything works fine:
import pandas as pd
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
import seaborn as sns
df2 = df[df.ln_amt_000s < 700]
sns.relplot(x='ln_amt_000s', y='hud_med_fm_inc', hue='outcome', size='outcome', legend='brief', ax=ax, data=df2)
The result is a scatter plot as desired, with the legend on the right hand side.
However, when I try to generate a matplotlib figure and axes objects ahead of time to specify the figure dimensions I run into problems:
a4_dims = (10, 10) # generating a matplotlib figure and axes objects ahead of time to specify figure dimensions
df2 = df[df.ln_amt_000s < 700]
fig, ax = plt.subplots(figsize = a4_dims)
sns.relplot(x='ln_amt_000s', y='hud_med_fm_inc', hue='outcome', size='outcome', legend='brief', ax=ax, data=df2)
The result is two graphs -- one that has the scatter plots as expected but missing the legend, and another one below it that is all blank except for the legend on the right hand side.
How do I fix this? My desired result is one graph where I can specify the figure dimensions and have the legend at the bottom in two rows, below the x-axis (if that is too difficult, or not supported, then the default legend position to the right on the same graph would work too). I know the problem lies with "ax=ax", and in the way I am specifying the dimensions as a matplotlib figure, but I'd like to know specifically why this causes a problem so I can learn from this.
Thank you for your time.
The issue is that sns.relplot is a "Figure-level interface for drawing relational plots onto a FacetGrid" (see the API page). With a simple sns.scatterplot (the default type of plot used by sns.relplot), your code works (changed to use reproducible data):
df = pd.read_csv("https://vincentarelbundock.github.io/Rdatasets/csv/datasets/iris.csv", index_col=0)
fig, ax = plt.subplots(figsize = (5,5))
sns.scatterplot(x = 'Sepal.Length', y = 'Sepal.Width',
hue = 'Species', legend = 'brief',
ax=ax, data = df)
plt.show()
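If staying with sns.relplot is preferred, the figure size can instead be controlled through its own `height` and `aspect` parameters, since the function builds its own figure. The sketch below uses a synthetic DataFrame with the question's column names; the outcome labels and random values are assumptions, and the `g.figure` attribute assumes a reasonably recent seaborn release.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(0)
# synthetic stand-in for df2; the values and outcome labels are made up
df2 = pd.DataFrame({
    "ln_amt_000s": rng.uniform(0, 700, 60),
    "hud_med_fm_inc": rng.uniform(40_000, 90_000, 60),
    "outcome": rng.choice(["approved", "denied"], 60),
})

# relplot creates its own figure: height is in inches per facet,
# and the facet width is height * aspect
g = sns.relplot(data=df2, x="ln_amt_000s", y="hud_med_fm_inc",
                hue="outcome", height=6, aspect=1)
fig = g.figure
print(fig.get_size_inches())
```

The figure may come out somewhat wider than `height * aspect` because the legend is placed outside the plot area.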
Further edits to legend
Seaborn's legends are a bit finicky. Some tweaks you may want to employ:
Remove the default seaborn title, which is actually a legend entry, by getting and slicing the handles and labels
Set a new title that is actually a title
Move the location and make use of bbox_to_anchor to move outside the plot area (note that the bbox parameters need some tweaking depending on your plot size)
Specify the number of columns
fig, ax = plt.subplots(figsize = (5,5))
sns.scatterplot(x = 'Sepal.Length', y = 'Sepal.Width',
hue = 'Species', legend = 'brief',
ax=ax, data = df)
handles, labels = ax.get_legend_handles_labels()
ax.legend(handles=handles[1:], labels=labels[1:], loc=8,
ncol=2, bbox_to_anchor=[0.5,-.3,0,0])
plt.show()

How to combine gridspec with plt.subplots() to eliminate space between rows of subplots

I am trying to plot multiple images in subplots and either eliminate the space between subplots (horizontal and vertical) or control it. I tried to use the suggestion in How to Use GridSpec.... I also tried here but they are not using subplots(): space between subplots
I am able to eliminate the horizontal space but not the vertical space with what I am doing in the code below. Please do not mark as duplicate as I have tried the other posts and they do not do what I want. My code is shown below. Maybe there is another keyword argument that I need in the gridspec_kw dictionary?
I want to use plt.subplots() not plt.subplot() for this. In case it matters, the images are not square they are rectangular. I also tried adding f.tight_layout(h_pad=0,w_pad=0) before plt.show() but it did not change anything.
def plot_image_array_with_angles(img_array,correct_angles,predict_angles,
                                 fontsize=10,figsize=(8,8)):
    '''
    Imports:
    import matplotlib.gridspec as gridspec
    import numpy as np
    import matplotlib.pyplot as plt
    '''
    num_images = len(img_array)
    grid = int(np.sqrt(num_images)) # will only show all images if square
    #f, axarr = plt.subplots(grid,grid,figsize=figsize)
    f, axarr = plt.subplots(grid,grid,figsize=figsize,
                            gridspec_kw={'wspace':0,'hspace':0})
    im = 0
    for row in range(grid):
        for col in range(grid):
            axarr[row,col].imshow(img_array[im])
            title = 'cor = ' + str(correct_angles[im]) + ' pred = ' + str(predict_angles[im])
            axarr[row,col].set_title(title,fontsize=fontsize)
            axarr[row,col].axis('off') # turns off all ticks
            #axarr[row,col].set_aspect('equal')
            im += 1
    plt.show()
    return
The aspect ratio of an imshow plot is automatically set such that pixels in the image are square. This setting is stronger than any of the subplots_adjust or gridspec settings for spacing. In other words, you cannot directly control the spacing between subplots if those subplots have their aspect set to "equal".
The first obvious solution is to set the image aspect to automatic with ax.set_aspect("auto"). This solves the problem of subplot spacing, but distorts the images.
The other option is to adjust the figure margins and the figure size such that the spacing between the subplots is as desired.
Let's say figh and figw are the figure height and width in inches, and s is the subplot width in inches. The margins are bottom, top, left and right (relative to figure size), and the spacings are hspace in the vertical and wspace in the horizontal direction (relative to subplot size). The number of rows is denoted n and the number of columns m. The aspect is the ratio between subplot (image) height and width (aspect = image height / image width).
Then the dimensions can be set via
fig, axes = plt.subplots(nrows=n, ncols=m, figsize=(figwidth, figheight))
plt.subplots_adjust(top=top, bottom=bottom, left=left, right=right,
wspace=wspace, hspace=hspace)
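The respective values can be calculated according to:
$$\mathrm{figh} = \frac{s \cdot \mathrm{aspect} \cdot \bigl(n + (n-1)\,\mathrm{hspace}\bigr)}{\mathrm{top} - \mathrm{bottom}}, \qquad \mathrm{figw} = \frac{s \cdot \bigl(m + (m-1)\,\mathrm{wspace}\bigr)}{\mathrm{right} - \mathrm{left}}$$
Or, if the margins are the same (top - bottom = right - left), s and the margins cancel in the ratio:
$$\frac{\mathrm{figw}}{\mathrm{figh}} = \frac{m + (m-1)\,\mathrm{wspace}}{\bigl(n + (n-1)\,\mathrm{hspace}\bigr) \cdot \mathrm{aspect}}$$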
An example:
import matplotlib.pyplot as plt
image = plt.imread("https://i.stack.imgur.com/9qe6z.png")
aspect = image.shape[0]/float(image.shape[1])
print(aspect)
n = 2 # number of rows
m = 4 # number of columns
bottom = 0.1; left=0.05
top=1.-bottom; right = 1.-left
fisasp = (1-bottom-(1-top))/float( 1-left-(1-right) )
#widthspace, relative to subplot size
wspace=0.15 # set to zero for no spacing
hspace=wspace/float(aspect)
#fix the figure height
figheight= 3 # inch
figwidth = (m + (m-1)*wspace)/float((n+(n-1)*hspace)*aspect)*figheight*fisasp
fig, axes = plt.subplots(nrows=n, ncols=m, figsize=(figwidth, figheight))
plt.subplots_adjust(top=top, bottom=bottom, left=left, right=right,
wspace=wspace, hspace=hspace)
for ax in axes.flatten():
    ax.imshow(image)
    ax.set_title("title",fontsize=10)
    ax.axis('off')
plt.show()
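The same calculation can be wrapped in a small helper and checked against the layout matplotlib actually produces. This is a sketch under assumptions: the function name `figure_width_for_grid`, the default margins and the synthetic 30 x 50 pixel image are all illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import numpy as np
import matplotlib.pyplot as plt


def figure_width_for_grid(n, m, aspect, figheight,
                          left=0.05, right=0.95, bottom=0.1, top=0.9, wspace=0.15):
    """Return (figwidth, hspace) so that each grid cell has height / width == aspect."""
    hspace = wspace / aspect  # same gap in inches vertically and horizontally
    margin_ratio = (top - bottom) / (right - left)
    figwidth = ((m + (m - 1) * wspace)
                / ((n + (n - 1) * hspace) * aspect) * figheight * margin_ratio)
    return figwidth, hspace


image = np.random.default_rng(0).random((30, 50))  # synthetic image, 30 rows x 50 columns
aspect = image.shape[0] / image.shape[1]
n, m, figheight = 2, 4, 3
margins = dict(left=0.05, right=0.95, bottom=0.1, top=0.9)
wspace = 0.15
figwidth, hspace = figure_width_for_grid(n, m, aspect, figheight, wspace=wspace, **margins)

fig, axes = plt.subplots(nrows=n, ncols=m, figsize=(figwidth, figheight))
fig.subplots_adjust(wspace=wspace, hspace=hspace, **margins)
for ax in axes.flat:
    ax.imshow(image)
    ax.axis("off")
fig.canvas.draw()

# height / width of every axes box in inches, as laid out by matplotlib
cell_ratios = [ax.get_position().height * figheight / (ax.get_position().width * figwidth)
               for ax in axes.flat]
print(aspect, cell_ratios[0])
```

Since every cell already has the image's aspect ratio, imshow should have no reason to shrink the axes, and the requested wspace and hspace are the spacings that appear in the figure.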
