Drawing very small shapes (size in µm) with Python

I want to create a black-and-white pattern of "L" shapes on a 20x20 mm figure. The width and length of each arm of the L are defined as uw, ul, lw and ll (see code). As I understand it, matplotlib works with 72 points per inch, so a line with linewidth=1 is 1/72 inch wide. I cannot work out how to make these figures large enough to be visible with plt.show() while also saving them at the size I want, i.e. a 20x20 mm page with every L at its exact dimensions and a DPI high enough that the shapes are visible when I open the saved file. My code is:
import matplotlib.pyplot as plt
import numpy as np
uw = 20e-6 #upper width in meters
ul = 100e-6 #upper length in meters
lw = 20e-6 #lower width in meters
ll = 100e-6 #lower length in meters
w_space = 50e-6 #width spacing for subplots
h_space = 50e-6 #height spacing for subplots
N = 40
coord = [[0,0], [ll,0], [ll,lw], [uw,lw], [uw,ul], [0,ul]]
coord.append(coord[0]) #repeat the first point to create a 'closed loop'
xs, ys = zip(*coord) #create lists of x and y values
fig = plt.figure(num=None, figsize=(0.1, 0.1), dpi=100, facecolor='w', edgecolor='k') #figsize cannot be chosen below 0.1
for i in range(N):
    ax = fig.add_subplot(5,10,i+1)
    ax.fill(xs,ys,'k',linewidth=1)
    plt.axis('off')
plt.subplots_adjust(wspace = w_space, hspace = h_space)
plt.savefig('screenshots/L_shape.png' ,bbox_inches = 'tight', pad_inches = 0, dpi=10000)
plt.show()
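The unit arithmetic behind the question can be sketched as follows. This is only one possible approach, not a confirmed answer: it assumes a single Axes covering the whole page with data coordinates in µm, and the names `um_per_pixel` and `pitch` as well as the 4x10 grid are illustrative choices that do not appear in the question.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.patches import Polygon

MM_PER_INCH = 25.4
page_mm = 20.0        # page edge length, from the question
um_per_pixel = 10.0   # assumed raster resolution: one pixel per 10 µm

fig_inches = page_mm / MM_PER_INCH          # figsize is given in inches
dpi = MM_PER_INCH * 1000.0 / um_per_pixel   # pixels per inch for that resolution

uw, ul, lw, ll = 20, 100, 20, 100           # L dimensions, here in µm
outline = [(0, 0), (ll, 0), (ll, lw), (uw, lw), (uw, ul), (0, ul)]
pitch = 150           # assumed spacing in µm (100 µm shape + 50 µm gap)

fig = plt.figure(figsize=(fig_inches, fig_inches))
ax = fig.add_axes([0, 0, 1, 1])             # a single Axes covering the whole page
ax.set_xlim(0, page_mm * 1000)              # data units are µm
ax.set_ylim(0, page_mm * 1000)
ax.axis("off")

# 4 rows x 10 columns = 40 shapes, each a filled polygon without an outline
for row in range(4):
    for col in range(10):
        shifted = [(x + col * pitch, y + row * pitch) for x, y in outline]
        ax.add_patch(Polygon(shifted, closed=True, facecolor="k", edgecolor="none"))

# fig.savefig("L_shape.png", dpi=dpi)       # roughly 2000 x 2000 pixels here
```

Because the polygons are filled with no edge, their size is set by the data coordinates rather than by a linewidth in points. Saving without bbox_inches='tight' keeps the page at its full 20x20 mm; a finer `um_per_pixel` raises the pixel count quadratically, so memory use grows quickly.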

Related

Divide a circle into n number of equal pixels in Python

Question background: In Python, I am working on a task in which I have to project the nodes of a geometry (X and Y coordinates). Using the code below, I have plotted a graph that shows the geometry with a circle around it, as shown in the picture below.
from matplotlib import pyplot as plt, patches
plt.rcParams["figure.figsize"] = [9.00, 6.50]
plt.rcParams["figure.autolayout"] = True
fig = plt.figure()
ax = fig.add_subplot()
# other plt.scatter or plt.plot here
plt.scatter(x_new, y_new) # x_new and y_new is a list of coordinates
circle1 = plt.Circle((0, 0), radius=4, fill = False) # (0, 0) is a centre of circle with radius 4
ax.add_patch(circle1)
ax.axis('equal')
plt.show()
My question: I have to divide the circle into 36 pixels. I do not have a clue at the moment what code I should write to do this. I want my result to look like the picture below. Kindly help me with this.

How to draw vertical average lines for overlapping histograms in a loop

I'm trying to use matplotlib to draw a vertical line at the average of each of two overlapping histograms, using a loop. I have managed to draw the first one, but I don't know how to draw the second one. I'm using two variables from a dataset to draw the histograms. One variable (feat) is categorical (0 - 1), and the other one (objective) is numerical. The code is the following:
for chas in df[feat].unique():
    plt.hist(df.loc[df[feat] == chas, objective], bins = 15, alpha = 0.5, density = True, label = chas)
    plt.axvline(df[objective].mean(), linestyle = 'dashed', linewidth = 2)
plt.title(objective)
plt.legend(loc = 'upper right')
I also have to add to the legend the mean and standard deviation values for each histogram.
How can I do it? Thank you in advance.
I recommend using an Axes object to plot your figure. Please see the code below and the matplotlib artist tutorial.
import numpy as np
import matplotlib.pyplot as plt
# Fixing random state for reproducibility
np.random.seed(19680801)
mu1, sigma1 = 100, 8
mu2, sigma2 = 150, 15
x1 = mu1 + sigma1 * np.random.randn(10000)
x2 = mu2 + sigma2 * np.random.randn(10000)
fig, ax = plt.subplots(1, 1, figsize=(7.2, 7.2))
# the histogram of the data
lbs = ['a', 'b']
colors = ['r', 'g']
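for i, x in enumerate([x1, x2]):
    n, bins, patches = ax.hist(x, 50, density=True, facecolor=colors[i], alpha=0.75, label=lbs[i])
    # mark the mean of the data itself (bins.mean() is only the midpoint of the bin edges)
    ax.axvline(x.mean(), color=colors[i], linestyle='dashed')
ax.legend()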
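The question also asks for the mean and standard deviation in the legend. A minimal sketch of one way to do that, assuming the statistics are simply formatted into each histogram's label; the sample data and the `samples` dictionary are made up for illustration:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt

rng = np.random.default_rng(19680801)
# made-up stand-ins for the two groups of the categorical variable
samples = {"a": rng.normal(100, 8, 10000), "b": rng.normal(150, 15, 10000)}

fig, ax = plt.subplots()
for (name, x), color in zip(samples.items(), ["r", "g"]):
    mean, std = x.mean(), x.std()
    # put the statistics into the label so that they show up in the legend
    ax.hist(x, bins=50, density=True, color=color, alpha=0.5,
            label=f"{name}: mean={mean:.1f}, std={std:.1f}")
    # one dashed line per histogram, at that histogram's own mean
    ax.axvline(mean, color=color, linestyle="dashed", linewidth=2)
ax.legend(loc="upper right")

labels = [t.get_text() for t in ax.get_legend().get_texts()]
```

The vertical lines carry no label of their own, so the legend only lists the two histograms.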

How to use `extent` in matplotlib ax.imshow() without changing the positions of the overlayed ax.text() handles?

I am trying to annotate a heatmap. The matplotlib docs present an example, which suggests creating a helper function to format the annotations. I feel there must be a simpler way to do what I want. I can annotate inside the boxes of the heatmap, but these texts change position when editing the extent of the heatmap. My question is how to use extent in ax.imshow(...) while also using ax.text(...) to annotate the correct positions. Below is an example:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize
def get_manhattan_distance_matrix(coordinates):
    shape = (coordinates.shape[0], 1, coordinates.shape[1])
    ct = coordinates.reshape(shape)
    displacement = coordinates - ct
    return np.sum(np.abs(displacement), axis=-1)
x = np.arange(11)[::-1]
y = x.copy()
coordinates = np.array([x, y]).T
distance_matrix = get_manhattan_distance_matrix(coordinates)
# print("\n .. {} COORDINATES:\n{}\n".format(coordinates.shape, coordinates))
# print("\n .. {} DISTANCE MATRIX:\n{}\n".format(distance_matrix.shape, distance_matrix))
norm = Normalize(vmin=np.min(distance_matrix), vmax=np.max(distance_matrix))
This is where to modify the value of extent.
extent = (np.min(x), np.max(x), np.min(y), np.max(y))
# extent = None
According to the matplotlib docs, the default extent is None.
fig, ax = plt.subplots()
handle = ax.imshow(distance_matrix, cmap='plasma', norm=norm, interpolation='nearest', origin='upper', extent=extent)
kws = dict(ha='center', va='center', color='gray', weight='semibold', fontsize=5)
for i in range(len(distance_matrix)):
    for j in range(len(distance_matrix[i])):
        if i == j:
            ax.text(j, i, '', **kws)
        else:
            ax.text(j, i, distance_matrix[i, j], **kws)
plt.show()
plt.close(fig)
One can generate two figures by modifying extent - simply uncomment the commented line and comment the uncommented line. The two figures are below:
One can see that by setting extent, the pixel locations change, which in turn changes the positions of the ax.text(...) handles. Is there a simple solution to fix this - that is, set an arbitrary extent and still have the text handles centered in each box?
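When extent=None, the effective extent is from -0.5 to 10.5 in both x and y, so the pixel centers lie on the integer positions. With an extent from 0 to 10, each of the 11 pixels is only 10/11 wide, so the integer positions no longer coincide with the pixel centers. You'd have to scale the text positions by 10/11 (after a half-pixel shift) to get them right.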
The best approach would be to set extent = (np.min(x)-0.5, np.max(x)+0.5, np.min(y)-0.5, np.max(y)+0.5) to get the centers back at integer positions.
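The effect of the two extents on the pixel centers can be checked numerically. The helper `pixel_centers` below is a hypothetical name introduced only for this sketch; it is not part of matplotlib.

```python
import numpy as np

def pixel_centers(lo, hi, n):
    """Centers of n equal-width pixels spanning the interval [lo, hi]."""
    edges = np.linspace(lo, hi, n + 1)
    return (edges[:-1] + edges[1:]) / 2

# extent padded by half a pixel: the centers fall on the integers 0..10
padded = pixel_centers(-0.5, 10.5, 11)

# extent from 0 to 10: the centers are spaced 10/11 apart, starting at 5/11
unpadded = pixel_centers(0, 10, 11)
```

With the padded extent, text placed at integer coordinates lands in the middle of each box; with the unpadded one it drifts away from the centers as the index grows.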
Also note that by default an image is displayed starting from the top, with the y-axis reversed. If you change the extent, you need ax.imshow(..., origin='lower') to get the image upright. (The 0,0 pixel should be the blue one in the example plot.)
To put a text in the center of a pixel, add 0.5 to the horizontal index, divide by the width in pixels, multiply by the x-axis range (x1 - x0) and add x0. The calculation for the y-axis is analogous. For better readability, the text color can be made dependent on the pixel color.
# ...
extent = (np.min(x), np.max(x), np.min(y), np.max(y))
x0, x1, y0, y1 = extent
fig, ax = plt.subplots()
handle = ax.imshow(distance_matrix, cmap='plasma', norm=norm, interpolation='nearest', origin='lower', extent=extent)
kws = dict(ha='center', va='center', weight='semibold', fontsize=5)
height = len(distance_matrix)
width = len(distance_matrix[0])
for i in range(height):
    for j in range(width):
        if i != j:
            val = distance_matrix[i, j]
            ax.text(x0 + (j + 0.5) / width * (x1 - x0), y0 + (i + 0.5) / height * (y1 - y0),
                    f'{val}\n{i},{j}', color='white' if norm(val) < 0.6 else 'black', **kws)
plt.show()

Proper reuse of Axes in GeoDataFrame.plot()

I want to draw a simple choropleth map of NYC with the binned number of yellow cab rides. My GeoDataFrame looks like this:
   bin  cnt  shape
0   15    1  POLYGON ((-74.25559 40.62194, -74.24448 40.621...
1   16    1  POLYGON ((-74.25559 40.63033, -74.24448 40.630...
2   25    1  POLYGON ((-74.25559 40.70582, -74.24448 40.705...
3   27    1  POLYGON ((-74.25559 40.72260, -74.24448 40.722...
4   32   12  POLYGON ((-74.25559 40.76454, -74.24448 40.764...
where bin is the region number, cnt is the target variable of my plot, and the shape column is a series of shapely rectangles that together cover the whole of New York.
Drawing NYC from shapefile:
usa = gpd.read_file('shapefiles/gadm36_USA_2.shp')[['NAME_1', 'NAME_2', 'geometry']]
nyc = usa[usa.NAME_1 == 'New York']
ax = plt.axes([0, 0, 2, 2], projection=ccrs.PlateCarree())
ax.set_extent([-74.25559, -73.70001, 40.49612, 40.91553], ccrs.Geodetic())
ax.add_geometries(nyc.geometry.values,
                  ccrs.PlateCarree(),
                  facecolor='#1A237E');
Drawing choropleth alone works fine:
gdf.plot(column='cnt',
         cmap='inferno',
         scheme='natural_breaks', k=10,
         legend=True)
But if I put ax parameter:
gdf.plot(ax=ax, ...)
the output is
<Figure size 432x288 with 0 Axes>
EDIT:
Got it working with following code:
from matplotlib.colors import ListedColormap
cmap = plt.get_cmap('summer')
my_cmap = cmap(np.arange(cmap.N))
my_cmap[:,-1] = np.full((cmap.N, ), 0.75)
my_cmap = ListedColormap(my_cmap)
gax = gdf.plot(column='cnt',
               cmap=my_cmap,
               scheme='natural_breaks', k=10,
               figsize=(16,10),
               legend=True,
               legend_kwds=dict(loc='best'))
gax.set_title('# of yellow cab rides in NYC', fontdict={'fontsize': 20}, loc='center');
nyc.plot(ax=gax,
         color='#141414',
         zorder=0)
gax.set_xlim(-74.25559, -73.70001)
gax.set_ylim(40.49612, 40.91553)
When this is done only with .plot calls from geopandas, it seems to work fine. I had to make up some data as I don't have yours. Let me know if this helps. The code example should work as is in IPython.
%matplotlib inline
import geopandas as gpd
import numpy as np
from shapely.geometry import Polygon
from random import random
crs = 'EPSG:4326'  # authority string; the {'init': 'epsg:4326'} dict form is deprecated
num_squares = 10
# load natural earth shapes
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
# create random choropleth
minx, miny, maxx, maxy = world.geometry.total_bounds
x_coords = np.linspace(minx, maxx, num_squares+1)
y_coords = np.linspace(miny, maxy, num_squares+1)
polygons = [Polygon([[x_coords[i], y_coords[j]],
                     [x_coords[i+1], y_coords[j]],
                     [x_coords[i+1], y_coords[j+1]],
                     [x_coords[i], y_coords[j+1]]])
            for i in range(num_squares) for j in range(num_squares)]
vals = [random() for i in range(num_squares) for j in range(num_squares)]
choro_gdf = gpd.GeoDataFrame({'cnt' : vals, 'geometry' : polygons})
choro_gdf.crs = crs
# now plot both together
ax = choro_gdf.plot(column='cnt',
                    cmap='inferno',
                    scheme='natural_breaks', k=10,
                    #legend=True
                    )
world.plot(ax=ax)
This should give you something like the following
Edit: if you're worried about setting the correct limits (as you're doing with the boroughs), just append the following to the end of the code (for example):
ax.set_xlim(0, 50)
ax.set_ylim(0, 25)
This should then give you:

Plotting multiple density curves on the same plot: weighting the subset categories in Python 3

I am trying to recreate this density plot in python 3: math.stackexchange.com/questions/845424/the-expected-outcome-of-a-random-game-of-chess
End Goal: I need my density plot to look like this
The area under the blue curve is equal to that of the red, green, and purple curves combined because the different outcomes (Draw, Black wins, and White wins) are the subset of the total (All).
How do I have python realize and plot this accordingly?
Here is the .csv file of results_df after 1000 simulations pastebin.com/YDVMx2DL
from matplotlib import pyplot as plt
import seaborn as sns
black = results_df.loc[results_df['outcome'] == 'Black']
white = results_df.loc[results_df['outcome'] == 'White']
draw = results_df.loc[results_df['outcome'] == 'Draw']
win = results_df.loc[results_df['outcome'] != 'Draw']
Total = len(results_df.index)
Wins = len(win.index)
PercentBlack = "Black Wins ≈ %s" %('{0:.2%}'.format(len(black.index)/Total))
PercentWhite = "White Wins ≈ %s" %('{0:.2%}'.format(len(white.index)/Total))
PercentDraw = "Draw ≈ %s" %('{0:.2%}'.format(len(draw.index)/Total))
AllTitle = 'Distribution of Moves by All Outcomes (nSample = %s)' %(workers)
sns.distplot(results_df.moves, hist=False, label = "All")
sns.distplot(black.moves, hist=False, label=PercentBlack)
sns.distplot(white.moves, hist=False, label=PercentWhite)
sns.distplot(draw.moves, hist=False, label=PercentDraw)
plt.title(AllTitle)
plt.ylabel('Density')
plt.xlabel('Number of Moves')
plt.legend()
plt.show()
The code above produces density curves without weights. I need to figure out how to weight the density curves accordingly while preserving my labels in the legend.
[Figure: density curves, no weights]
I also tried frequency histograms, which scale the distribution heights correctly, but I would rather keep the 4 curves overlaid on top of each other for a "cleaner" look. I don't like this frequency plot, but it is my current fix at the moment.
results_df.moves.hist(alpha=0.4, bins=range(0, 700, 10), label = "All")
draw.moves.hist(alpha=0.4, bins=range(0, 700, 10), label = PercentDraw)
white.moves.hist(alpha=0.4, bins=range(0, 700, 10), label = PercentWhite)
black.moves.hist(alpha=0.4, bins=range(0, 700, 10), label = PercentBlack)
plt.title(AllTitle)
plt.ylabel('Frequency')
plt.xlabel('Number of Moves')
plt.legend()
plt.show()
If anyone can write the Python 3 code that outputs the first plot with 4 density curves with the correct subset weights, while preserving the custom legend that shows percentages, that would be much appreciated.
Once the density curves are plotted with the correct subset weights, I am also interested in Python 3 code for finding the coordinates of the maximum of each density curve, i.e. the most frequent number of moves, once I scale up to 500,000 iterations.
Thanks
You need to be careful. The plot that you have produced is correct. All the curves shown are probability density functions of the underlying distributions.
In the plot that you want to have, only the curve labeled "All" is a probability density function. The other curves are not.
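One way to see this, as a sketch that assumes the N games split into disjoint outcome subsets of sizes n_k, each with its own density f_k (these symbols are introduced here and are not part of the original post):

```latex
f_{\mathrm{All}}(x) = \sum_{k} \frac{n_k}{N}\, f_k(x),
\qquad
\int \frac{n_k}{N}\, f_k(x)\, dx = \frac{n_k}{N} \le 1 .
```

Each weighted curve therefore integrates to its share of the games rather than to 1, which is why it is not a probability density on its own.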
In any case, you will need to calculate the kernel density estimate yourself if you want to scale it as shown in the desired plot. This can be done using scipy.stats.gaussian_kde().
In order to reproduce the desired plot, I see two options.
Calculate the kde for all involved cases and scale them with the number of samples.
import numpy as np; np.random.seed(0)
import matplotlib.pyplot as plt
import scipy.stats
a = np.random.gumbel(80, 25, 1000).astype(int)
b = np.random.gumbel(200, 46, 4000).astype(int)
kdea = scipy.stats.gaussian_kde(a)
kdeb = scipy.stats.gaussian_kde(b)
both = np.hstack((a,b))
kdeboth = scipy.stats.gaussian_kde(both)
grid = np.arange(500)
#weighted kde curves
wa = kdea(grid)*(len(a)/float(len(both)))
wb = kdeb(grid)*(len(b)/float(len(both)))
print("a.sum ", wa.sum())
print("b.sum ", wb.sum())
print("total.sum ", kdeboth(grid).sum())
fig, ax = plt.subplots()
ax.plot(grid, wa, lw=1, label = "weighted a")
ax.plot(grid, wb, lw=1, label = "weighted b")
ax.plot(grid, kdeboth(grid), color="crimson", lw=2, label = "pdf")
plt.legend()
plt.show()
Calculate the kde for all individual cases, normalize their sum to obtain the total.
import numpy as np; np.random.seed(0)
import matplotlib.pyplot as plt
import scipy.stats
a = np.random.gumbel(80, 25, 1000).astype(int)
b = np.random.gumbel(200, 46, 4000).astype(int)
kdea = scipy.stats.gaussian_kde(a)
kdeb = scipy.stats.gaussian_kde(b)
grid = np.arange(500)
#weighted kde curves
wa = kdea(grid)*(len(a)/float(len(a)+len(b)))
wb = kdeb(grid)*(len(b)/float(len(a)+len(b)))
total = wa+wb
fig, ax = plt.subplots(figsize=(5,3))
ax.plot(grid, wa, lw=1, label = "weighted a")
ax.plot(grid, wb, lw=1, label = "weighted b")
ax.plot(grid, total, color="crimson", lw=2, label = "pdf")
plt.legend()
plt.show()
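The question also asks for the coordinates of the maximum of each weighted curve. A minimal sketch, assuming the peak is simply read off the evaluation grid with np.argmax; the synthetic Gumbel samples mirror the examples above, and the `peaks` dictionary is an illustrative name:

```python
import numpy as np
import scipy.stats

rng = np.random.default_rng(0)
# synthetic stand-ins for two outcome subsets
a = rng.gumbel(80, 25, 1000)
b = rng.gumbel(200, 46, 4000)
grid = np.arange(500)
n_total = len(a) + len(b)

peaks = {}
for name, sample in {"a": a, "b": b}.items():
    # kde scaled by the subset's share of all samples
    weighted = scipy.stats.gaussian_kde(sample)(grid) * len(sample) / n_total
    k = np.argmax(weighted)               # index of the highest grid point
    peaks[name] = (grid[k], weighted[k])  # (x position, curve height) of the peak
```

The resolution of the result is limited by the grid spacing; a finer grid, or a numerical optimizer started at the grid maximum, would refine the peak position.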

Resources