Proper reuse of Axes in GeoDataFrame.plot() - python-3.x

I want to draw a simple choropleth map of NYC with the binned number of yellow cab rides. My GeoDataFrame looks like this:
bin cnt shape
0 15 1 POLYGON ((-74.25559 40.62194, -74.24448 40.621...
1 16 1 POLYGON ((-74.25559 40.63033, -74.24448 40.630...
2 25 1 POLYGON ((-74.25559 40.70582, -74.24448 40.705...
3 27 1 POLYGON ((-74.25559 40.72260, -74.24448 40.722...
4 32 12 POLYGON ((-74.25559 40.76454, -74.24448 40.764...
Here bin is the region number, cnt is the target variable of my plot, and the shape column holds shapely rectangles that together cover the whole of New York.
Drawing NYC from shapefile:
usa = gpd.read_file('shapefiles/gadm36_USA_2.shp')[['NAME_1', 'NAME_2', 'geometry']]
nyc = usa[usa.NAME_1 == 'New York']
ax = plt.axes([0, 0, 2, 2], projection=ccrs.PlateCarree())
ax.set_extent([-74.25559, -73.70001, 40.49612, 40.91553], ccrs.Geodetic())
ax.add_geometries(nyc.geometry.values,
ccrs.PlateCarree(),
facecolor='#1A237E');
Drawing choropleth alone works fine:
gdf.plot(column='cnt',
cmap='inferno',
scheme='natural_breaks', k=10,
legend=True)
But if I pass the ax parameter:
gdf.plot(ax=ax, ...)
the output is
<Figure size 432x288 with 0 Axes>
EDIT:
Got it working with the following code:
from matplotlib.colors import ListedColormap
cmap = plt.get_cmap('summer')
my_cmap = cmap(np.arange(cmap.N))
my_cmap[:,-1] = np.full((cmap.N, ), 0.75)
my_cmap = ListedColormap(my_cmap)
gax = gdf.plot(column='cnt',
cmap=my_cmap,
scheme='natural_breaks', k=10,
figsize=(16,10),
legend=True,
legend_kwds=dict(loc='best'))
gax.set_title('# of yellow cab rides in NYC', fontdict={'fontsize': 20}, loc='center');
nyc.plot(ax=gax,
color='#141414',
zorder=0)
gax.set_xlim(-74.25559, -73.70001)
gax.set_ylim(40.49612, 40.91553)

When this is done with geopandas .plot calls only, it seems to work fine. I had to make up some data because I don't have yours. Let me know if this helps. The code example should work as is in IPython.
%matplotlib inline
import geopandas as gpd
import numpy as np
from shapely.geometry import Polygon
from random import random
crs = {'init': 'epsg:4326'}
num_squares = 10
# load natural earth shapes
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
# create random choropleth
minx, miny, maxx, maxy = world.geometry.total_bounds
x_coords = np.linspace(minx, maxx, num_squares+1)
y_coords = np.linspace(miny, maxy, num_squares+1)
polygons = [Polygon([[x_coords[i], y_coords[j]],
[x_coords[i+1], y_coords[j]],
[x_coords[i+1], y_coords[j+1]],
[x_coords[i], y_coords[j+1]]]) for i in
range(num_squares) for j in range(num_squares)]
vals = [random() for i in range(num_squares) for j in range(num_squares)]
choro_gdf = gpd.GeoDataFrame({'cnt' : vals, 'geometry' : polygons})
choro_gdf.crs = crs
# now plot both together
ax = choro_gdf.plot(column='cnt',
cmap='inferno',
scheme='natural_breaks', k=10,
#legend=True
)
world.plot(ax=ax)
This should give you something like the following
Edit: if you are worried about setting the correct limits (as you are doing with the boroughs), append the following to the end of the code, for example:
ax.set_xlim(0, 50)
ax.set_ylim(0, 25)
This should then give you:

Related

Divide a circle into n number of equal pixels in Python

Question background: In Python, I am working on a task in which I have to project the nodes of a geometry (X and Y coordinates). Using the code below, I have plotted a graph that shows the geometry with a circle around it, as in the picture below.
from matplotlib import pyplot as plt, patches
plt.rcParams["figure.figsize"] = [9.00, 6.50]
plt.rcParams["figure.autolayout"] = True
fig = plt.figure()
ax = fig.add_subplot()
# other plt.scatter or plt.plot here
plt.scatter(x_new, y_new) # x_new and y_new is a list of coordinates
circle1 = plt.Circle((0, 0), radius=4, fill = False) # (0, 0) is a centre of circle with radius 4
ax.add_patch(circle1)
ax.axis('equal')
plt.show()
My question: I have to divide the circle into 36 pixels. I have no idea at the moment what code I should write to do this. I want my result to look like the picture below. Kindly help me with this.
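The question does not define "pixel". One possible reading, and it is only an assumption, is that the circle should be split into 36 equal angular sectors of 10 degrees each. Under that reading, a minimal sketch could compute where each sector boundary meets the circle and draw a spoke to it. The helper names sector_endpoints and draw_sectors are hypothetical and do not come from the original post.

```python
import math


def sector_endpoints(center, radius, n_sectors):
    """Return the (x, y) points where n_sectors equally spaced spokes meet the circle."""
    cx, cy = center
    step = 2 * math.pi / n_sectors  # angular width of one sector, in radians
    return [(cx + radius * math.cos(k * step),
             cy + radius * math.sin(k * step))
            for k in range(n_sectors)]


def draw_sectors(ax, center, radius, n_sectors):
    """Draw one spoke per sector boundary on an existing Matplotlib Axes."""
    cx, cy = center
    for x, y in sector_endpoints(center, radius, n_sectors):
        ax.plot([cx, x], [cy, y], color="gray", linewidth=0.8)


# Same circle as in the question: centre (0, 0), radius 4, 36 sectors.
endpoints = sector_endpoints((0, 0), 4, 36)
```

Calling draw_sectors(ax, (0, 0), 4, 36) after ax.add_patch(circle1) would add the 36 spokes to the plot from the question.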

How to draw vertical average lines for overlapping histograms in a loop

I'm trying to use matplotlib to draw a vertical mean line for each of two overlapping histograms in a loop. I have managed to draw the first line, but I don't know how to draw the second one. I'm using two variables from a dataset to draw the histograms. One variable (feat) is categorical (0 or 1), and the other one (objective) is numerical. The code is the following:
for chas in df[feat].unique():
    plt.hist(df.loc[df[feat] == chas, objective], bins = 15, alpha = 0.5, density = True, label = chas)
    plt.axvline(df[objective].mean(), linestyle = 'dashed', linewidth = 2)
plt.title(objective)
plt.legend(loc = 'upper right')
I also have to add to the legend the mean and standard deviation values for each histogram.
How can I do it? Thank you in advance.
I recommend using the Axes interface to plot your figure. Please see the code below and the Matplotlib artist tutorial.
import numpy as np
import matplotlib.pyplot as plt
# Fixing random state for reproducibility
np.random.seed(19680801)
mu1, sigma1 = 100, 8
mu2, sigma2 = 150, 15
x1 = mu1 + sigma1 * np.random.randn(10000)
x2 = mu2 + sigma2 * np.random.randn(10000)
fig, ax = plt.subplots(1, 1, figsize=(7.2, 7.2))
# the histogram of the data
lbs = ['a', 'b']
colors = ['r', 'g']
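for i, x in enumerate([x1, x2]):
    n, bins, patches = ax.hist(x, 50, density=True, facecolor=colors[i], alpha=0.75, label=lbs[i])
    # mark the sample mean of each histogram (bins.mean() would only give the midpoint of the bin edges)
    ax.axvline(x.mean(), color=colors[i], linestyle='dashed')
ax.legend()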
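The question also asks for the mean and standard deviation of each histogram in the legend. A minimal sketch of one way to do that is to build the label text from the data before calling hist. The helper names stats_label and plot_groups are hypothetical, the groups argument is assumed to map a label to a 1-D array, and the standard deviation uses NumPy's default (population, ddof=0).

```python
import numpy as np


def stats_label(name, values):
    """Build a legend label that carries the sample mean and standard deviation."""
    values = np.asarray(values, dtype=float)
    return f"{name}: mean={values.mean():.2f}, std={values.std():.2f}"


def plot_groups(ax, groups, bins=15):
    """Draw one histogram and one dashed mean line per group on the given Axes.

    `groups` maps a label to a 1-D array of values (assumed input shape).
    """
    for name, values in groups.items():
        _, _, patches = ax.hist(values, bins=bins, alpha=0.5, density=True,
                                label=stats_label(name, values))
        # Reuse the bar colour so each mean line matches its histogram.
        ax.axvline(np.mean(values), linestyle="dashed", linewidth=2,
                   color=patches[0].get_facecolor())
    ax.legend(loc="upper right")


label = stats_label("a", [1.0, 2.0, 3.0])
```

For the DataFrame in the question, groups could be built as {chas: df.loc[df[feat] == chas, objective] for chas in df[feat].unique()} and passed to plot_groups together with an Axes from plt.subplots().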

Visualize terrain ground elevation and water depth in the same plot

I would like to get some tips on how to properly visualize/plot two 2-dimensional arrays of the same shape,
say ground_arr and water_arr. ground_arr represents the elevation of some surface, and water_arr represents the height/depth of water on top of that surface. The total elevation is then ground_arr + water_arr.
For now I am using plt.imshow(water_arr, cmap=...) to see only the water and plt.imshow(water_arr + ground_arr) to see the total elevation, but I would like to merge both into the same plot to get a map-like result.
Any tips?
Suppose you have 2D arrays of height values for the terrain and for the water level, and that the water level is set to zero at the places without water.
Then just set the water level to NaN where you want the water image to be transparent.
import numpy as np
import matplotlib.pyplot as plt
# Generate test data, terrain is some sine on the distance to the center
terrain_x, terrain_y = np.meshgrid(np.linspace(-15, 15, 1000), np.linspace(-15, 15, 1000))
r = np.sqrt(terrain_x * terrain_x + terrain_y * terrain_y)
terrain_z = 5 + 5 * np.sin(r)
# test data for water has some height where r is between 3 and 4 pi, zero everywhere else
water_z = np.where(3 * np.pi < r, 3 - terrain_z, 0)
water_z = np.where(4 * np.pi > r, water_z, 0)
extent = [-15, 15, -15, 15]
fig, (ax1, ax2, ax3) = plt.subplots(ncols=3)
ax1.imshow(terrain_z, cmap="YlOrBr", extent=extent)
ax1.set_title('Terrain')
ax2.imshow(water_z, cmap="Blues", extent=extent)
ax2.set_title('Water')
ax3.imshow(terrain_z, cmap="YlOrBr", extent=extent)
water_z = np.where(water_z > 0, water_z, np.nan)
ax3.imshow(water_z, cmap="Blues", extent=extent)
ax3.set_title('Combined')
plt.show()
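As an alternative to overwriting the dry cells with NaN, a NumPy masked array hides them while keeping the original values; Matplotlib draws masked cells with the colormap's "bad" colour, which is transparent by default. The sketch below assumes that a depth of 0 or less means "no water"; the helper name mask_dry_cells is hypothetical.

```python
import numpy as np


def mask_dry_cells(water):
    """Mask every cell without water (depth <= 0) instead of overwriting it with NaN."""
    water = np.asarray(water, dtype=float)
    return np.ma.masked_where(water <= 0, water)


# Tiny example grid: two dry cells on the diagonal, two wet cells.
masked = mask_dry_cells([[0.0, 1.5], [2.0, 0.0]])
```

The result can be passed to imshow in place of the NaN-filled array, for example ax3.imshow(mask_dry_cells(water_z), cmap="Blues", extent=extent), as long as water_z still holds the zero-filled values.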

Drawing very small shapes (size in µm) with python

I want to create a black-and-white structure of "L" shapes on a 20x20 mm figure. The width and length of each L shape are defined by uw, ul, lw and ll (see code). As per my understanding, matplotlib works with 72 points per inch (PPI), so with a linewidth of 1 a line will be 1/72 inch wide. I cannot work out how to make these figures big enough to be visible when I use plt.show() and also save them at the size I want (i.e. a 20x20 mm page, with each L at its exact size and a DPI high enough that I can see it when I open the saved figure). My code is:
import matplotlib.pyplot as plt
import numpy as np
uw = 20e-6 #upper width in meters
ul = 100e-6 #upper length in meters
lw = 20e-6 #lower width in meters
ll = 100e-6 #lower length in meters
w_space = 50e-6 #width spacing for subplots
h_space = 50e-6 #height spacing for subplots
N = 40
coord = [[0,0], [ll,0], [ll,lw], [uw,lw], [uw,ul], [0,ul]]
coord.append(coord[0]) #repeat the first point to create a 'closed loop'
xs, ys = zip(*coord) #create lists of x and y values
fig = plt.figure(num=None, figsize=(0.1, 0.1), dpi=100, facecolor='w', edgecolor='k') #figsize cannot be chosen below 0.1
for i in range(N):
    ax = fig.add_subplot(5,10,i+1)
    ax.fill(xs,ys,'k',linewidth=1)
    plt.axis('off')
plt.subplots_adjust(wspace = w_space, hspace = h_space)
plt.savefig('screenshots/L_shape.png' ,bbox_inches = 'tight', pad_inches = 0, dpi=10000)
plt.show()
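One possible approach, sketched here under stated assumptions rather than as a definitive answer, is to use a single Axes that fills the whole page and whose data units are millimetres, so that the physical size follows from figsize (in inches) and the saved resolution from dpi. The helper names, the 0.15 mm pitch (100 µm shape plus the 50 µm spacing from the question) and the choice of 4 pixels per 20 µm feature are all assumptions.

```python
MM_PER_INCH = 25.4


def mm_to_inch(length_mm):
    """Convert millimetres to inches, the unit Matplotlib uses for figsize."""
    return length_mm / MM_PER_INCH


def min_dpi(feature_um, pixels_per_feature=1):
    """Smallest DPI at which a feature of feature_um micrometres spans the given number of pixels."""
    return pixels_per_feature * MM_PER_INCH * 1000 / feature_um


def save_l_shapes(path, page_mm=20.0, pitch_mm=0.15, n_shapes=40, cols=10, dpi=5080):
    """Draw n_shapes L shapes in one Axes whose data units are millimetres and save the page."""
    import matplotlib.pyplot as plt  # imported here so the helpers above need no Matplotlib

    uw, ul, lw, ll = 0.02, 0.1, 0.02, 0.1  # the question's 20 um / 100 um sizes, in mm
    outline = [(0, 0), (ll, 0), (ll, lw), (uw, lw), (uw, ul), (0, ul)]
    side = mm_to_inch(page_mm)
    fig = plt.figure(figsize=(side, side))
    ax = fig.add_axes([0, 0, 1, 1])  # the Axes fills the whole page, no margins
    ax.set_xlim(0, page_mm)
    ax.set_ylim(0, page_mm)
    ax.axis("off")
    for k in range(n_shapes):
        dx, dy = (k % cols) * pitch_mm, (k // cols) * pitch_mm
        ax.fill([x + dx for x, _ in outline], [y + dy for _, y in outline],
                "k", linewidth=0)
    fig.savefig(path, dpi=dpi)
    plt.close(fig)


dpi_needed = min_dpi(20, pixels_per_feature=4)
```

With these numbers a 20 µm arm needs at least 1270 dpi to cover one pixel, and 5080 dpi to cover four, which gives a 4000 x 4000 pixel image for the 20 mm page. For on-screen inspection the shapes stay tiny at true scale, so zooming in with ax.set_xlim and ax.set_ylim on a larger figure is probably more practical than plt.show() at physical size.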

Plotting multiple density curves on the same plot: weighting the subset categories in Python 3

I am trying to recreate this density plot in python 3: math.stackexchange.com/questions/845424/the-expected-outcome-of-a-random-game-of-chess
End Goal: I need my density plot to look like this
The area under the blue curve is equal to that of the red, green, and purple curves combined because the different outcomes (Draw, Black wins, and White wins) are the subset of the total (All).
How do I get Python to account for this and plot the curves accordingly?
Here is the .csv file of results_df after 1000 simulations pastebin.com/YDVMx2DL
from matplotlib import pyplot as plt
import seaborn as sns
black = results_df.loc[results_df['outcome'] == 'Black']
white = results_df.loc[results_df['outcome'] == 'White']
draw = results_df.loc[results_df['outcome'] == 'Draw']
win = results_df.loc[results_df['outcome'] != 'Draw']
Total = len(results_df.index)
Wins = len(win.index)
PercentBlack = "Black Wins ≈ %s" %('{0:.2%}'.format(len(black.index)/Total))
PercentWhite = "White Wins ≈ %s" %('{0:.2%}'.format(len(white.index)/Total))
PercentDraw = "Draw ≈ %s" %('{0:.2%}'.format(len(draw.index)/Total))
AllTitle = 'Distribution of Moves by All Outcomes (nSample = %s)' %(workers)
sns.distplot(results_df.moves, hist=False, label = "All")
sns.distplot(black.moves, hist=False, label=PercentBlack)
sns.distplot(white.moves, hist=False, label=PercentWhite)
sns.distplot(draw.moves, hist=False, label=PercentDraw)
plt.title(AllTitle)
plt.ylabel('Density')
plt.xlabel('Number of Moves')
plt.legend()
plt.show()
The code above produces density curves without weights. I need to figure out how to weight the density curves accordingly while preserving my labels in the legend.
[Figure: density curves, no weights]
I also tried frequency histograms, which scale the distribution heights correctly, but I would rather keep the 4 curves overlaid on top of each other for a "cleaner" look. I don't like this frequency plot, but it is my current fix at the moment.
results_df.moves.hist(alpha=0.4, bins=range(0, 700, 10), label = "All")
draw.moves.hist(alpha=0.4, bins=range(0, 700, 10), label = PercentDraw)
white.moves.hist(alpha=0.4, bins=range(0, 700, 10), label = PercentWhite)
black.moves.hist(alpha=0.4, bins=range(0, 700, 10), label = PercentBlack)
plt.title(AllTitle)
plt.ylabel('Frequency')
plt.xlabel('Number of Moves')
plt.legend()
plt.show()
If anyone can write the python 3 code that outputs the first plot with 4 density curves with correct subset weights as well as preserves the custom legend that show percentages, that would be much appreciated.
Once the density curves are plotted with the correct subset weights, I am also interested in the python 3 code in finding the max point coordinates of each density curve that shows max frequency of moves once I scale it up to 500,000 iterations.
Thanks
You need to be careful. The plot that you have produced is correct. All the curves shown are probability density functions of the underlying distributions.
In the plot that you want to have, only the curve labeled "All" is a probability density function. The other curves are not.
In any case, you will need to calculate the kernel density estimate yourself if you want to scale it as shown in the desired plot. This can be done using scipy.stats.gaussian_kde().
In order to reproduce the desired plot, I see two options.
Calculate the kde for all involved cases and scale them with the number of samples.
import numpy as np; np.random.seed(0)
import matplotlib.pyplot as plt
import scipy.stats
a = np.random.gumbel(80, 25, 1000).astype(int)
b = np.random.gumbel(200, 46, 4000).astype(int)
kdea = scipy.stats.gaussian_kde(a)
kdeb = scipy.stats.gaussian_kde(b)
both = np.hstack((a,b))
kdeboth = scipy.stats.gaussian_kde(both)
grid = np.arange(500)
#weighted kde curves
wa = kdea(grid)*(len(a)/float(len(both)))
wb = kdeb(grid)*(len(b)/float(len(both)))
print("a.sum ", wa.sum())
print("b.sum ", wb.sum())
print("total.sum ", kdeboth(grid).sum())
fig, ax = plt.subplots()
ax.plot(grid, wa, lw=1, label = "weighted a")
ax.plot(grid, wb, lw=1, label = "weighted b")
ax.plot(grid, kdeboth(grid), color="crimson", lw=2, label = "pdf")
plt.legend()
plt.show()
Calculate the kde for all individual cases, normalize their sum to obtain the total.
import numpy as np; np.random.seed(0)
import matplotlib.pyplot as plt
import scipy.stats
a = np.random.gumbel(80, 25, 1000).astype(int)
b = np.random.gumbel(200, 46, 4000).astype(int)
kdea = scipy.stats.gaussian_kde(a)
kdeb = scipy.stats.gaussian_kde(b)
grid = np.arange(500)
#weighted kde curves
wa = kdea(grid)*(len(a)/float(len(a)+len(b)))
wb = kdeb(grid)*(len(b)/float(len(a)+len(b)))
total = wa+wb
fig, ax = plt.subplots(figsize=(5,3))
ax.plot(grid, wa, lw=1, label = "weighted a")
ax.plot(grid, wb, lw=1, label = "weighted b")
ax.plot(grid, total, color="crimson", lw=2, label = "pdf")
plt.legend()
plt.show()
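The question also asks for the coordinates of the maximum of each density curve. Given a curve that has already been evaluated on a grid, as in the code above, a minimal sketch is to take the grid position of the largest value. The helper name curve_peak is hypothetical, and the result is only as precise as the grid spacing.

```python
import numpy as np


def curve_peak(grid, density):
    """Return (x, y) of the highest point of a curve sampled on grid."""
    grid = np.asarray(grid)
    density = np.asarray(density)
    i = int(np.argmax(density))  # index of the first maximum
    return grid[i], density[i]


# Toy curve sampled at x = 0..4; the maximum value 3.0 sits at x = 2.
x_peak, y_peak = curve_peak(np.arange(5), [0.0, 1.0, 3.0, 2.0, 0.0])
```

With the arrays from the answer, curve_peak(grid, wa) and curve_peak(grid, wb) would give the peaks of the two weighted curves, and curve_peak(grid, total) the peak of their sum.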
