Working with Grid Cells in a GeoDataFrame - python-3.x

I have a GeoDataFrame of points for a given city, which I loaded with the osmnx package.
When I plot it, the x and y axes show longitude and latitude (see picture).
I want a grid of 100 x 100 meter cells instead of one based on longitude and latitude.
I also want to access these grid cells so that I can iterate over them and index them. For example, the top-left grid cell should have Cell_ID = "1", the next one "2", and so on.
What I have so far
Packages:
import pandas as pd
import geopandas as gpd
import numpy as np
from shapely.geometry import Point, Polygon, LineString
%matplotlib inline
import matplotlib.pyplot as plt
import shapely
import plotly_express as px
import networkx as nx
import osmnx as ox
ox.config(use_cache=True, log_console=True)
Create the Graph Function
def create_graph(loc, dist, transport_mode, loc_type="address"):
    """Transport mode = ‘walk’, ‘bike’, ‘drive’, ‘drive_service’, ‘all’, ‘all_private’, ‘none’"""
    if loc_type == "address":
        V = ox.graph_from_address(loc, dist=dist, network_type=transport_mode)
    elif loc_type == "points":
        V = ox.graph_from_point(loc, dist=dist, network_type=transport_mode)
    return V
Enter City:
V = create_graph("Enter a city here", 2500, "drive")
ox.plot_graph(V)
# Retrieve nodes and edges
nodes, edges = ox.graph_to_gdfs(V)
Put a Grid on it and plot it:
import cartopy.crs as ccrs  # required for PlateCarree; not in the import list above
pcproj = ccrs.PlateCarree()
fig = plt.figure(figsize=(12, 8))
extent =[16.01,16.10, 48.305, 48.345] #lonmin, lonmax, latmin, latmax
ax = plt.axes(projection= pcproj )
ax.set_extent(extent, crs=pcproj)
lon_grid = np.arange(16.0, 16.09, 0.01)
lat_grid = np.arange(48.310, 48.340, 0.005)
gl = ax.gridlines(draw_labels=True,
                  xlocs=lon_grid, ylocs=lat_grid,
                  x_inline=False, y_inline=False,
                  color='r', linestyle='dotted')
ax = nodes.plot(ax=ax, edgecolor='k', lw=0.9)
ax.set_title("Gridded Version : Some_City Points")
plt.show()
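A minimal sketch of the cell indexing described above, under assumptions that are not in the question: the points have already been reprojected to a metric CRS (with geopandas this could be nodes.to_crs(nodes.estimate_utm_crs())), and cells are numbered row by row starting at the top-left corner. The helper name assign_cell_ids is hypothetical.

```python
def assign_cell_ids(xs, ys, cell_size=100.0):
    """Return 1-based, row-major cell IDs for projected coordinates in metres.

    The grid origin is the top-left corner of the points' bounding box,
    so the top-left cell gets ID 1 and IDs increase left to right,
    then top to bottom.
    """
    min_x, max_x, max_y = min(xs), max(xs), max(ys)
    # Number of columns needed to cover the full x extent
    n_cols = int((max_x - min_x) // cell_size) + 1
    ids = []
    for x, y in zip(xs, ys):
        col = int((x - min_x) // cell_size)   # 0-based column from the left
        row = int((max_y - y) // cell_size)   # 0-based row from the top
        ids.append(row * n_cols + col + 1)
    return ids, n_cols


# Four points in metres: three along the top edge, one 250 m further down
ids, n_cols = assign_cell_ids([0.0, 150.0, 250.0, 0.0],
                              [300.0, 300.0, 300.0, 50.0])
print(ids)  # [1, 2, 3, 7]
```

With a projected GeoDataFrame, the inputs would be its geometry.x and geometry.y values, and the returned list could be stored as a Cell_ID column so that groupby("Cell_ID") iterates over the occupied cells.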

Related

creating a rainfall colormap for points inside a watershed polygon

I would really appreciate help with my code, since I am not a Python expert. I am trying to write code that can:
Read all the points (longitude, latitude, cumulative forecasted rainfall for 24, 48, and 72 hours) from a csv file (Mean_PCP_REPS_12_20220809_Gridded.csv).
Read the polygon representing the watershed boundary (NelsonRiverBasin.shp).
Mask/remove the points outside of the watershed polygon.
Create a rainfall colormap image or raster for the points inside the watershed polygon.
Color boundaries should be based on rainfall value. I defined the rainfall range for each color in my code.
I tried many approaches but could not create an image or raster with the desired colormap (the linked example shows the intended image). My Python code follows. It creates and saves "new_ras.tiff", but it cannot remap the colors of that image to the rainfall ranges afterwards.
from __future__ import division
import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt
from shapely.geometry import Point, Polygon, MultiPolygon
import operator
#extending the code
import os
from matplotlib.patches import Patch
from matplotlib.colors import ListedColormap
import matplotlib.colors as colors
import seaborn as sns
import numpy as np
import rioxarray as rxr
import earthpy as et
import earthpy.plot as ep
from scipy.interpolate import griddata #added code up to here
import rasterio
# load the data that should be cropped by the polygon
# this assumes that the csv file already includes
# a geometry column with point data as performed below
dat_gpd = pd.read_csv(r'Mean_PCP_REPS_12_20220809_Gridded.csv')
# make shapely points out of the X and Y coordinates
point_data = [Point(xy) for xy in zip(dat_gpd.iloc[:,0], dat_gpd.iloc[:,1])]
all_pts = list(zip(dat_gpd.iloc[:,0], dat_gpd.iloc[:,1]))
# assign shapely points as geometry to a geodataframe
# Like this you can also inspect the individual points if needed
arr_gpd = gpd.GeoDataFrame(dat_gpd, crs=4269, geometry=point_data)
# assign defined polygon to a new dataframe
nlpoly = gpd.read_file('NelsonRiverBasin.shp')
nlpoly = nlpoly.to_crs('epsg:4269')
mask = [nlpoly.contains(Point(p)).any() for p in all_pts]
# define a new dataframe from the spatial join of the dataframe with the data to be cropped
# and the dataframe with the polygon data, using the within function.
#dat_fin = gpd.sjoin(arr_gpd, nlpoly[['OCEAN_EN', 'COUNT', 'geometry']], predicate = 'within')
#dat_fin = dat_fin.to_crs('epsg:4326')
#dat_fin.plot(column= 'Hr72')
#plt.savefig('Raster2.tiff')
data = dat_gpd[['Long', 'Lat', 'Hr72']]
pts = list(zip(data.Long, data.Lat))
print (pts)
print(type(pts))
pts2 = [pts[i] for i in range(len(pts)) if mask[i]]
print(pts2)
print(type(pts2))
pts_val = data.Hr72.values
pts_val2 = [pts_val[i] for i in range(len(pts_val)) if mask[i]]
new_pts = [Point(xy) for xy in pts2]
print(type(pts_val2[1]))
pts3=[]
for tup, j in zip(pts2,range(len(pts_val2))):
    pts3.append(list(tup)+[pts_val2[j]])
print(type(pts3))
masked_pts = pd.DataFrame(pts3)
print(masked_pts)
masked_pts.columns = pd.Series(['Long', 'Lat', 'Hr72'])
new_arr_gpd = gpd.GeoDataFrame(masked_pts, crs = 4269, geometry = new_pts)
new_arr_gpd.plot(column = 'Hr72')
plt.savefig('new_ras.tiff')
rRes = 0.01
#xRange = np.arange(data.Long.min(), data.Long.max(), rRes)
#yRange = np.arange(data.Lat.min(), data.Lat.max(), rRes)
#print(xRange[:5],yRange[:5])
#gridX, gridY = np.meshgrid(xRange, yRange)
#grid_pcp = griddata(pts2, pts_val2, (gridX, gridY), method = 'linear')
#Extending the code
sns.set(font_scale = 1, style = "white")
lidar_chm = rxr.open_rasterio(r'new_ras.tiff', masked=True).squeeze()
# Define the colors you want
cmap = ListedColormap(["white", "lightskyblue","dodgerblue","mediumblue","lawngreen","limegreen", "forestgreen","darkgreen", "yellow", "orange","darkorange", "chocolate", "red", "maroon", "indianred","lightpink", "pink", "lightgray", "whitesmoke" ])
# Define a normalization from values -> colors
norm = colors.BoundaryNorm([0, 1, 5, 10, 15, 20, 25, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 150, 200, 250], 19)
fig, ax = plt.subplots(figsize=(9, 5))
chm_plot = ax.imshow(np.squeeze(r'new_ras.tiff'),cmap=cmap,norm=norm)
#print(chm_plot)
map_title = input ("Enter a title for this map (for ex. 72-hr accumulated forecast map):")
ax.set_title("Hydrologic Forecast Centre (MTI)\n" + map_title)
# Add a legend for labels
legend_labels = {"white": "0-1", "lightskyblue": "1-5","dodgerblue": "5-10","mediumblue": "10-15","lawngreen": "15-20","limegreen": "20-25", "forestgreen": "25-30","darkgreen": "30-40", "yellow": "40-50", "orange": "50-60","darkorange": "60-70", "chocolate": "70-80", "red": "80-90", "maroon": "90-100","indianred": "100-110", "lightpink": "110-120", "pink": "120-150", "lightgray": "150-200", "whitesmoke": "200-250"}
patches = [Patch(color=color, label=label) for color, label in legend_labels.items()]
ax.legend(handles=patches,bbox_to_anchor=(1.2, 1),facecolor="white")
ax.set_axis_off()
plt.show()
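To make the intended value-to-color mapping concrete, here is a small standard-library sketch of what the boundaries and color list above are meant to express: a rainfall value falls into the interval between two consecutive boundaries and takes that interval's color. The function name rainfall_color is hypothetical, and treating values outside 0-250 as "no color" is an assumption; this only illustrates the lookup and does not replace BoundaryNorm.

```python
from bisect import bisect_right

# Same boundaries and colors as in the question (20 boundaries, 19 colors)
BOUNDS = [0, 1, 5, 10, 15, 20, 25, 30, 40, 50, 60, 70, 80, 90,
          100, 110, 120, 150, 200, 250]
COLORS = ["white", "lightskyblue", "dodgerblue", "mediumblue", "lawngreen",
          "limegreen", "forestgreen", "darkgreen", "yellow", "orange",
          "darkorange", "chocolate", "red", "maroon", "indianred",
          "lightpink", "pink", "lightgray", "whitesmoke"]


def rainfall_color(value):
    """Return the color of the half-open interval [lower, upper) containing value."""
    if value < BOUNDS[0] or value >= BOUNDS[-1]:
        return None  # assumption: out-of-range values get no color
    # bisect_right counts the boundaries <= value; subtract 1 for the interval index
    return COLORS[bisect_right(BOUNDS, value) - 1]


print(rainfall_color(42))  # yellow (the 40-50 class)
```

Because the lookup works on rainfall values, it needs the Hr72 numbers themselves (for example on an interpolated grid), not the pixels of an already rendered image.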

Is there a library that will help me fit data easily? I found fitter and i will provide the code but it shows some errors

So, here is my code:
import pandas as pd
import scipy.stats as st
import matplotlib.pyplot as plt
from matplotlib.ticker import AutoMinorLocator
from fitter import Fitter, get_common_distributions
df = pd.read_csv("project3.csv")
bins = [282.33, 594.33, 906.33, 1218.33, 1530.33, 1842.33, 2154.33, 2466.33, 2778.33, 3090.33, 3402.33]
#declaring
facecolor = '#EAEAEA'
color_bars = '#3475D0'
txt_color1 = '#252525'
txt_color2 = '#004C74'
fig, ax = plt.subplots(1, figsize=(16, 6), facecolor=facecolor)
ax.set_facecolor(facecolor)
n, bins, patches = plt.hist(df.City1, color=color_bars, bins=10)
#grid
minor_locator = AutoMinorLocator(2)
plt.gca().xaxis.set_minor_locator(minor_locator)
plt.grid(which='minor', color=facecolor, lw = 0.5)
xticks = [(bins[idx+1] + value)/2 for idx, value in enumerate(bins[:-1])]
xticks_labels = [ "{:.0f}-{:.0f}".format(value, bins[idx+1]) for idx, value in enumerate(bins[:-1])]
plt.xticks(xticks, labels=xticks_labels, c=txt_color1, fontsize=13)
#beautify
ax.tick_params(axis='x', which='both',length=0)
plt.yticks([])
ax.spines['bottom'].set_visible(False)
ax.spines['left'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.spines['top'].set_visible(False)
for idx, value in enumerate(n):
    if value > 0:
        plt.text(xticks[idx], value+5, int(value), ha='center', fontsize=16, c=txt_color1)
plt.title('Histogram of rainfall in City1\n', loc = 'right', fontsize = 20, c=txt_color1)
plt.xlabel('\nCentimeters of rainfall', c=txt_color2, fontsize=14)
plt.ylabel('Frequency of occurrence', c=txt_color2, fontsize=14)
plt.tight_layout()
#plt.savefig('City1_Raw.png', facecolor=facecolor)
plt.show()
city1 = df['City1'].values
f = Fitter(city1, distributions=get_common_distributions())
f.fit()
fig = f.plot_pdf(names=None, Nbest=4, lw=1, method='sumsquare_error')
plt.show()
print(f.get_best(method = 'sumsquare_error'))
The issue is with the plots it shows. The first histogram it generates is
Next I get another graph with best fitted distributions which is
Then an output statement
{'chi2': {'df': 10.692966790090342, 'loc': 16.690849400411103, 'scale': 118.71595997157786}}
Process finished with exit code 0
I have a couple of questions. Why is chi2, the best-fitting distribution, not plotted on the graph?
How do I plot these distributions on top of the histogram rather than separately? The hist() function in the fitter library can do that, but it gives me no control over the bins, so I end up with about 100 bins and flat-looking data.
How do I solve this? I need to plot the best-fit curve on a histogram that looks like the first image. Can I use another module or package to do this in a similar way? This uses a least-squares fit, but maximum likelihood or log-likelihood would be fine too.
Simple way of plotting things on top of each other (using some properties of the Fitter class)
import scipy.stats as st
import matplotlib.pyplot as plt
from fitter import Fitter, get_common_distributions
from scipy import stats
numberofpoints=50000
df = stats.norm.rvs( loc=1090, scale=500, size=numberofpoints)
fig, ax = plt.subplots(1, figsize=(16, 6))
n, bins, patches = ax.hist( df, bins=30, density=True)
f = Fitter(df, distributions=get_common_distributions())
f.fit()
errorlist = sorted(
    [
        [f._fitted_errors[dist], dist]
        for dist in get_common_distributions()
    ]
)[:4]
for err, dist in errorlist:
    ax.plot( f.x, f.fitted_pdf[dist] )
plt.show()
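Because the histogram above is normalized (density=True), the fitted PDFs line up with it directly. To overlay them on a histogram of raw counts instead, each PDF has to be rescaled by the number of samples times the bin width.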
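A rough sketch of that rescaling, using only the standard library so it runs without fitter: a density value p at the center of a bin corresponds to an expected count of about p * N * bin_width. The helper names and the bin width of 100 are illustrative assumptions; the loc, scale and sample size mirror the example above.

```python
import math


def normal_pdf(x, loc, scale):
    """Density of a normal distribution with mean loc and standard deviation scale."""
    z = (x - loc) / scale
    return math.exp(-0.5 * z * z) / (scale * math.sqrt(2.0 * math.pi))


def pdf_to_expected_counts(pdf_values, n_samples, bin_width):
    """Rescale density values so they match a histogram of raw counts."""
    return [p * n_samples * bin_width for p in pdf_values]


n_samples = 50000          # same sample size as the example above
bin_width = 100.0          # assumed width of each histogram bin
xs = [90.0, 1090.0, 2090.0]
pdf = [normal_pdf(x, loc=1090.0, scale=500.0) for x in xs]
counts = pdf_to_expected_counts(pdf, n_samples, bin_width)
print(counts)
```

With Fitter, the same factor would presumably be applied to f.fitted_pdf[dist] before plotting it over a histogram drawn without density=True, taking bin_width as bins[1] - bins[0] for equally spaced bins.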

Matplotlib get_ylim() changing data transformation result

"get_ylim()" is changing the result of the transformation from data to display coordinates in matplotlib (I'm using version 3.2.1). Is it supposed to change axis properties? It's the same effect using "get_xlim()".
Here is my code:
import matplotlib.pyplot as plt
import numpy as np
dpi = 80
plt.rcParams.update({'font.size': 12})
fig, ax = plt.subplots(figsize=(1280/dpi, 720/dpi), dpi=dpi)
x = np.arange(200)
y = - 0.1 * x
ax.plot(x, y)
points = ax.transData.transform(np.vstack((x, y)).T).astype(int)
print(points[:5])
ax.get_ylim()
points = ax.transData.transform(np.vstack((x, y)).T).astype(int)
print(points[:5])
The two prints give different results, but only when the ax.get_ylim() call is in place.

How to plot Lon/Lat values at the border of a orthographic cartopy plot?

I use shapefile data for the outline of Antarctica in cartopy, and this works fine. I can generate a plot with the shapefile and some more information on it. But I'm not able to plot the longitude and latitude labels at the border of the image.
I use the orthographic projection with central_longitude and central_latitude.
I should also mention that I'm comparatively new to cartopy.
My code:
import numpy as np
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
from cartopy.io.shapereader import Reader
# 01
b01e01_lat = -73.86750000
b01e01_lon = -60.22694444
b01e02_lat = -73.89166667
b01e02_lon = -56.68500000
b01e03_lat = -74.87222222
b01e03_lon = -58.26805556
b01e04_lat = -74.85000000
b01e04_lon = -60.43083333
b01e05_lat = -73.86750001
b01e05_lon = -60.22694445
b01_lat = np.array([b01e01_lat,b01e02_lat,b01e03_lat,b01e04_lat, b01e01_lat, b01e05_lat])
b01_lon = np.array([b01e01_lon,b01e02_lon,b01e03_lon,b01e04_lon, b01e01_lon, b01e05_lon])
# 02
b02e01_lat = -73.94555556
b02e01_lon = -51.00055556
b02e02_lat = -74.22333333
b02e02_lon = -49.37000000
b02e03_lat = -74.87555556
b02e03_lon = -50.71888889
b02e04_lat = -74.87583333
b02e04_lon = -51.00055556
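b02e05_lat = -73.94555557
b02e05_lon = -51.00055557
# b02_lat and b02_lon are used in the plot call below but were never defined;
# they are built here by analogy with b01_lat and b01_lon
b02_lat = np.array([b02e01_lat,b02e02_lat,b02e03_lat,b02e04_lat, b02e01_lat, b02e05_lat])
b02_lon = np.array([b02e01_lon,b02e02_lon,b02e03_lon,b02e04_lon, b02e01_lon, b02e05_lon])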
fname='Coastline_Antarctica_v02.shp'
#ax = plt.axes(projection=ccrs.SouthPolarStereo())
plt.figure()
ax = plt.axes(projection=ccrs.Orthographic(central_longitude=-41,
                                           central_latitude=-71))
ax.set_extent([-85,-12,-75,-60], crs=ccrs.PlateCarree())
ax.add_geometries(Reader(fname).geometries(),
                  ccrs.Orthographic(central_longitude=-0, central_latitude=-90),
                  color='grey')
ax.gridlines()
plt.plot(b01_lon,b01_lat, color='r', transform=ccrs.PlateCarree())
plt.plot(b02_lon,b02_lat, color='r', transform=ccrs.PlateCarree())
plt.show()
With this I get the following plot (without the blue shapes):
Any help appreciated!
If you run your code to produce an interactive plot (using %matplotlib notebook in a Jupyter notebook), you can move the mouse cursor over the map to read off the locations where the labels should go.
With this method, I got approximate (lon, lat) locations for two sample labels. The code to plot them is as follows:
ax.text(-80.6, -57.0, '{0}\N{DEGREE SIGN} S '.format(57), va='center', ha='right',
        transform=ccrs.PlateCarree())
ax.text(-75.15, -56.0, '{0}\N{DEGREE SIGN} W '.format(75), va='bottom', ha='center',
        transform=ccrs.PlateCarree())
And the output plot will look like this:
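If more than a couple of labels are needed, the label strings can be generated rather than typed by hand. The sketch below uses two hypothetical helpers, format_lat and format_lon (not part of cartopy), which turn signed coordinates into text like that used in the ax.text calls above. The positions still have to be chosen manually as described.

```python
DEGREE = "\N{DEGREE SIGN}"


def format_lat(lat):
    """Format a signed latitude, e.g. -57 becomes '57° S'."""
    hemisphere = "N" if lat > 0 else "S" if lat < 0 else ""
    return f"{abs(lat):g}{DEGREE} {hemisphere}".rstrip()


def format_lon(lon):
    """Format a signed longitude, e.g. -75 becomes '75° W'."""
    hemisphere = "E" if lon > 0 else "W" if lon < 0 else ""
    return f"{abs(lon):g}{DEGREE} {hemisphere}".rstrip()


print(format_lat(-57))  # 57° S
print(format_lon(-75))  # 75° W
```

Each result can be passed as the text argument of ax.text, together with transform=ccrs.PlateCarree() as in the two calls above.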

Assign edge weights to a networkx graph using pandas dataframe

I am constructing a networkx graph in Python 3. I am using a pandas DataFrame to supply the edges and nodes to the graph. Here is what I have done:
test = pd.read_csv("/home/Desktop/test_call1", delimiter = ';')
g_test = nx.from_pandas_edgelist(test, 'number', 'contactNumber', edge_attr='callDuration')
What I want is for the "callDuration" column of the DataFrame to act as the edge weight in the networkx graph, with the thickness of the edges changing accordingly.
I also want to get the 'n' maximum weighted edges.
Let's try:
import pandas as pd
import numpy as np
import networkx as nx
import matplotlib.pyplot as plt
df = pd.DataFrame({'number':['123','234','345'],'contactnumber':['234','345','123'],'callduration':[1,2,4]})
df
G = nx.from_pandas_edgelist(df,'number','contactnumber', edge_attr='callduration')
durations = [i['callduration'] for i in dict(G.edges).values()]
labels = [i for i in dict(G.nodes).keys()]
labels = {i:i for i in dict(G.nodes).keys()}
fig, ax = plt.subplots(figsize=(12,5))
pos = nx.spring_layout(G)
nx.draw_networkx_nodes(G, pos, ax=ax)  # node labels are drawn separately below
nx.draw_networkx_edges(G, pos, width=durations, ax=ax)
_ = nx.draw_networkx_labels(G, pos, labels, ax=ax)
Output:
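I do not fully agree with the answer above. Metrics that take edge weights into account, such as PageRank or betweenness centrality, read the weight from the edge attribute named by their weight parameter, so a value stored under another attribute name such as callduration is not taken into account unless you pass that name explicitly.
Either pass weight='callduration' to those functions, or store the value under the attribute name weight when building the graph, for example with
G.add_weighted_edges_from([(source, target, weight), ...])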
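For the 'n' maximum weighted edges asked about in the question, one possible approach is to treat the edges as (source, target, weight) tuples and pick the largest weights with heapq. This is a sketch under that assumption. With networkx, such tuples can be obtained from G.edges(data='callduration'). The helper name top_n_edges is hypothetical.

```python
import heapq


def top_n_edges(weighted_edges, n):
    """Return the n edges with the largest weight from (u, v, weight) tuples."""
    return heapq.nlargest(n, weighted_edges, key=lambda edge: edge[2])


# Same sample data as the DataFrame in the answer above
edges = [("123", "234", 1), ("234", "345", 2), ("345", "123", 4)]
print(top_n_edges(edges, 2))  # [('345', '123', 4), ('234', '345', 2)]
```

For small graphs, sorted(edges, key=..., reverse=True)[:n] gives the same result. heapq.nlargest avoids sorting the whole edge list when n is small.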
