OSMNx : get coordinates of nodes/corners/edges of polygons/buildings - python-3.x

I am trying to retrieve the coordinates of all nodes/corners/edges of each commercial building in a list. E.g. for the supermarket Aldi in Macclesfield (UK), the OpenStreetMap UI shows 10 nodes (all the corners of the supermarket), but I can only retrieve 2 of those 10 nodes from osmnx. I need access to the complete list of nodes, but the results are truncated to only 2 of the 10 nodes in this case. I am using the code below:
import osmnx as ox
test = ox.geocode_to_gdf('aldi, Macclesfield, Cheshire, GB')
ax = ox.project_gdf(test).plot()
test.geometry
or
gdf = ox.geometries_from_place('Grosvenor, Macclesfield, Cheshire, GB', tags)
gdf.geometry
Both return just two coordinates and truncate the other info/results that are available in the OpenStreetMap UI (you can see it in the first column of the attached image: geometry > POLYGON > only two coordinates, with the other results truncated...). I would appreciate some help on this, thanks in advance.

It's hard to guess what you're doing here because you didn't provide a reproducible example (e.g., tags is undefined). But I'll try to guess what you're going for.
I am trying to retrieve the coordinates of all nodes/corners/edges of commercial buildings
Here I retrieve all the tagged commercial building footprints in Macclesfield, then extract the first one's polygon coordinates. You could instead filter these by other attribute values as you see fit if you only want certain kinds of buildings. Proper usage of OSMnx's geometries module is described in the documentation.
import osmnx as ox
# get the building footprints in Macclesfield
place = 'Macclesfield, Cheshire, England, UK'
tags = {'building': 'commercial'}
gdf = ox.geometries_from_place(place, tags)
# how many did we get?
print(gdf.shape) # (57, 10)
# extract the coordinates for the first building's footprint
gdf.iloc[0]['geometry'].exterior.coords
Alternatively, if you want a specific building's footprint, you can look up its OSM ID and tell OSMnx to geocode that value:
gdf = ox.geocode_to_gdf('W251154408', by_osmid=True)
polygon = gdf.iloc[0]['geometry']
polygon.exterior.coords

gdf = ox.geocode_to_gdf('W352332709', by_osmid=True)
polygon = gdf.iloc[0]['geometry']
polygon.exterior.coords
list(polygon.exterior.coords)

Related

OSMNX: get external coordinates of a building giving a coordinate where it is located

hoping someone can help me
I am trying to retrieve the exterior coordinates of the nearest building given a coordinate (geolocation).
I can get all the exterior coordinates of a building given an address (code below), but I need to retrieve the same information given a coordinate instead.
For example, I would need to get the external coordinates of the building located at this point with lat/long: 53.2588051, -2.124499.
import osmnx as ox
tesco = ox.geocode_to_gdf('Tesco, Exchange Street, SK11 6UZ, Macclesfield, Cheshire, GB')
polygon = tesco.iloc[0]['geometry']
polygon.exterior.coords
list(polygon.exterior.coords)
I tried using the method "ox.pois_from_point" but I get the error: AttributeError: module 'osmnx' has no attribute 'pois_from_point'
Many thanks in advance!
import osmnx as ox
tags = {'building': True} # would return all building footprints in the area
center_point = (53.2588051, -2.124499)
a = ox.geometries.geometries_from_point(center_point, tags, dist=20)
polygon = a.iloc[0]['geometry']
polygon.exterior.coords
list(polygon.exterior.coords)

Expand netcdf to the whole globe with xarray

I have a dataset that looks like this:
As you can see, it only covers Latitudes between -55.75 and 83.25. I would like to expand that dataset so that it covers the whole globe (-89.75 to 89.75 in my case) and fill it with an arbitrary NA value.
Ideally I would want to do this with xarray. I have looked at .pad(), .expand_dims() and .assign_coords(), but did not really get a handle on how either of those works.
If someone can provide an alternative solution with cdo, I would also be grateful for that.
You could do this with nctoolkit (https://nctoolkit.readthedocs.io/en/latest/), which uses CDO as a backend.
The example below shows how you could do it. It starts by cropping a global temperature dataset to latitudes between -50 and 50. You would then need to regrid it to a global grid at whatever resolution you need. This uses CDO, which will extrapolate at the edges, so you probably want to set everything outside the original dataset's extent to NA; my code does this by calling masklonlatbox from CDO.
import nctoolkit as nc
ds = nc.open_thredds("https://psl.noaa.gov/thredds/dodsC/Datasets/COBE2/sst.mon.ltm.1981-2010.nc")
ds.subset(time = 0)
ds.crop(lat = [-50, 50])
ds.to_latlon(lon = [-179.5, 179.5], lat = [-89.5, 89.5], res = 1)
ds.mask_box(lon = [-179.5, 179.5], lat = [-50, 50])
ds.plot()
# convert to xarray dataset
ds_xr = ds.to_xarray()
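Since the question asks for xarray, here is a minimal sketch of one possible pure-xarray route using reindex. It uses a synthetic dataset; the variable name field, the 0.5 degree spacing and the increasing latitude order are assumptions for illustration, not details taken from the original data.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the original data: latitudes from -55.75 to 83.25 only.
# The 0.5 degree spacing and the variable name "field" are assumptions.
lat = np.arange(-55.75, 83.5, 0.5)
lon = np.arange(-179.75, 180, 0.5)
ds = xr.Dataset(
    {"field": (("lat", "lon"), np.ones((lat.size, lon.size)))},
    coords={"lat": lat, "lon": lon},
)

# Target latitudes covering the whole globe on the same grid spacing.
full_lat = np.arange(-89.75, 90, 0.5)

# reindex keeps values where the labels match and fills new rows with NaN.
ds_global = ds.reindex(lat=full_lat)

print(ds_global.sizes["lat"])
```

If a specific missing value is preferred over NaN, reindex also accepts a fill_value argument. If the real latitude labels do not sit exactly on the target grid because of floating point rounding, passing method="nearest" together with a small tolerance may be needed.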

Geospatial fixed radius cluster hunting in python

I want to take an input of millions of lat long points (with a numerical attribute) and then find all fixed radius geospatial clusters where the sum of the attribute within the circle is above a defined threshold.
I started by using sklearn BallTree to sum the attribute within any defined circle, with the intention of then expanding this out to run across a grid or lattice of circles. The run time for one circle is around 0.01s, so this is fine for small lattices, but won't scale if I want to run 200m radius circles across the whole of the UK.
import numpy as np
import pandas
from sklearn.neighbors import BallTree
#example data (use 2m rows from postcode centroid file)
df = pandas.read_csv('National_Statistics_Postcode_Lookup_Latest_Centroids.csv', usecols=[0,1], nrows=2000000)
#this will be our grid of points (or lattice) use points from same file for example
df2 = pandas.read_csv('National_Statistics_Postcode_Lookup_Latest_Centroids.csv', usecols=[0,1], nrows=2000)
#reorder lat long columns for balltree input
columnTitles=["Y","X"]
df = df.reindex(columns=columnTitles)
df2 = df2.reindex(columns=columnTitles)
# assign new columns to existing dataframe. attribute will hold the data we want to sum over (set to 1 for now)
df['attribute'] = 1
df2['aggregation'] = 0
RADIANT_TO_KM_CONSTANT = 6367
class BallTreeIndex:
    def __init__(self, lat_longs):
        # haversine expects (lat, long) pairs in radians
        self.lat_longs = np.radians(lat_longs)
        self.ball_tree_index = BallTree(self.lat_longs, metric='haversine')
    def query_radius(self, query, radius):
        # radius is given in metres; convert it to radians on the sphere
        radius_km = radius/1000
        radius_radiant = radius_km / RADIANT_TO_KM_CONSTANT
        query = np.radians(np.array([query]))
        indices = self.ball_tree_index.query_radius(query, r=radius_radiant)
        return indices[0]
#index the base data
a = BallTreeIndex(df.iloc[:,0:2])
#begin to loop over the lattice to test performance
for i in range(0,100):
    b = df2.iloc[i,0:2]
    output = a.query_radius(b, 200)
    accumulation = sum(df.iloc[output, 2])
    df2.iloc[i,2] = accumulation
It feels as if the above code is really inefficient as I don't need to run the calculation across all circles on my lattice (as most will be well below my threshold - or will have no data points in at all).
Instead of this for loop, is there a better way of scaling this algorithm to give me the most dense circles?
I'm new to python, so any help would be massively appreciated!!
First don't try to do this on a sphere! GB is small and we have a well defined geographic projection that will work. So use the oseast1m and osnorth1m columns as X and Y. They are in metres so no need to convert (roughly) to degrees and use Haversine. That should help.
Next add a spatial index to speed up lookups.
If you need more speed there are various tricks like loading a 2R strip across the country into memory and then running your circles across that strip, then moving down a grid step and updating that strip (checking Y values against a fixed value is quick, especially if you store the data sorted on Y then X value). If you need more speed then look at any of the papers that Stan Openshaw (and sometimes I) wrote about parallelising the GAM. There are examples of implementing GAM in python (e.g. this paper, this paper) that may also point to better ways.
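As a rough illustration of the first two suggestions (work in projected metres and use a spatial index), here is a minimal sketch that uses a plain uniform grid as the index. It is not the GAM itself. The function names, the (x, y, value) tuple layout and the choice to centre candidate circles on occupied grid cells are all assumptions made for this example.

```python
import math
from collections import defaultdict

def build_grid(points, cell):
    """Bucket (x, y, value) points into square cells of side `cell` metres."""
    grid = defaultdict(list)
    for x, y, value in points:
        grid[(int(x // cell), int(y // cell))].append((x, y, value))
    return grid

def sum_within(grid, cell, cx, cy, radius):
    """Sum the values of all points within `radius` metres of (cx, cy)."""
    reach = math.ceil(radius / cell)
    gx, gy = int(cx // cell), int(cy // cell)
    r2 = radius * radius
    total = 0.0
    # Only the cells that can overlap the circle are inspected.
    for ix in range(gx - reach, gx + reach + 1):
        for iy in range(gy - reach, gy + reach + 1):
            for x, y, value in grid.get((ix, iy), ()):
                if (x - cx) ** 2 + (y - cy) ** 2 <= r2:
                    total += value
    return total

def dense_circles(points, radius, threshold):
    """Return (cx, cy, total) for circles whose total reaches `threshold`."""
    grid = build_grid(points, radius)
    hits = []
    # Candidate circles are centred on occupied cells only, so empty
    # parts of the country are never visited.
    for ix, iy in grid:
        cx, cy = (ix + 0.5) * radius, (iy + 0.5) * radius
        total = sum_within(grid, radius, cx, cy, radius)
        if total >= threshold:
            hits.append((cx, cy, total))
    return hits

# Hypothetical easting/northing points in metres, each with attribute 1.
points = [(100, 100, 1), (150, 120, 1), (180, 90, 1), (5000, 5000, 1)]
hits = dense_circles(points, radius=200, threshold=3)
print(hits)
```

Centring circles on cell midpoints can miss a cluster that straddles a cell boundary, so in practice a finer lattice step (for example half the radius) would probably be wanted. The grid lookup stays the same in that case.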

How to change the limits for geo_shape in altair (python vega-lite)

I am trying to plot locations in three states in the US in python with Altair. I saw the tutorial about the US map, but I am wondering if there is any way to zoom the image to only the three states of interest, i.e. NY, NJ and CT.
Currently, I have the following code:
import altair as alt
from vega_datasets import data
states = alt.topo_feature(data.us_10m.url, 'states')
# US states background
background = alt.Chart(states).mark_geoshape(
fill='lightgray',
stroke='white',
limit=1000
).properties(
title='US State Capitols',
width=700,
height=400
).project("albers")
points=alt.Chart(accts).mark_point().encode(
longitude = "longitude",
latitude = "latitude",
color = "Group")
background+points
I inspected the us_10m.url data set and it seems there is no field that names the individual states. So I am hoping I could just somehow change the xlim and ylim for the background to [-80,-70] and [35,45], for example. I want to zoom in to the regions where there are data points (blue dots).
Could someone kindly show me how to do that? Thanks!!
Update
There is a field called ID in the JSON file and I manually found out that NJ is 34, NY is 36 and CT is 9. Is there a way to filter on these IDs? That will get the job done!
Alright seems like the selection/zoom/xlim/ylim feature for geotype is not supported yet:
Document and add warning that geo-position doesn't support selection yet #3305
So I ended up with a hackish way to solve this problem by first filtering based on the IDs using pure python. Basically, load the JSON file into a dictionary and then change the value field before converting the dictionary to topojson format. Below is an example for 6 states: PA, NJ, NY, CT, RI and MA.
import altair as alt
from vega_datasets import data
# Load the data, which is loaded as a dict object
us_10m = data.us_10m()
# Select the geometries under states under objects, filter on id (9,25,34,36,42,44)
us_10m['objects']['states']['geometries']=[item for item in us_10m['objects'] \
['states']['geometries'] if item['id'] in [9,25,34,36,42,44]]
# Make the topojson data
states = alt.Data(
values=us_10m,
format=alt.TopoDataFormat(feature='states',type='topojson'))
# Plot background (now only has 6 states)
background = alt.Chart(states).mark_geoshape(
fill='lightgray',
stroke='white',
limit=1000
).properties(
title='US State Capitols',
width=700,
height=400
).project("mercator")
# Plot the points
points=alt.Chart(accts).mark_circle(size=60).encode(
longitude = "longitude",
latitude = "latitude",
color = "Group").project("mercator")
# Overlay the two plots
background+points
The resulting plot looks ok:

Creating a map with basemap, filling countries

I'm currently working on my final project for my Coding class (my first coding class, so kind of an amateur).
My idea is for the code to search every newspaper in the world for a specific word within the titles (using bs4) and then obtain a dictionary with the average mentions by country, taking into account the number of newspapers in each country. Afterwards, and this is the part where I'm stuck, I want to put this on a map.
The whole program is already working properly, until the part where I have a CSV with the following form:
'Country','Average'
'Afghanistan',10
'Albania',5
'Algeria',0
'Andorra',2
'Antigua and Barbuda',7
'Argentina',0
'Armenia',4
Now, I want to create a world map where the higher the number, the redder (or any other color) the whole polygon of the country. So far I've found many codes that work well placing points in space, but I haven't found one that "appends" the CSV data presented above and then fills each country accordingly. Below is the part of the code that currently creates the world map:
# Now we proceed with the creation of the map
fig, ax = plt.subplots(figsize=(15,10)) # We define the size of the map
m = Basemap(resolution='c', # c, l, i, h, f or None
projection='merc', # Mercator projection
lat_0=24.20, lon_0=-6.67, # The center of the map, so that the whole world is shown without splitting Asia
llcrnrlon=-180, llcrnrlat= -85,urcrnrlon=180, urcrnrlat=85) # The coordinates of the whole world
m.drawmapboundary(fill_color='#46bcec') # We choose a color for the boundary of the map
m.fillcontinents(color='#f2f2f2',lake_color='#46bcec') # We choose a color for the land and one for the lakes
m.drawcoastlines() # We choose to draw the lines of the map
m.readshapefile('Final project\\vincent_map_data-master\\ne_110m_admin_0_countries\\ne_110m_admin_0_countries', 'areas') # We import the shape file of the whole world
df_poly = pd.DataFrame({ # We define the polygon structure
'shapes': [Polygon(np.array(shape), True) for shape in m.areas],
'area': [area['name'] for area in m.areas_info]
})
cmap = plt.get_cmap('Oranges')
pc = PatchCollection(df_poly.shapes, zorder=2)
norm = Normalize()
mapper = matplotlib.cm.ScalarMappable(norm=norm, cmap=cmap)
# We show the map
plt.show(m)
I opened the shapefile of the countries and the way to identify the countries is with the variable "sovereignty". There might be some nonsensical things within my code, since I've extracted things from many places. Sorry about that.
If someone could help me out, I would really appreciate it.
Thanks
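One possible way to approach the missing step is to load the CSV into a dictionary keyed by country name and scale the averages to the range 0 to 1, so that each polygon can look up a colour intensity by name. The sketch below uses only the standard library. The sample rows are copied from the question, while the helper names, the min-max scaling and the shape_names list (including the deliberately unmatched "Atlantis") are assumptions for illustration.

```python
import csv
import io

# A few rows copied from the question's CSV; strings are single-quoted.
CSV_TEXT = """'Country','Average'
'Afghanistan',10
'Albania',5
'Algeria',0
'Andorra',2
"""

def load_averages(text):
    """Parse the CSV text into {country: average}."""
    reader = csv.DictReader(io.StringIO(text), quotechar="'")
    return {row["Country"]: float(row["Average"]) for row in reader}

def normalise(values):
    """Scale values to [0, 1] with min-max scaling."""
    lo, hi = min(values.values()), max(values.values())
    span = (hi - lo) or 1.0  # avoid dividing by zero if all values are equal
    return {name: (v - lo) / span for name, v in values.items()}

averages = load_averages(CSV_TEXT)
intensity = normalise(averages)

# Hypothetical list of shape names in the order the polygons were read.
shape_names = ["Albania", "Afghanistan", "Atlantis"]

# Countries missing from the CSV map to None so they can get a neutral colour.
shape_intensity = [intensity.get(name) for name in shape_names]
print(shape_intensity)
```

With matplotlib, each value in shape_intensity could then be turned into a colour by calling the colormap on it (for example cmap(value)) and the resulting list passed to the PatchCollection's set_facecolor. This assumes the names in the shapefile match the names in the CSV, which would need checking.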
