OSMnx: Divide a cache ox.graph into equal squares without redownloading each - python-3.x

I am trying to divide a city into n squares.
Right now, I'm calculating the coordinates for all square centres and using the ox.graph_from_point function to extract the OSM data for each of them.
However, this is getting quite long at high n due to the API pausing times.
My question:
Is there a way to download all city data from OSM, and then divide the cache file into squares (using ox.graph_from_point or other) without making a request for each?
Thanks

Using OSMnx directly - no, there isn't. You would have to script your own solution using the existing tools OSMnx provides.

Related

Finding powerlines in LIDAR point clouds with RANSAC

I'm trying to find powerlines in LIDAR points clouds with skimage.measures ransac() function. This is my very first time meddling with these modules in python so bear with me.
So far all I knew how to do reliably was filtering low or 'ground' points from the cloud to reduce the number of points to deal with.
def filter_Z(las, threshold):
filtered = laspy.create(point_format = las.header.point_format, file_version = las.header.version)
filtered.points = las.points[las.Z > las.Z.min() + threshold]
print(f'original size: {len(las.points)}')
print(f'filtered size: {len(filtered.points)}')
filtered.write('filtered_points2.las')
return filtered
The threshold is something I put in by hand since in the las files I worked with are some nasty outliers that prevent me from dynamically calculating it.
The filtered point cloud, or one of them atleast looks like this:
Note the evil red outliers on top, maybe they're birds or something. Along with them are trees and roofs of buildings. If anyone wants to take a look at the .las files, let me know. I can't put a wetransfer link in the body of the question.
A top down view:
I've looked into it as much as I could, and found the skimage.measure module and the ransac function that comes with it. I played around a bit to get a feel for it and currently I'm stumped on how to continue.
def ransac_linefit_sklearn(points):
model_robust, inliers = ransac(points, LineModelND, min_samples=2, residual_threshold=1000, max_trials=1000)
return model_robust, inliers
The result is quite predictable (I ran ransac on a 2D view of the cloud just to make it a bit easier on the pc)
Using this doesn't really yield any good results in examples like the one I posted. The vegetation clusters have too many points and the line is fitted through it because it has the highest point density.
I tried DBSCAN() to cluster up the points but it didn't work. I also attempted OPTICS() but as I write it still hasn't finished running.
From what I've read on various articles, the best course of action would be to cluster up the points and perform RANSAC on each individual cluster to find lines, but I'm not really sure on how to do that or what clustering method to use in situations like these.
One thing I'm also curious about doing is just filtering out the big blobs of trees that mess with model fititng.
Inadequacy of RANSAC
RANSAC works best whenever your data fits a mono-modal distribution around your model. In the case of this point cloud, it works best whenever there is only one line with outliers, but there are at least 5 lines when viewed birds-eye. Check out this older SO post that discusses your problem. Francesco's response suggests an iterative RANSAC based approach.
Octrees and SVD
Colleagues worked on a similar problem in my previous job. I am not fluent in the approach, but I know enough to provide some hints.
Their approach resembled Francesco's suggestion. They partitioned the point-cloud into octrees and calculated the singular value decomposition (SVD) within each partition. The three resulting singular values will correspond to the geometric distribution of the data.
If the first singular value is significantly greater than the other two, then the points are line-like.
If the first and second singular values are significantly greater than the other, then the points are plane-like
If all three values are of similar magnitude, then the data is just a "glob" of points.
They used these rules iteratively to rule out which points were most likely NOT part of the lines.
Literature
If you want to look into published methods, maybe this paper is a good starting point. Power lines are modeled as hyperbolic functions.

Merging satellite images and retaining coordinates

Thanks for dropping in here.
I'm currently working on a project, and I'm not that strong with python yet. So I was hoping for some constructive feedback on this question.
I have a dataset containing core samples, all stored with sample id, latitude, longitude, content and other data irrelevant for this question.
Now I've imported this dataset and sliced it as I want it to be. For the images I'm using the rasterio module to open 2 satellite images that covers the region. I'm using the utm module to convert back and forth between latlong->UTM->Pixel values (Which also seems to be throwing me strange coordinates at some points).
Annoyingly enough, the two Sentinel-2 images are cut right across the center of the map.
As I'm doing bounding boxes on top of where the samples are taken, this is a problem as I need to extract 10x10 pixel cut outs of that region. This leads to a lot of the samples not getting a proper cut out.
So I thought why not merge the two images together into one large rectangular bit. But I still need to retain the meta data with the UTM coordinates.
How would you suggest I proceed. Can it be done in an easy way? Is there another angle on this I've overlooked?
Thank you for your time.
I'm not sure I completely understand the question, but if you are simply trying to merge 2 images, have you looked at the command line tool gdal_merge.py?
A very simple example:
gdal_merge.py -o merged_image.tif image1.tif image2.tif

Updatable nearest neighbor search

I'm trying to come up with a good design for a nearest neighbor search application. This would be somewhat similar to this question:
Saving and incrementally updating nearest-neighbor model in R
In my case this would be in Python but the main point being the part that when new data comes, the model / index must be updated. I'm currently playing around with scikit-learn neighbors module but I'm not convinced it's a good fit.
The goal of the application:
User comes in with a query and then the n (probably will be fixed to 5) nearest neighbors in the existing data set will be shown. For this step such a search structure from sklearn would help but that would have to be regenerated when adding new records.Also this is a first ste that happens 1 per query and hence could be somewhat "slow" as in 2-3 seconds compared to "instantly".
Then the user can click on one of the records and see that records nearest neighbors and so forth. This means we are now within the exiting dataset and the NNs could be precomputed and stored in redis (for now 200k records but can be expanded to 10th or 100th of millions). This should be very fast to browse around.
But here I would face the same problem of how to update the precomputed data without having to do a full recomputation of the distance matrix especially since there will be very few new records (like 100 per week).
Does such a tool, method or algorithm exist for updatable NN searching?
EDIT April, 3rd:
As is indicated in many places KDTree or BallTree isn't really suited for high-dimensional data. I've realized that for a Proof-of-concept with a small data set of 200k records and 512 dimensions, brute force isn't much slower at all, roughly 550ms vs 750ms.
However for large data set in millions+, the question remains unsolved. I've looked at datasketch LSH Forest but it seems in my case this simply is not accurate enough or I'm using it wrong. Will ask a separate question regarding this.
You should look into FAISS and its IVFPQ method
What you can do there is create multiple indexes for every update and merge them with the old one
You could try out Milvus that supports adding and near real-time search of vectors.
Here are the benchmarks of Milvus.
nmslib supports adding new vectors. It's used by OpenSearch as part their Similarity Search Engine, and it's very fast.
One caveat:
While the HNSW algorithm allows incremental addition of points, it forbids deletion and modification of indexed points.
You can also look into solutions like Milvus or Vearch.

How to process KML/GeoJSON in Nodejs?

I ran out of google searches and reaching out for help here. We are right now processing KML file using geoXml3 at client side. But Ideally I would want to pre-process it in server side and send the ploygons on the client side. Because KML file is 18MB file and it takes forever to download on client side and then client parses it and draws the polygon on google map.
We changed KML files to GeoJSON and reduced the size , compressed it - after all the circus the response time is still not good. I just want to know if there is a way / library in node that can do this.
When you say that you are compressing the file, what do you mean? If you mean an algorithm such as zip or lha, that won't necessarily reduce the size of the file that much. What you want to do is to remove line segments from the KML file. In reducing some geographic information, I found that there were lengths of many miles that varied less than a foot from a straight line. Since the data points were spaced every few feet, that meant that the vast majority of the points in the KML file could be removed without making an appreciable change in the appearance of the geometry. Looking for straight line segments is relatively simple.
You should also keep in mind the scale of the map you are viewing and the spacing of the data points in the KML file. Even if the lines are complex curves, it may be possible to remove large numbers of points by trying to curve fit segments of the features and reduce the size of the data in this way.
You seem to be implying that downloading the data from the server to the client takes far more time than processing the data on the server. If this is correct, reducing the number of points is the most efficient method.
There are a number of tricks you can try to reduce the download & rendering time of a KML polygon file.
As already suggested in previous answers the key is to reduce the size of your data. This can be accomplished in many ways, depending on your use case:
Reduce the number of points that each line segment / polygon boundary is made up of. There are plenty of algorithms for this, the Douglas-Peucker line simplification algorithm is the most well known.
Reduce the precision of your data points. If your coordinates are stored to a high degree of precision (i.e. latitude / longitudes with several decimal places) then you can round these up to a lower degree of precision / fewer decimal places. Note that you might have to play with this as it will make the boundary of your polygons choppy / jagged if you go too far.
Compression. It looks as though you have already experimented with this. Gzip compression should be able to reduce the wire payload size of KML significantly.
Finally, if you are still not getting the results you want you could consider generalizing your data further by removing small / insignificant polygons. Again, this will depend on your use case

Obstacle avoidance using 2 fixed cameras on a robot

I will be start working on a robotics project which involves a mobile robot that has mounted 2 cameras (1.3 MP) fixed at a distance of 0.5m in between.I also have a few ultrasonic sensors, but they have only a 10 metter range and my enviroment is rather large (as an example, take a large warehouse with many pillars, boxes, walls .etc) .My main task is to identify obstacles and also find a roughly "best" route that the robot must take in order to navigate in a "rough" enviroment (the ground floor is not smooth at all). All the image processing is not made on the robot, but on a computer with NVIDIA GT425 2Gb Ram.
My questions are :
Should I mount the cameras on a rotative suport, so that they take pictures on a wider angle?
It is posible creating a reasonable 3D reconstruction based on only 2 views at such a small distance in between? If so, to what degree I can use this for obstacle avoidance and a best route construction?
If a roughly accurate 3D representation of the enviroment can be made, how can it be used as creating a map of the enviroment? (Consider the following example: the robot must sweep an fairly large area and it would be energy efficient if it would not go through the same place (or course) twice;however when a 3D reconstruction is made from one direction, how can it tell if it has already been there if it comes from the opposite direction )
I have found this response on a similar question , but I am still concerned with the accuracy of 3D reconstruction (for example a couple of boxes situated at 100m considering the small resolution and distance between the cameras).
I am just starting gathering information for this project, so if you haved worked on something similar please give me some guidelines (and some links:D) on how should I approach this specific task.
Thanks in advance,
Tamash
If you want to do obstacle avoidance, it is probably easiest to use the ultrasonic sensors. If the robot is moving at speeds suitable for a human environment then their range of 10m gives you ample time to stop the robot. Keep in mind that no system will guarantee that you don't accidentally hit something.
(2) It is posible creating a reasonable 3D reconstruction based on only 2 views at such a small distance in between? If so, to what degree I can use this for obstacle avoidance and a best route construction?
Yes, this is possible. Have a look at ROS and their vSLAM. http://www.ros.org/wiki/vslam and http://www.ros.org/wiki/slam_gmapping would be two of many possible resources.
however when a 3D reconstruction is made from one direction, how can it tell if it has already been there if it comes from the opposite direction
Well, you are trying to find your position given a measurement and a map. That should be possible, and it wouldn't matter from which direction the map was created. However, there is the loop closure problem. Because you are creating a 3D map at the same time as you are trying to find your way around, you don't know whether you are at a new place or at a place you have seen before.
CONCLUSION
This is a difficult task!
Actually, it's more than one. First you have simple obstacle avoidance (i.e. Don't drive into things.). Then you want to do simultaneous localisation and mapping (SLAM, read Wikipedia on that) and finally you want to do path planning (i.e. sweeping the floor without covering area twice).
I hope that helps?
I'd say no if you mean each eye rotating independently. You won't get the accuracy you need to do the stereo correspondence and make calibration a nightmare. But if you want the whole "head" of the robot to pivot, then that may be doable. But you should have some good encoders on the joints.
If you use ROS, there are some tools which help you turn the two stereo images into a 3d point cloud. http://www.ros.org/wiki/stereo_image_proc. There is a tradeoff between your baseline (the distance between the cameras) and your resolution at different ranges. large baseline = greater resolution at large distances, but it also has a large minimum distance. I don't think i would expect more than a few centimeters of accuracy from a static stereo rig. and this accuracy only gets worse when you compound there robot's location uncertainty.
2.5. for mapping and obstacle avoidance the first thing i would try to do is segment out the ground plane. the ground plane goes to mapping, and everything above is an obstacle. check out PCL for some point cloud operating functions: http://pointclouds.org/
if you can't simply put a planar laser on the robot like a SICK or Hokuyo, then i might try to convert the 3d point cloud into a pseudo-laser-scan then use some off the shelf SLAM instead of trying to do visual slam. i think you'll have better results.
Other thoughts:
now that the Microsoft Kinect has been released, it is usually easier (and cheaper) to simply use that to get a 3d point cloud instead of doing actual stereo.
This project sounds a lot like the DARPA LAGR program. (learning applied to ground robots). That program is over, but you may be able to track down papers published from it.

Resources