Fast "local" search from a center point or bounding box - search

We are building a local business directory, using Elasticsearch as our search provider, and have a bit of a dilemma:
What we really want to do is a "center point" search. But as I understand it, a true "center point" search is significantly slower than a "bounding box" search. So instead we are using a "bounding box" search around a general area that the user is searching. This works great for searches like "tacos" because there are always many results for this search. The bounding box search does not work well for searches using less common keywords or on searches for specific businesses that are in the area, but lie outside of the bounding box.
A few solutions that I've considered, but I'm not sure if any are any good:
Do multiple "bounding box" searches: start with a smaller bounding box and, if no or very few results are found, search again with a larger one.
Use a "center point" search instead, even though it is slower. This search should return results roughly in order of geographical distance from the center point (in combination with the other ways that Elasticsearch "prioritizes" results).
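The first option can be sketched in a few lines. This is only an illustration under stated assumptions: `search` is a hypothetical callable standing in for whatever actually queries the search backend, and the box sizes and `min_results` threshold are made-up defaults, not recommendations.

```python
from typing import Callable, List, Tuple

# (min_lat, min_lon, max_lat, max_lon)
BBox = Tuple[float, float, float, float]


def bbox_around(lat: float, lon: float, half_size_deg: float) -> BBox:
    """Square box (in degrees) centred on a point, clamped to valid ranges."""
    return (
        max(lat - half_size_deg, -90.0),
        max(lon - half_size_deg, -180.0),
        min(lat + half_size_deg, 90.0),
        min(lon + half_size_deg, 180.0),
    )


def expanding_search(
    search: Callable[[str, BBox], List[dict]],  # hypothetical backend call
    query: str,
    lat: float,
    lon: float,
    min_results: int = 5,
    half_sizes_deg: Tuple[float, ...] = (0.05, 0.2, 0.8),
) -> List[dict]:
    """Retry with progressively larger boxes until enough hits come back."""
    hits: List[dict] = []
    for half_size in half_sizes_deg:
        hits = search(query, bbox_around(lat, lon, half_size))
        if len(hits) >= min_results:
            break  # the current box already yields enough results
    return hits
```

Common queries such as "tacos" would normally stop after the first, cheapest box, while rare keywords fall through to the larger boxes at the cost of extra round trips.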
I see that websites like Yelp.com seem to have this problem handled. I'm just not sure what is the best way to go about it.
We are running Django on Heroku servers. Our Elasticsearch provider is Found.
Your input is greatly appreciated!

Related

Foursquare API: how to explore venues within an administrative division such as ‘borough’

I am a beginner user of Foursquare API.
Most of the Foursquare techniques I have learned so far involve searching or exploring from a single location point (e.g. a café or a hotel), given as a single longitude/latitude coordinate pair.
My question is whether it is possible to make a query that explores venues within an area, instead of in the vicinity of a single location point. By ‘an area’, I specifically mean an administrative division, such as a borough or a neighbourhood.
In other words, my intended query, if possible, would start from a specification of the administrative division I am interested in (e.g. a borough), such as its name or the geographic coordinates of its border, used as a 'key' to link with Foursquare data, rather than from a single location point.
I downloaded a GeoJSON file that already defines the geographic coordinates of the borders of the neighbourhoods in a city that I am interested in (link: http://cdn.buenosaires.gob.ar/datosabiertos/datasets/barrios/barrios.geojson). Just FYI, in this file a neighbourhood is described as a ‘barrio’ and its border is defined as a 'Polygon'.
I just wonder whether I can use an area specification (either the name of an administrative division or the set of geographic coordinates of its border) as a key to query venues such as restaurants, hospitals, and police stations within that administrative division (e.g. a borough), from corner to corner.
I guess the underlying question is whether Foursquare has such information stored somewhere in its system: if not, my contemplated approach would not work.
Or there might be a totally different workaround to achieve my goal.
If anyone can advise me on this matter, I would highly appreciate it.
Thanks
Given the parameters listed in the docs, I think the best approach would be to use the ll or near parameter and also include a radius, so you can limit the search to a given area or region.
To get the middle point of the polygon I guess you would need to do some math, but it shouldn't be that difficult.
Besides this, there doesn't seem to be any other parameter in the Foursquare API to search by area or by an array of coordinates (polygon).
Anyway, I would suggest that you go through the Foursquare API docs for both the search and explore endpoints and check for yourself.
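The "some math" for the midpoint-plus-radius idea could look like the sketch below. It assumes the polygon is a single simple ring of (lon, lat) pairs, as in GeoJSON, and treats degrees as planar coordinates for the centroid, which is a reasonable approximation at neighbourhood scale. The function names are illustrative and nothing here calls the Foursquare API.

```python
import math
from typing import List, Tuple

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def polygon_centroid(ring: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Area-weighted centroid (lon, lat) of a simple ring, planar approximation."""
    area2 = cx = cy = 0.0  # area2 accumulates twice the signed area
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        raise ValueError("degenerate polygon with zero area")
    return cx / (3 * area2), cy / (3 * area2)


def covering_circle(ring: List[Tuple[float, float]]) -> Tuple[float, float, float]:
    """Return (lat, lon, radius_m) of a circle around the centroid that
    reaches the farthest vertex, so the whole ring lies inside it."""
    lon_c, lat_c = polygon_centroid(ring)
    radius = max(haversine_m(lat_c, lon_c, lat, lon) for lon, lat in ring)
    return lat_c, lon_c, radius
```

The resulting latitude/longitude and radius would then be passed as the ll and radius parameters. Because the circle overshoots the polygon, results still need a point-in-polygon check afterwards if strict borders matter.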
Since you already have the polygon of the region of interest:
You could fit many small-radius circles within it to cover the majority of the area.
This is not recommended, as it may be rate limited or get you blacklisted, but the Foursquare website has a 'draw' tool that lets you draw a polygon and search for venues within it. (Open the Network tab of your browser's developer tools and look at the request.) I have noticed that it also can't take very complex polygons or enclaves, and it aggressively simplifies polygons to remove holes/dents/land bridges.
Here are my demo requests. The polygon search isn't strict and might show some venues outside the border.
The URL:
https://foursquare.com/explore?mode=url&polygon=35.957999786220704%2C-80.41236877441406%3B35.897393965545646%2C-80.38215637207031%3B35.87847989454576%2C-80.55107116699219%3B35.954664894270834%2C-80.54901123046875%3B35.994118756097%2C-80.386962890625%3B35.957999786220704%2C-80.41236877441406
The corresponding GET for venues:
https://api.foursquare.com/v2/search/recommendations?locale=en&explicit-lang=false&v=20210302&m=foursquare&limit=30&intent=bestnearby&polygon=40.8252411857252%2C-74.00630950927733%3B40.817446884558805%2C-73.99772644042969%3B40.81147063339219%2C-73.99875640869139%3B40.80757278825516%2C-74.00768280029297%3B40.80887209540822%2C-74.01729583740234%3B40.81406906961218%2C-74.02175903320312%3B40.8197852710803%2C-74.02210235595702%3B40.826280356677124%2C-74.01695251464844%3B40.8252411857252%2C-74.00630950927733&wsid={}&oauth_token={}

Finding all geohashes within two bounding coordinates

I have coordinates, which are assigned a corresponding geohash in my database. Now I want to retrieve all of the coordinates within two bounding coordinates (top right and top left corner). How can I do that properly?
I tried finding the geohash prefix shared by both bounding coordinates, but this does not work when they are in completely different regions of the world (so their geohashes have nothing in common).
Is there a better way to do that?
Thanks for your help
Unfortunately, this isn't something you can do out of the box with Datastore / App Engine. (There are no built-in spatial queries.)
For early prototyping, etc., you can do it the hard way - retrieve all the rows, and discard the ones not meeting your query in code. Obviously, probably not viable with real production data.
See related question Query for Entities Nearby with Geopt for some possible production solutions.
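A general geohash technique (not a Datastore feature) that avoids the single-common-prefix problem is to cover the box with several geohash cells of a fixed precision and query each cell's prefix separately. The sketch below assumes the box is given by its south-west and north-east corners and does not cross the antimeridian; the function names are illustrative.

```python
from typing import List

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def encode(lat: float, lon: float, precision: int) -> str:
    """Standard geohash: interleave longitude/latitude bisection bits."""
    lat_lo, lat_hi, lon_lo, lon_hi = -90.0, 90.0, -180.0, 180.0
    chars: List[str] = []
    bits = value = 0
    use_lon = True  # geohash starts with a longitude bit
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            bit = lon >= mid
            lon_lo, lon_hi = (mid, lon_hi) if bit else (lon_lo, mid)
        else:
            mid = (lat_lo + lat_hi) / 2
            bit = lat >= mid
            lat_lo, lat_hi = (mid, lat_hi) if bit else (lat_lo, mid)
        value = (value << 1) | int(bit)
        use_lon = not use_lon
        bits += 1
        if bits == 5:  # every 5 bits become one base32 character
            chars.append(BASE32[value])
            bits = value = 0
    return "".join(chars)


def covering_geohashes(min_lat: float, min_lon: float,
                       max_lat: float, max_lon: float,
                       precision: int) -> List[str]:
    """All geohash cells of the given precision that overlap the box."""
    lon_bits = (5 * precision + 1) // 2
    lat_bits = (5 * precision) // 2
    width = 360.0 / (1 << lon_bits)    # cell width in degrees
    height = 180.0 / (1 << lat_bits)   # cell height in degrees
    col_lo = int((min_lon + 180.0) // width)
    col_hi = min(int((max_lon + 180.0) // width), (1 << lon_bits) - 1)
    row_lo = int((min_lat + 90.0) // height)
    row_hi = min(int((max_lat + 90.0) // height), (1 << lat_bits) - 1)
    cells = []
    for row in range(row_lo, row_hi + 1):
        for col in range(col_lo, col_hi + 1):
            # Encode the cell centre, which is never on a cell boundary.
            cells.append(encode(-90.0 + (row + 0.5) * height,
                                -180.0 + (col + 0.5) * width,
                                precision))
    return cells
```

Each returned cell can be turned into a prefix (or string-range) query on the stored geohash. Since the cells overhang the box, the fetched rows should still be filtered by their exact coordinates, and the precision should be chosen so that the number of cells stays small.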

Search feature on website

I am interested in implementing a search feature on a website. It is a location search, so address, state, and zip should all work. The search will then show results in that area and allow them to be filtered.
My question is:
What's the best approach for something like this?
There are literally dozens of ways of doing this (if not more). The exact implementation would depend on the technology stack that you use, but as a very top level overview:
you'd need to store the things you are searching for somewhere, and tag them with a lat/long location. Often, this would be in a database of some kind.
using a programming language, you would need to write a search that accepts a postcode, translates that to a lat/long and then searches the things in your database based on the distance between the location of the thing, and the location entered in the search.
if you want to support filtering, your search would need to support that too. This is often called "faceting" the search.
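As a rough sketch of the three steps above: the snippet keeps the items in a plain list and uses a hard-coded `POSTCODE_LOOKUP` dictionary as a hypothetical stand-in for a real geocoding service. The postcode, field names, and the equirectangular distance approximation (adequate for short, local distances) are all illustrative assumptions.

```python
import math
from typing import Dict, List, Optional, Tuple

EARTH_RADIUS_KM = 6371.0

# Hypothetical stand-in for a geocoding service: postcode -> (lat, lon).
POSTCODE_LOOKUP: Dict[str, Tuple[float, float]] = {
    "AB1 2CD": (51.501, -0.142),  # made-up postcode for illustration
}


def approx_distance_km(lat1: float, lon1: float,
                       lat2: float, lon2: float) -> float:
    """Equirectangular approximation; fine for distances of a few tens of km."""
    x = math.radians(lon2 - lon1) * math.cos(math.radians((lat1 + lat2) / 2))
    y = math.radians(lat2 - lat1)
    return EARTH_RADIUS_KM * math.hypot(x, y)


def search_nearby(items: List[dict], postcode: str, radius_km: float,
                  category: Optional[str] = None) -> List[dict]:
    """Return items within radius_km of the postcode, nearest first.

    `category` is an optional facet-style filter on the item's category.
    """
    lat, lon = POSTCODE_LOOKUP[postcode]  # step 2: postcode -> lat/long
    hits = []
    for item in items:                    # step 1: items tagged with lat/long
        if category is not None and item["category"] != category:
            continue                      # step 3: filtering
        distance = approx_distance_km(lat, lon, item["lat"], item["lon"])
        if distance <= radius_km:
            hits.append((distance, item))
    hits.sort(key=lambda pair: pair[0])
    return [item for _, item in hits]
```

A real implementation would push the distance filter into the database or search engine rather than scanning every item in application code.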
Working out the lat/long locations will need to be done using a geocoding service. Some, such as PostCode Anywhere, do this as a paid service; others, such as the Google Maps APIs, are free (within reason).
There are probably some hosted services that will do what you want; you'd have to shop around.
Examples of search software that supports geolocation searching out of the box are things like Solr, Azure Search, Lucene and Elastic.

How do I sort search results by relevance?

I'm working on a project which searches through a database, then sorts the search results by relevance, according to a string the user inputs. I think my current search is fairly decent, but the comparator I wrote to sort the results by relevance is giving me funny results. I don’t know what to consider relevant. I know this is a big branch of information retrieval, but I have no idea where to start finding examples of searches which sort objects by relevance and would appreciate any feedback.
To give a little more background about my specific issue, the user will input a string in a website database, which stores objects (items in the store) with various fields, such as a minor and major classification (for example, an XBox 360 game might be stored with major=video_games and minor=xbox360 fields along with its specific name). The four main fields that I think should be considered in the search are the specific name, major, minor, and genre of the type of object, if that helps.
If you don't want to use Lucene/Solr, you can always use distance metrics to measure the similarity between the query and the rows retrieved from the database. Once you have a score for each row, you can sort by it and treat the result as sorted by relevance.
This is essentially what happens behind the scenes in Lucene. You can use simple similarity metrics such as Manhattan distance, the distance between points in n-dimensional space, etc. Look up the Lucene scoring formula for more insight.
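To make the score-then-sort idea concrete, here is a minimal sketch that swaps in a weighted token-overlap score as a simple stand-in. It is not Lucene's scoring formula, and the field weights are arbitrary assumptions; the four fields are the ones named in the question.

```python
from typing import Dict, List

# Assumed weights: a match in the name counts most, in the major class least.
FIELD_WEIGHTS: Dict[str, float] = {
    "name": 3.0,
    "genre": 1.5,
    "minor": 1.0,
    "major": 0.5,
}


def tokenize(text: str) -> List[str]:
    """Lowercase and split on whitespace, treating underscores as spaces."""
    return text.lower().replace("_", " ").split()


def relevance(query: str, item: Dict[str, str]) -> float:
    """Weighted fraction of query terms found in each field."""
    terms = set(tokenize(query))
    if not terms:
        return 0.0
    score = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        field_tokens = set(tokenize(item.get(field, "")))
        score += weight * len(terms & field_tokens) / len(terms)
    return score


def rank(query: str, items: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Items with a non-zero score, highest score first."""
    scored = [(relevance(query, item), item) for item in items]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item for score, item in scored if score > 0]
```

Sorting on a single numeric score also avoids the inconsistent orderings that a hand-written pairwise comparator can produce.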

Group features in Google Earth to hide detail when zoomed out

I'm trying to generate a KML file to display a set of features scattered around the UK. I would like the features to be grouped together at higher zoom levels, ideally displaying as an icon with a count of the number of features, so that users can see clusters of features easily.
Essentially I'm trying to do something along these lines, but in Google Earth, not Maps.
Can anyone point me in the right direction? I'm a bit of a newbie with KML :-)
Cheers,
RB.
ANSWERS :
My own research suggests I can do what I want using Regions to define bounding boxes for certain features.
It has also been suggested I should do this using network links, which I'm going to investigate as I think it's a better match for other reasons too.
Is this a standalone KML file? Or the KML returned as data for a network link?
In the first case I'm not sure this is even possible. I have seen layer transparency change with "camera altitude", so perhaps something like this is also possible on features? Then you could add both the single features and the grouped features into the same KML file and make them visible based on "distance to camera". This could be a new KML feature I missed, but you'd have to check the KML specification.
In the second case, you just return KML that matches the given network link viewport information. Based on the bounding box you get, you can subdivide that box into a grid and cluster per box. If you have one feature in a box, return the feature. If you have more than one in a box, return just a "grouped feature" for that box. The clustering will then automatically change when the user moves around in Google Earth: after each camera change your network link URL is called again and you again do feature selection and clustering with the given bounding box viewport. This makes your clustering dynamic.
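The grid clustering described for the network-link case might be sketched as follows. It assumes the viewport arrives as a `west,south,east,north` string, which is my understanding of the default BBOX parameter Google Earth appends to a view-based network link URL; the 4x4 grid and the output dictionaries are illustrative choices, and turning the result into KML placemarks is left out.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Point = Tuple[str, float, float]  # (name, lat, lon)


def parse_bbox(bbox: str) -> Tuple[float, float, float, float]:
    """Parse 'west,south,east,north' into four floats."""
    west, south, east, north = (float(part) for part in bbox.split(","))
    return west, south, east, north


def cluster_in_viewport(points: List[Point], bbox: str,
                        grid: int = 4) -> List[Dict[str, object]]:
    """Bucket visible points into a grid x grid lattice over the viewport.

    Cells with one point return that point; cells with several return a
    single grouped feature placed at the mean position, with a count.
    """
    west, south, east, north = parse_bbox(bbox)
    if east <= west or north <= south:
        raise ValueError("empty or inverted viewport")
    cell_w = (east - west) / grid
    cell_h = (north - south) / grid
    cells: Dict[Tuple[int, int], List[Point]] = defaultdict(list)
    for name, lat, lon in points:
        if not (south <= lat <= north and west <= lon <= east):
            continue  # outside the current view
        col = min(int((lon - west) / cell_w), grid - 1)
        row = min(int((lat - south) / cell_h), grid - 1)
        cells[(row, col)].append((name, lat, lon))
    features = []
    for members in cells.values():
        count = len(members)
        features.append({
            "name": members[0][0] if count == 1 else f"{count} features",
            "lat": sum(m[1] for m in members) / count,
            "lon": sum(m[2] for m in members) / count,
            "count": count,
        })
    return features
```

Because the lattice is recomputed from whatever viewport is sent, zooming in shrinks the cells and groups naturally split back into individual features.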
Does this help?
