Difference between DISTANCE() and GEO_DISTANCE() in ArangoDB

What is the difference between the ArangoDB functions DISTANCE() and GEO_DISTANCE()? I know both of them calculate distance using the haversine formula.
Thanks,
Nilotpal

The two are used for different purposes:
DISTANCE(latitude1, longitude1, latitude2, longitude2) → distance
The value is computed using the haversine formula, which is based on a spherical Earth model. It’s fast to compute and is accurate to around 0.3%, which is sufficient for most use cases such as location-aware services.
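For illustration, here is a minimal Python sketch of the haversine formula that DISTANCE() is based on (the 6371 km mean Earth radius is an assumed constant; ArangoDB's exact value may differ):

import math

def haversine_m(lat1, lon1, lat2, lon2):
    # Great-circle distance on a spherical Earth model, in meters.
    # R is an assumed mean Earth radius; the server's constant may differ.
    R = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * R * math.asin(math.sqrt(a))

print(haversine_m(45.764043, 4.835659, 45.750371, 5.053963))  # ~17 km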
GEO_DISTANCE(geoJsonA, geoJsonB, ellipsoid) → distance
Returns the distance between two GeoJSON objects, measured from the centroid of each shape. For a list of supported types, see the geo index page. (Ref: https://www.arangodb.com/docs/3.8/aql/functions-geo.html#geo-index-functions)
These GeoJSON objects can be constructed with GEO_LINESTRING, GEO_MULTILINESTRING, GEO_MULTIPOINT, GEO_POINT, GEO_POLYGON and GEO_MULTIPOLYGON (see the second reference below).
References:
https://www.arangodb.com/docs/3.8/aql/functions-geo.html#geo-utility-functions
https://www.arangodb.com/docs/3.8/aql/functions-geo.html#geojson-constructors

Related

Snowflake DB : How to plot a Geospatial sphere feature in Snowflake?

Snowflake now supports a geospatial data type with geographical features like Point, Line, and Polygon.
But I wanted to know whether it also supports spherical features.
How can I plot a spherical feature in Snowflake's geospatial data type?
According to the Snowflake documentation it only supports WGS84 (EPSG:4326), which does not have a z (height/elevation) coordinate, making it impossible to represent a sphere, which is a 3d shape. Approximating the 2d projection, a circle, should be somewhat possible.
A lot of geospatial engines have a function called buffer (or some variation thereof), which when performed on a point would give you an approximation of a circle. Depending on what you are doing, this might be what you need. Unfortunately, it doesn't look like Snowflake has a buffer function as of July 2020.
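As a sketch of what such a buffer does, you can approximate the circle yourself by sampling points along the circumference (pure Python; the metres-per-degree factors are a planar, small-radius assumption):

import math

def buffer_point(lat, lon, radius_m, n=32):
    # Approximate a circle of radius_m around (lat, lon) as an n-vertex ring.
    # Planar approximation: ~111320 m per degree of latitude, with the
    # longitude scale shrunk by cos(latitude).
    m_per_deg_lat = 111320.0
    m_per_deg_lon = 111320.0 * math.cos(math.radians(lat))
    ring = [(lat + radius_m * math.sin(2 * math.pi * i / n) / m_per_deg_lat,
             lon + radius_m * math.cos(2 * math.pi * i / n) / m_per_deg_lon)
            for i in range(n)]
    ring.append(ring[0])  # close the ring, as polygon formats expect
    return ring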
If all you need to do is detect whether a point p lies within a disc of radius r around a point c, you can use the ST_DWITHIN function to accomplish that goal; see the Snowflake documentation for that function. Note that you must supply your radius in meters.

How to index d dimensional vector in solr using geospatial search?

I am trying to create a plugin for Solr that will enable indexing 3d models. I will take 'screenshots' of each model from several different views and preprocess those images so they are represented as a 1d vector.
I wanted to use Lucene/Solr geospatial search for that purpose, as I saw there is an option to index a vector (larger than 2 dims) and search according to distance from the vector (according to location).
Unfortunately, the documentation for this option disappeared last week and it isn't cached in Google.
How can I index a location vector with dimension > 2?
The link for the documentation was here:
https://wiki.apache.org/solr
And I found it from here:
https://lucene.apache.org/solr/guide/6_6/spatial-search.html#SpatialSearch-LatLonPointSpatialField
Geospatial search is intended to work with n-dimension points (Lucene), but it seems the Solr implementation for dimensions higher than 2d is not available.
You can still do 3d spatial search with Solr if you index one vector coordinate per field (using a double fieldType).
Then, in order to query and sort documents by distance from a given point, instead of using geodist(sfield2D,x,y), you may use dist():
Returns the distance between two vectors (points) in an n-dimensional space. Takes in the power, plus two or more ValueSource instances and calculates the distances between the two vectors.
To compute the Euclidean distance between an arbitrary point (0, 0, 0) and the indexed point of each document, you would use:
dist(2, fieldx, fieldy, fieldz, 0, 0, 0)
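For intuition, here is a minimal Python sketch of what dist(power, ...) computes (the coordinate tuples stand in for per-document field values; they are not Solr API):

def minkowski(power, a, b):
    # Generalized distance between two n-dimensional points:
    # power=2 gives Euclidean distance, power=1 gives Manhattan distance.
    return sum(abs(x - y) ** power for x, y in zip(a, b)) ** (1.0 / power)

# Equivalent of dist(2, fieldx, fieldy, fieldz, 0, 0, 0) for one document
print(minkowski(2, (1.0, 2.0, 3.0), (0.0, 0.0, 0.0)))  # ~3.742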
See also:
- Calculate distance in 3D space
- Distance Sorting or Boosting (Function Queries)
- Difference between geodist() and dist() for Geo-Spacial Search

Mariadb: geography

I need to check whether the distance between two geographic points is less than N km. I'm trying to execute this query:
select st_distance(
    ST_GeomFromText('point(45.764043 4.835658999999964)', 4326),
    ST_GeomFromText('point(45.750371 5.053963)', 4326)
) < :n
But it doesn't work because:
So far the SRID property is just a dummy in MySQL, it is stored as part of a geometries meta data but all actual calculations ignore it and calculations are done assuming Euclidean (planar) geometry.
(https://mariadb.com/kb/en/mariadb/st_transform-missing/)
My goal is to convert this distance to a metric distance, or to convert N to degrees.
How can I do it?
Maybe you know a better solution?
P.S. I need a solution based on the spatial methods (or whatever is better for performance).
I don't think the "distance" function is available (yet) in SPATIAL. There is a regular FUNCTION in https://mariadb.com/kb/en/latitudelongitude-indexing/ that does the work. However, the args and output are scaled lat/lng (10000*degrees). The code could be altered to avoid the scaling, but it is needed in the context of that blog page.
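As a rough workaround for the unit question itself, here is a minimal Python sketch for converting a kilometre threshold N into degree offsets (planar approximation; the 111.195 km-per-degree figure assumes a spherical Earth):

import math

KM_PER_DEG_LAT = 111.195  # mean Earth radius * pi / 180

def km_to_deg(n_km, lat):
    # Degrees of latitude are roughly constant in length; degrees of
    # longitude shrink by cos(latitude).
    dlat = n_km / KM_PER_DEG_LAT
    dlon = n_km / (KM_PER_DEG_LAT * math.cos(math.radians(lat)))
    return dlat, dlon

print(km_to_deg(10, 45.76))  # ~ (0.0899, 0.1289)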

Closest distance + vertices of two meshes

I would like to find two vertices of two meshes (1 vertex per mesh) that define the closest distance between them. Or the two triangles would be fine I guess.
However, I'm not sure how to search for this in CGAL's documentation; I'm sure this is doable with some existing tool (probably based on a 3d distance field and/or AABBs). Could I please get a hint (keywords/link) on what to look for?
I've been pointed to the Optimal Distances CGAL package, but it's not exactly what I want, since it outputs the distance and the coordinates, so finding the vertex ID is an additional computational overhead.
I've already implemented collision detection with CGAL to find triangle-triangle intersections in a triangle soup, using AABB trees. I guess that I should be somehow close to this, although now a simple soup with all my object triangles wouldn't do the job.
The solution found was this:
CGAL's Optimal Distances package can give an approximation of the closest distance between the convex hulls of two meshes, without explicitly computing the hulls. As a result one gets the shortest distance between these hulls, and the coordinates of the 2 points that lie on them and define this distance.
Then these coordinates can be used as a search-query in kd-trees that contains the original vertices of the meshes in order to find the closest vertices.
In case one mesh is non-convex, the hull that CGAL is using is very approximate, so convex decomposition might be necessary. In such a case one would have to check distances for each convex part and then take the shortest distance.
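To illustrate the kd-tree lookup step, here is a minimal Python/SciPy sketch (the actual pipeline would be C++/CGAL; the vertex arrays and query points below are placeholders):

import numpy as np
from scipy.spatial import cKDTree

# Placeholder vertex arrays; in practice these are the two meshes' vertices.
vertices_a = np.random.rand(1000, 3)
vertices_b = np.random.rand(1000, 3)

# Placeholder points returned by the optimal-distance query, one per hull.
point_on_a = np.array([0.5, 0.5, 0.5])
point_on_b = np.array([0.6, 0.5, 0.5])

# Query each mesh's kd-tree to recover the nearest original vertex IDs.
dist_a, vid_a = cKDTree(vertices_a).query(point_on_a)
dist_b, vid_b = cKDTree(vertices_b).query(point_on_b)
print(vid_a, vid_b)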

Representing classification confidence

I am working on a simple AI program that classifies shapes using an unsupervised learning method. Essentially, I use the number of sides and the angles between the sides, and generate aggregate percentages relative to an ideal value for each shape. This helps me create some fuzziness in the result.
The problem is how do I represent the degree of error or confidence in the classification? For example: a small rectangle that looks very much like a square would yield similarly high membership values for the two categories, but how can I represent the degree of error?
Thanks
Your confidence is based on the model used. For example, if you are simply applying some rules based on the number of angles (or sides), you have some multi-dimensional representation of objects:
feature 0, feature 1, ..., feature m
Nice, statistical approach
You can define some kind of confidence intervals based on your empirical results, e.g. you can fit a multi-dimensional Gaussian distribution to your empirical observations of "rectangle objects", and once you get a new object you simply check the probability of such a value under your Gaussian distribution and take that as your confidence (which would be quite well justified under the assumption that your "observation" errors have a normal distribution).
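A minimal Python/SciPy sketch of this idea (the feature matrix below is a placeholder for your empirical observations):

import numpy as np
from scipy.stats import multivariate_normal

# Placeholder: rows are feature vectors of observed "rectangle objects".
rect_features = np.random.randn(200, 3) * 0.1 + np.array([4.0, 90.0, 1.5])

# Fit a multivariate Gaussian to the empirical observations.
mu = rect_features.mean(axis=0)
cov = np.cov(rect_features, rowvar=False)
model = multivariate_normal(mean=mu, cov=cov)

# Confidence of a new object: density of its features under the fit.
print(model.pdf(np.array([4.0, 89.5, 1.4])))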
Distance based, simple approach
A less statistical approach would be to directly take your model's decision factor and compress it to the [0,1] interval. For example, if you simply measure the distance from some perfect shape to your new object in some metric (which yields results in [0,inf)), you could map it using some sigmoid-like function, e.g.
conf( object, perfect_shape ) = 1 - tanh( distance( object, perfect_shape ) )
The hyperbolic tangent will "squash" values to the [0,1] interval, and the only remaining thing to do is select some scaling factor (as tanh grows quite quickly).
Such an approach would be less valid in mathematical terms, but would be similar to the approach taken in neural networks.
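In Python this is a one-liner plus the scaling factor (the scale value below is an arbitrary assumption to be tuned):

import math

def conf(distance, scale=10.0):
    # Squash a distance in [0, inf) down to a confidence in (0, 1];
    # larger 'scale' makes confidence decay more slowly with distance.
    return 1.0 - math.tanh(distance / scale)

print(conf(0.0))  # 1.0 for a perfect match
print(conf(5.0))  # ~0.54 for a farther object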
Relative approach
A more probabilistic approach can also be defined using your distance metric. If you have distances to each of your "perfect shapes", you can calculate the probability of an object being classified as some class under the assumption that classification is performed at random, with probability proportional to the inverse of the distance to the perfect shape.
dist(object, perfect_shape1) = d_1
dist(object, perfect_shape2) = d_2
dist(object, perfect_shape3) = d_3
...
conf(object, class_i) = inv(d_i) / sum_j inv(d_j)
where
inv(d_i) = max_j(d_j) - d_i
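A minimal Python sketch of this relative confidence (the three distances are illustrative):

def relative_confidences(distances):
    # inv(d_i) = max_j(d_j) - d_i, normalized so the confidences sum to 1.
    inv = [max(distances) - d for d in distances]
    return [v / sum(inv) for v in inv]

# Illustrative distances to three "perfect shapes"
print(relative_confidences([0.2, 0.25, 3.0]))  # ~ [0.505, 0.495, 0.0]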
Conclusions
The first two ideas can also be incorporated into the third one to make use of knowledge of all the classes. In your particular example, the third approach should result in a confidence of around 0.5 for both rectangle and square, while with the first approach it would be something closer to 0.01 (depending on how many such small objects you have in the "training" set). This shows the difference: the first two approaches express your confidence in classifying the object as a particular shape in itself, while the third one shows relative confidence (so it can be low iff it is high for some other class, while the first two can simply answer "no classification is confident").
Building slightly on what lejlot has put forward, my preference would be to use the Mahalanobis distance with some squashing function. The Mahalanobis distance M(V, p) allows you to measure the distance between a distribution V and a point p.
In your case, I would use "perfect" examples of each class to generate the distribution V, and p is the object whose classification you want the confidence of. You can then use something along the lines of the following as your confidence measure.
1 - tanh( M(V, p) )
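A minimal Python/NumPy sketch of this suggestion (the sample matrix is a placeholder for the "perfect" examples):

import numpy as np

def mahalanobis_confidence(samples, p):
    # Fit the distribution V from the samples, compute the Mahalanobis
    # distance of p, and squash it into (0, 1] with 1 - tanh.
    mu = samples.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(samples, rowvar=False))
    diff = p - mu
    return 1.0 - np.tanh(np.sqrt(diff @ cov_inv @ diff))

samples = np.random.randn(100, 3)  # placeholder "perfect" examples
print(mahalanobis_confidence(samples, np.zeros(3)))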
