Trilateration and nonlienar least squares - python-3.x

I am trying to find user distances from beacon positions using trilateration. I am having a mean squared error function that I am trying to minimize using non-linear least squares but I am not getting correct results. Any help is appreciated. The code is below.
def mse(self, user_pos, positions, distances):
mse = 0.0
for pos, dist in zip(positions, distances):
distance = great_circle((user_pos[0], user_pos[1]), (pos[0], pos[1])).meters
mse += math.pow(distance - dist, 2.0)
return mse/len(positions)
def least_squares_func(self, positions, distances):
# Returns users coordinates
return least_squares(self.mse, [0,0], args=(positions, distances)).x
Starting position in least_squares is [0,0] but after changing it I am not getting much different results.
Input example:
positions = [(5.0, -6.0), (13.0, -15.0), (21.0, -3.0)]
distances = [8.06, 13.97, 23.32]

great_circle is used for GPS where we deal with an oblate spheroid for beacons you must used simple Euclidean metric to calculate the distance between the user and each beacons.

Related

using Geopandas, How to randomly select in each polygon 5 Points by sampling method

I want to select 5 Points in each polygon based on random sampling method. And required 5 points co-ordinates(Lat,Long) in each polygon for identify which crop is grawn.
Any ideas for do this using geopandas?
Many thanks.
My suggestion involves sampling random x and y coordinates within the shape's bounding box and then checking whether the sampled point is actually within the shape. If the sampled point is within the shape then return it, otherwise repeat until a point within the shape is found. For sampling, we can use the uniform distribution, such that all points in the shape have the same probability of being sampled. Here is the function:
from shapely.geometry import Point
def random_point_in_shp(shp):
within = False
while not within:
x = np.random.uniform(shp.bounds[0], shp.bounds[2])
y = np.random.uniform(shp.bounds[1], shp.bounds[3])
within = shp.contains(Point(x, y))
return Point(x,y)
and here's an example how to apply this function to an example GeoDataFrame called geo_df to get 5 random points for each entry:
for num in range(5):
geo_df['Point{}'.format(num)] = geo_df['geometry'].apply(random_point_in_shp)
There might be more efficient ways to do this, but depending on your application the algorithm could be sufficiently fast. With my test file, which contains ~2300 entries, generating five random points for each entry took around 15 seconds on my machine.

Fastest way to determine if two points are closest to one another

My problems consists of the following: I am given two pairs angles (in spherical coordinates) which consists of two parts--an azimuth and a colatitude angle. If we extend both angles (thereby increasing their respective radii) infinitely to make a long line pointing in the direction given by the pair of angles, then my goal is to determine
if they intersect or extremely close to one another and
where exactly they intersect.
Currently, I have tried several methods:
The most obvious one is to iteratively compare each radii until there is either a match or a small enough distance between the two. (When I say compare each radii, I am referring to converting each spherical coordinate into Cartesian and then finding the euclidean distance between the two). However, this runtime is $O(n^{2})$, which is extremely slow if I am trying to scale this program
The second most obvious method is to use the optimization package to find this distance. Unfortunately, I cannot the optimization package iteratively and after one instance the optimization algorithm repeats the same answer, which is not useful.
The least obvious method is to directly calculate (using calculus) the exact radii from the angles. While this is fast method, it is not extremely accurate.
Note: while it might seem simple that the intersection is always at the zero-origin (0,0,0), this is not ALWAYS the case. Some points never intersect.
Code for Method (1)
def match1(azimuth_recon_1,colatitude_recon_1,azimuth_recon_2, colatitude_recon_2,centroid_1,centroid_2 ):
# Constants: tolerance factor and extremely large distance
tol = 3e-2
prevDist = 99999999
# Initialize a list of radii to loop through
# Checking iteravely for a solution
for r1 in list(np.arange(0,5,tol)):
for r2 in list(np.arange(0,5,tol)):
# Get the estimates
estimate_1 = np.array(spher2cart(r1,azimuth_recon_1,colatitude_recon_1)) + np.array(centroid_1)
estimate_2 = np.array(spher2cart(r2,azimuth_recon_2,colatitude_recon_2))+ np.array(centroid_2)
# Calculate the euclidean distance between them
dist = np.array(np.sqrt(np.einsum('i...,i...', (estimate_1 - estimate_2), (estimate_1 - estimate_2)))[:,np.newaxis])
# Compare the distance to this tolerance
if dist < tol:
if dist == 0:
return estimate_1, [], True
else:
return estimate_1, estimate_2, False
## If the distance is too big break out of the loop
if dist > prevDist:
prevDist = 9999999
break
prevDist = dist
return [], [], False
Code for Method (3)
def match2(azimuth_recon_1,colatitude_recon_1,azimuth_recon_2, colatitude_recon_2,centriod_1,centroid_2):
# Set a Tolerance factor
tol = 3e-2
def calculate_radius_2(azimuth_1,colatitude_1,azimuth_2,colatitude_2):
"""Return radius 2 using both pairs of angles (azimuth and colatitude). Equation is provided in the document"""
return 1/((1-(math.sin(azimuth_1)*math.sin(azimuth_2)*math.cos(colatitude_1-colatitude_2))
+math.cos(azimuth_1)*math.cos(azimuth_2))**2)
def calculate_radius_1(radius_2,azimuth_1,colatitude_1,azimuth_2,colatitude_2):
"""Returns radius 1 using both pairs of angles (azimuth and colatitude) and radius 2.
Equation provided in document"""
return (radius_2)*((math.sin(azimuth_1)*math.sin(azimuth_2)*math.cos(colatitude_1-colatitude_2))
+math.cos(azimuth_1)*math.cos(azimuth_2))
# Compute radius 2
radius_2 = calculate_radius_2(azimuth_recon_1,colatitude_recon_1,azimuth_recon_2,colatitude_recon_2)
#Compute radius 1
radius_1 = calculate_radius_1(radius_2,azimuth_recon_1,colatitude_recon_1,azimuth_recon_2,colatitude_recon_2)
# Get the estimates
estimate_1 = np.array(spher2cart(radius_1,azimuth_recon_1,colatitude_recon_1))+ np.array(centroid_1)
estimate_2 = np.array(spher2cart(radius_2,azimuth_recon_2,colatitude_recon_2))+ np.array(centroid_2)
# Calculate the euclidean distance between them
dist = np.array(np.sqrt(np.einsum('i...,i...', (estimate_1 - estimate_2), (estimate_1 - estimate_2)))[:,np.newaxis])
# Compare the distance to this tolerance
if dist < tol:
if dist == 0:
return estimate_1, [], True
else:
return estimate_1, estimate_2, False
else:
return [], [], False
My question is two-fold:
Is there a faster and more accurate way to find the radii for both
points?
If so, how do I do it?
EDIT: I am thinking about just creating two numpy arrays of the two radii and then comparing them via numpy boolean logic. However, I would still be comparing them iteratively. Is there is a faster way to perform this comparison?
Use a kd-tree for such situations. It will easily look up the minimal distance:
def match(azimuth_recon_1,colatitude_recon_1,azimuth_recon_2, colatitude_recon_2,centriod_1,centroid_2):
cartesian_1 = np.array([np.cos(azimuth_recon_1)*np.sin(colatitude_recon_1),np.sin(azimuth_recon_1)*np.sin(colatitude_recon_1),np.cos(colatitude_recon_1)]) #[np.newaxis,:]
cartesian_2 = np.array([np.cos(azimuth_recon_2)*np.sin(colatitude_recon_2),np.sin(azimuth_recon_2)*np.sin(colatitude_recon_2),np.cos(colatitude_recon_2)]) #[np.newaxis,:]
# Re-center them via adding the centroid
estimate_1 = r1*cartesian_1.T + np.array(centroid_1)[np.newaxis,:]
estimate_2 = r2*cartesian_2.T + np.array(centroid_2)[np.newaxis,:]
# Add them to the output list
n = estimate_1.shape[0]
outputs_list_1.append(estimate_1)
outputs_list_2.append(estimate_2)
# Reshape them so that they are in proper format
a = np.array(outputs_list_1).reshape(len(two_pair_mic_list)*n,3)
b = np.array(outputs_list_2).reshape(len(two_pair_mic_list)*n,3)
# Get the difference
c = a - b
# Put into a KDtree
tree = spatial.KDTree(c)
# Find the indices where the radius (distance between the points) is 3e-3 or less
indices = tree.query_ball_tree(3e-3)
This will output a list of the indices where the distance is 3e-3 or less. Now all you will have to do is use the list of indices with the estimate list to find the exact points. And there you have it, this will save you a lot of time and space!

Cosmic ray removal in spectra

Python developers
I am working on spectroscopy in a university. My experimental 1-D data sometimes shows "cosmic ray", 3-pixel ultra-high intensity, which is not what I want to analyze. So I want to remove this kind of weird peaks.
Does anybody know how to fix this issue in Python 3?
Thanks in advance!!
A simple solution could be to use the algorithm proposed by Whitaker and Hayes, in which they use modified z scores on the derivative of the spectrum. This medium post explains how it works and its implementation in python https://towardsdatascience.com/removing-spikes-from-raman-spectra-8a9fdda0ac22 .
The idea is to calculate the modified z scores of the spectra derivatives and apply a threshold to detect the cosmic spikes. Afterwards, a fixer is applied to remove the spike points and replace it by the mean values of the surrounding pixels.
# definition of a function to calculate the modified z score.
def modified_z_score(intensity):
median_int = np.median(intensity)
mad_int = np.median([np.abs(intensity - median_int)])
modified_z_scores = 0.6745 * (intensity - median_int) / mad_int
return modified_z_scores
# Once the spike detection works, the spectrum can be fixed by calculating the average of the previous and the next point to the spike. y is the intensity values of a spectrum, m is the window which we will use to calculate the mean.
def fixer(y,m):
threshold = 7 # binarization threshold.
spikes = abs(np.array(modified_z_score(np.diff(y)))) > threshold
y_out = y.copy() # So we don't overwrite y
for i in np.arange(len(spikes)):
if spikes[i] != 0: # If we have an spike in position i
w = np.arange(i-m,i+1+m) # we select 2 m + 1 points around our spike
w2 = w[spikes[w] == 0] # From such interval, we choose the ones which are not spikes
y_out[i] = np.mean(y[w2]) # and we average the value
return y_out
The answer depends a on what your data looks like: If you have access to two-dimensional CCD readouts that the one-dimensional spectra were created from, then you can use the lacosmic module to get rid of the cosmic rays there. If you have only one-dimensional spectra, but multiple spectra from the same source, then a quick ad-hoc fix is to make a rough normalisation of the spectra and remove those pixels that are several times brighter than the corresponding pixels in the other spectra. If you have only one one-dimensional spectrum from each source, then a less reliable option is to remove all pixels that are much brighter than their neighbours. (Depending on the shape of your cosmics, you may even want to remove the nearest 5 pixels or something, to catch the wings of the cosmic ray peak as well).

Custom metric for NearestNeighbors sklearn

Hello I'm on a project where I use 512 bits hash to create clusters. I'm using a custom metric bitwise hamming distance. But when I compare two hash with this function I obtain different distance results than using the NearestNeighbors.
Extending this to DBSCAN, using a eps=5, the cluster are created with some consistence, are being correctly clustered. But I try to check the distance between points from the same cluster I obtain distance enormous. Here is an example.
Example:
This a list of points from 2 clusters created by DBSCAN, and as you can see when using the function to calculate the distance gives number bigger than 30 but the NN gives results consistent with the eps=5.
from sklearn.neighbors import NearestNeighbors
hash_list_1 = [2711636196460699638441853508983975450613573844625556129377064665736210167114069990407028214648954985399518205946842968661290371575620508000646896480583712,
2711636396252606881895803338309150146134565539796776390549907030396205082681800682439355456735713892762967881436259141637319066484744271299497977370896760,
2711636396252606881918517135048330084905033589325484952567856239496981859330884970846906663264518266744879431357749780779892124020350824669153434630258784,
2711636396252797418317524490088561493800258861799581574018898781319096107333812163580085003775074676924785748114206505865657620572909617106316367216148512,
2711636196460318585955127494483972276879239064090689852809978361705086216958169367104329622890567955158961917611852516176654399246340379120409329566384160,
2711636396252606881918605860499354102197401318666579124151729671752374458560929422237113300739169875232495266727513833203360007861082211711747836501459040,
2685449071597530523833230885351500532369477539914318172159429043161052628696351016818586542171509728747070238075233795777242761861490021015910382103951968,
2685449271584547381638295372872027557715092296493457397817270817010861872186702795218797216694169625716749654321460983923962566367029011600112932108533792,
2685449071792640184514638654713547133316375160837810451952682241651988724244365461216285304336254942220323815140042850082680124299635209323646382761738272,
1847461275963134712629870519594779049860430827711272857522520377357653173694038204556169999876899727026751811340128091158803029889914422883922033917198368,
2711636396252606881901567718540735842607739343712295416931961674938924754114357607352250040524848697769853213132484145241805622979375000168935113673834592,
2711636396252606881901567718538101947732706353297593371282460773094032493492652041376662823635245997887100968237677157520342076957158825588198798784364576]
hash_list_2 = [1677246762479319235863065539858628614044010438213592493389244703420353559152336301659250128835190166728647823546464421558167523127086351613289685036466208,
1677246762479700308655934218233084077989052614799077817712715603728397519829375248244181345837838956827991047769168833176865438232999821278031784406056992,
1677246762479700314487411751526941880161990070273187005125752885368412445003620183982282356578440274746789782460884881633682918768578649732794162647826464,
1677246762479319238759152196394352786642547660315097253847095508872934279466872914748604884925141826161428241625796765725368284151706959618924400925900832,
1677246762479890853811162999308711253291696853123890392766127782305403145675433285374478727414572392743118524142664546768046227747593095585347134902140960,
1677246765601448867710925237522621090876591539557992237656925108430781026329148912958069241932475038282622646533152559554888274158032061637714105308528752,
1678883457783648388335228538833424204662395277995143067623864457726472665342252064374635323999849241968448535982901839797440478656657327613912450890367008,
1677246765601448864793634462245189770642489500950753120409198344054454862566173176691699195659218600616315451200851360013275424257209428603245704937128032,
1677246762479700314471974894075267160937462491405299015541470373650765401692659096424270522124311243007780041455682577230603077926878181390448030335795232,
1677246762479700317400446530288778920091525622772690226165317385340164047644547471081180880454458397836230795248631079659291423401151022423365062554976288,
1677246762479700317400446530288758590086745084806873060513679821541689120894219120403259478342385343805541797540566045409406476458247878183422733877936160,
2516871453405060707064684111867902766968378200849671168835363528433280949578746081906100803610196553501646503982070255639855643685380535999494563083255776,
1677246762479319230037086118512223039643232176451879100417048497454912234466993748113993020733268935613563596294183318283010061477487433484794582123053088,
1677246762479319235834673272207747972667132521699112379991979781620810490520617303678451683578338921267417975279632387450778387555221361833006151849902112,
1677246762479700305748490595643272813492272250002832996415474372704463760357437926852625171223210803220593114114602433734175731538424778624130491225112608]
def custom_metric(x, y):
return bin(int(x[0]) ^ int(y[0])).count('1')
objective_hash = hash_list_1[0]
complete_list = hash_list_1 + hash_list_2
distance = [custom_metric([objective_hash], [hash_point]) for hash_point in complete_list]
print("Function iteration distance:")
print(distance)
neighbors_model = NearestNeighbors(radius=100, algorithm='ball_tree',
leaf_size=2,
metric=custom_metric,
metric_params=None,
n_jobs=4)
X = [[x] for x in complete_list]
neighbors_model.fit(X)
distance, neighborhoods = neighbors_model.radius_neighbors(objective_hash, 100, return_distance=True)
print("Nearest Neighbors distance:")
print(distance)
print("Nearest Neighbors index:")
print(neighborhoods)
The problem:
Numpy can't handle numbers so big and converts them to float losing a lot of precision.
The solution:
Precompute with your custom metric all the distances and feed them to the DBSCAN algorithm.

Euclidean Distance in Tensorflow, convert matrix

I have a knn classification project, which needs to calculate euclidean distance with tensorflow for comparison.
The original code without tensorflow is like this:
def euclidean_distance(self,x1, x2):
distance = 0.0
for i in range(len(x1)):
distance += pow( x1[i] - x2[i], 2)
print(distance)
return math.sqrt(distance)
and with tensorflow is like this:
distance = 0.0
for i in range(len(x1)):
distance = tf.negative(tf.sqrt(tf.reduce_sum(tf.square(tf.subtract(x1, x2)))))
return distance
Is this right? Because of that code distance became tensor, and I need a method for converting that tensor into a normal matrix.
Any help is appreciated, thanks!
In order to get nd array(matrix) you need to run the graph like blow
session=tf.Session()
nd_distance=session.run(distance)
You have to change your code to
......
......
distance = tf.sqrt(tf.reduce_sum(tf.square(tf.subtract(x1, x2))))
nd_distance=session.run(distance)
print (nd_distance)
return nd_distance
I don't see the need for tf.negative function and for loop

Resources