How to find the L1-Norm/Manhattan distance between two vectors in Python without libraries - python-3.x

I have two vectors of equal dimensions and need to find the Manhattan distance between them.
I have tried various approaches:
sum([a-b for a, b in zip(u, v)])

c = sum([a-b for a, b in zip(u, v)])
# If c is negative, multiply by negative one to convert c to a positive
if c <= 0:
    return c*-1
# No changes are made to c if it is positive
else:
    return c
I am yet to have success!

You want to use the abs() function, which is available in standard Python.
So if you have
a = [1,2,3,4,5,.4]
b = [4,3,4,5,-2,.8]
Then you can get the distance with
sum([abs(i-j) for i,j in zip(a,b)])
We can use the sklearn implementation to check that this is indeed the correct answer.
from sklearn.metrics.pairwise import manhattan_distances
manhattan_distances([a], [b])
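If you want to avoid libraries entirely, here is a minimal sketch of a reusable helper (the name manhattan_distance is just for illustration), assuming u and v are equal-length sequences of numbers:

def manhattan_distance(u, v):
    """Sum of absolute coordinate-wise differences (the L1 norm of u - v)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return sum(abs(i - j) for i, j in zip(u, v))

print(manhattan_distance([1, 2, 3, 4, 5, .4], [4, 3, 4, 5, -2, .8]))   # 13.4 (up to float rounding)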

Related

taking the norm of 3 vectors in python

This is probably a stupid question, but for some reason I can't get the norm of three matrices of vectors.
Each vector in the x matrix represents the x coordinate of a sensor (8 sensors total) for three different experiments. Same for y and z.
ex:
x = [array([ 2.239, 3.981, -8.415, 33.895, 48.237, 52.13 , 60.531, 56.74 ]), array([ 2.372, 6.06 , -3.672, 3.704, -5.926, -2.341, 35.667, 62.097])]
y = [array([ 18.308, -17.83 , -22.278, -99.67 , -121.575, -116.794,-123.132, -127.802]), array([ -3.808, 0.974, -3.14 , 6.645, 2.531, 7.312, -129.236, -112. ])]
z = [array([-1054.728, -1054.928, -1054.928, -1058.128, -1058.928, -1058.928, -1058.928, -1058.928]), array([-1054.559, -1054.559, -1054.559, -1054.559, -1054.559, -1054.559, -1057.959, -1058.059])]
I tried doing:
norm= np.sqrt(np.square(x)+np.square(y)+np.square(z))
x = x/norm
y = y/norm
z = z/norm
However, I'm pretty sure it's wrong. When I then try to sum the components, for example np.sum(x[0]), I don't get anywhere close to 1.
Normalization does not make the sum of the components equal to one. Normalization makes the norm of the vector equal to one. You can check if your code worked by taking the norm (square root of the sum of the squared elements) of the normalized vector. That should equal 1.
Your code is working as intended, but not for your application. You could define a function to normalize any vector that you pass to it, much as you did in your program, as follows:
def normalize(vector):
    norm = np.sqrt(np.sum(np.square(vector)))
    return vector/norm
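For instance, reusing normalize with a made-up 3-element vector whose norm is easy to check:

import numpy as np

v = np.asarray([3.0, 4.0, 0.0])
print(normalize(v))                              # [0.6 0.8 0. ]
print(np.sqrt(np.sum(np.square(normalize(v)))))  # 1.0 (up to floating-point rounding)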
However, because x, y, and z each have 8 elements, you can't normalize x with the components from x, y, and z.
What I think you want to do is normalize the vector (x,y,z) for each of your 8 sensors. So, you should pass 8 vectors, (one for each sensor) into the normalize function I defined above. This might look something like this:
normalized_vectors = []
for i in range(8):
    # one (x, y, z) vector per sensor
    vector = np.asarray([x[i], y[i], z[i]])
    normalized_vectors.append(normalize(vector))
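A vectorized alternative (a sketch, assuming x, y and z are the lists of per-experiment NumPy arrays shown in the question): stack the coordinates into an (experiments, 3, sensors) array and normalize each (x, y, z) column along the 3-element axis.

import numpy as np

coords = np.stack([np.asarray(x), np.asarray(y), np.asarray(z)], axis=1)  # shape (2, 3, 8)
norms = np.linalg.norm(coords, axis=1, keepdims=True)                     # one norm per sensor and experiment
unit_vectors = coords / norms                                             # each (x, y, z) column now has norm 1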

multiply many matrices and many vectors pytorch

I am trying to multiply the following:
A batch of matrices N x M x D
A batch of vectors N x D x 1
To get a result: N x M x 1
as if I were doing N products of an (M x D) matrix with a (D x 1) vector.
I can't seem to find the correct function in PyTorch.
torch.bmm, as far as I can tell, only works for a batch of vectors and a single matrix. If I have to use torch.einsum then so be it, but I'd rather not!
It's pretty straightforward and intuitive with einsum:
torch.einsum('ijk, ikl->ijl', mats, vecs)
But your operation is just:
mats @ vecs
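A quick shape check (a sketch with made-up sizes N=4, M=3, D=5; torch.bmm gives the same result here, since both operands are batched 3-D tensors):

import torch

N, M, D = 4, 3, 5
mats = torch.randn(N, M, D)   # batch of N matrices, each M x D
vecs = torch.randn(N, D, 1)   # batch of N column vectors, each D x 1

out = mats @ vecs             # batched matrix-vector product
print(out.shape)              # torch.Size([4, 3, 1])
print(torch.allclose(out, torch.bmm(mats, vecs)))  # True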

user def. function modifying argument though it is not supposed to

Just for practice, I am using nested lists (for example, [[1, 0], [0, 1]] is the 2*2 identity matrix) as matrices. I am trying to compute the determinant by reducing the matrix to upper triangular form and then multiplying its diagonal entries. To do this:
"""adds two matrices"""
def add(A, B):
S = []
for i in range(len(A)):
row = []
for j in range(len(A[0])):
row.append(A[i][j] + B[i][j])
S.append(row)
return S
"""scalar multiplication of matrix with n"""
def scale(n, A):
return [[(n)*x for x in row] for row in A]
def detr(M):
Mi = M
#the loops below are supossed to convert Mi
#to upper triangular form:
for i in range(len(Mi)):
for j in range(len(Mi)):
if j>i:
k = -(Mi[j][i])/(Mi[i][i])
Mi[j] = add( scale(k, [Mi[i]]), [Mi[j]] )[0]
#multiplies diagonal entries of Mi:
k = 1
for i in range(len(Mi)):
k = k*Mi[i][i]
return k
Here, you can see that I have set M (the argument) equal to Mi and then operated on Mi to take it to upper triangular form. So, M is supposed to stay unmodified. But after using detr(A), print(A) prints the upper triangular matrix. I tried:
setting X = M, then Mi = X
defining kill(M): return M and then setting Mi = kill(M)
But these approaches are not working. This was causing some problems as I was trying to use detr(M) in another function, problems which I was able to bypass, but why is this happening? What is the interpreter doing here, and why was M modified even though I operated only on Mi?
(I am using Spyder 3.3.2, Python 3.7.1)
(I am sorry if this question is silly, but I have only started learning python and new to coding in general. This question means a lot to me because I still don't have a deep understanding of this language.)
See the Python documentation about assignment and copying:
https://docs.python.org/3/library/copy.html
Assignment statements in Python do not copy objects, they create bindings between a target and an object. For collections that are mutable or contain mutable items, a copy is sometimes needed so one can change one copy without changing the other.
You need to import copy and then use Mi = copy.deepcopy(M)
See also
How to deep copy a list?
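For completeness, a minimal illustration of the difference, using the 2*2 identity example from the question:

import copy

M = [[1, 0], [0, 1]]
alias = M                   # plain assignment: both names refer to the same list
alias[0][0] = 99
print(M)                    # [[99, 0], [0, 1]] -- M was modified through alias

M = [[1, 0], [0, 1]]
Mi = copy.deepcopy(M)       # an independent copy, including the nested lists
Mi[0][0] = 99
print(M)                    # [[1, 0], [0, 1]] -- M is unchanged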

How to calculate modulus value for a set of values in python?

In one material I found a formula to calculate precision, shown below:
Precision = |a ∩ b| / |a|
Here a and b are sets of values. After much searching on the internet I found that modulus means either the remainder or the absolute value. Here I take modulus as the absolute value, and my Python code for the above formula is as below.
import numpy as np
def intersection(lst1, lst2):
    return list(set(lst1) & set(lst2))
a = [7,21]
b = [11, 7, 27, 21]
a_intersect_b=intersection(a,b)
print(" a_intersect_b : ",a_intersect_b)
mod_a_intersect_b=[abs(x) for x in a_intersect_b]
print("|a_intersect_b| : ",mod_a_intersect_b)
mod_a=[abs(x) for x in a]
print("|a| : ",mod_a)
numerator=np.array(mod_a_intersect_b, dtype=np.float)
denominator=np.array(mod_a, dtype=np.float)
print(" mod_a_intersect_b / mod_a : ", numerator/denominator)
Here I get two output values, but in the material, and in general, precision is a single value. If the list size increases, the number of output values also increases. Finally I realised that I misunderstood the meaning of modulus here. Guide me to get a single precision value as per the above formula. Thanks in advance.
Note: In the formula a and b are sets of values, so I used lists in my code. Also guide me if there is another way to represent sets of values in Python so that I can get a single precision value.
As #Hoog mentioned in his comment, the modulus notation in the case of precision means the cardinality of a set (just the number of elements in the set), so you can define precision as follows:
def precision(a, b):
    """
    a: set, relevant items
    b: set, retrieved items
    returns: float, precision value
    """
    return len(a & b) / len(a)
len(a) returns the number of elements of the set, i.e. its cardinality, the |a| operation.
If a and b are lists, just wrap them in sets first:
def precision(a, b):
    """
    a: set, relevant items
    b: set, retrieved items
    returns: float, precision value
    """
    a, b = set(a), set(b)
    return len(a & b) / len(a)
Also, in data science and related areas, precision is a metric that calculates the ratio 'true positives' / ('true positives' + 'false positives'). It's the same thing described in other terms, but standard implementations of precision won't help you here.
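Applied to the lists from the question (using the list-friendly version above), this gives a single value:

a = [7, 21]
b = [11, 7, 27, 21]
print(precision(a, b))   # |a & b| / |a| = 2 / 2 = 1.0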

math.sqrt function python gives same result for two different values [duplicate]

Why does the math module return the wrong result?
First test
from math import sqrt

A = 12345678917
print 'A =', A
B = sqrt(A**2)
print 'B =', int(B)
Result
A = 12345678917
B = 12345678917
Here, the result is correct.
Second test
A = 123456758365483459347856
print 'A =',A
B = sqrt(A**2)
print 'B =',int(B)
Result
A = 123456758365483459347856
B = 123456758365483467538432
Here the result is incorrect.
Why is that the case?
Because math.sqrt(..) first converts the number to a floating-point value, and floating-point values have a limited mantissa: they can only represent the leading part of a large integer exactly. So float(A**2) is not equal to A**2. The square root it then computes is also only approximately correct.
Most functions working with floating-point values will not exactly match their integer counterparts; floating-point calculations are inherently approximate.
If one calculates A**2 one gets:
>>> 12345678917**2
152415787921658292889L
Now if one converts it to a float(..), one gets:
>>> float(12345678917**2)
1.5241578792165828e+20
But if you now ask whether the two are equal:
>>> float(12345678917**2) == 12345678917**2
False
So information has been lost while converting it to a float.
You can read more about how floats work and why they are approximate in the Wikipedia article about IEEE 754, the formal definition of how floating-point numbers work.
The documentation for the math module states "It provides access to the mathematical functions defined by the C standard." It also states "Except when explicitly noted otherwise, all return values are floats."
Those together mean that the parameter to the square root function is a float value. In most systems that means a floating point value that fits into 8 bytes, which is called "double" in the C language. Your code converts your integer value into such a value before calculating the square root, then returns such a value.
However, the 8-byte floating point value can store at most 15 to 17 significant decimal digits. That is what you are getting in your results.
If you want better precision in your square roots, use a function that is guaranteed to give full precision for an integer argument. Just do a web search and you will find several. Those usually do a variation of the Newton-Raphson method to iterate and eventually land on the correct answer. Be aware that this is significantly slower than the math module's sqrt function.
Here is a routine that I modified from the internet. I can't cite the source right now. This version also works for non-integer arguments but just returns the integer part of the square root.
def isqrt(x):
    """Return the integer part of the square root of x, even for very
    large values."""
    if x < 0:
        raise ValueError('square root not defined for negative numbers')
    n = int(x)
    if n == 0:
        return 0
    a, b = divmod(n.bit_length(), 2)
    x = (1 << (a+b)) - 1
    while True:
        y = (x + n//x) // 2
        if y >= x:
            return x
        x = y
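On Python 3.8 and newer, the standard library already provides an exact integer square root, math.isqrt, so the hand-rolled routine above is only needed on older versions. A quick check against the value from the question:

import math

A = 123456758365483459347856
print(math.isqrt(A**2) == A)   # True: exact for arbitrarily large integers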
If you want to calculate sqrt of really large numbers and you need exact results, you can use sympy:
import sympy
num = sympy.Integer(123456758365483459347856)
print(int(num) == int(sympy.sqrt(num**2)))
The way floating-point numbers are stored in memory makes calculations with them prone to slight errors that can nevertheless be significant when exact results are needed. As mentioned in one of the comments, the decimal library can help you here:
>>> from decimal import Decimal
>>> A = Decimal(123456758365483459347856)
>>> A
Decimal('123456758365483459347856')
>>> B = A.sqrt()**2
>>> B
Decimal('123456758365483459347856.0000')
>>> A == B
True
>>> int(B)
123456758365483459347856
I use Python 3.6, which has no hard-coded limit on the size of integers. I don't know whether, in 2.7, casting B to an int would cause overflow, but decimal is incredibly useful regardless.
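If you instead want to take the square root of the already-squared value, you may need to raise the context precision so that no significant digits are lost; a sketch:

from decimal import Decimal, getcontext

getcontext().prec = 50              # enough digits for the 48-digit square
A = Decimal(123456758365483459347856)
B = (A**2).sqrt()
print(int(B) == int(A))             # True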
