Generic Computation of Distance Matrices in Pytorch [closed] - pytorch

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 2 years ago.
I have two tensors a & b of shape (m,n), and I would like to compute a distance matrix m using some distance metric d. That is, I want m[i][j] = d(a[i], b[j]). This is somewhat like cdist(a,b) but assuming a generic distance function d which is not necessarily a p-norm distance. Is there a generic way to implement this in PyTorch?
And a more specific side question: Is there an efficient way to perform this with the following metric
d(x,y) = 1 - cos(x,y)
edit
I've solved the specific case above using this answer:
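import torch

def metric(a, b, eps=1e-8):
    # Row-wise L2 norms, kept as column vectors so they broadcast against the rows
    a_norm, b_norm = a.norm(dim=1)[:, None], b.norm(dim=1)[:, None]
    # Normalize each row; eps guards against division by zero for all-zero rows
    a_norm = a / torch.max(a_norm, eps * torch.ones_like(a_norm))
    b_norm = b / torch.max(b_norm, eps * torch.ones_like(b_norm))
    # Entry (i, j) is the cosine similarity between a[i] and b[j]
    similarity_matrix = torch.mm(a_norm, b_norm.transpose(0, 1))
    return 1 - similarity_matrix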

I'd suggest using broadcasting: since a and b both have shape (m,n), you can compute
m = d(a[:, None, :], b[None, :, :])
so that m[i][j] = d(a[i], b[j]). Here d needs to operate on the last dimension, so for instance
def d(a, b):
    return 1 - (a * b).sum(dim=2) / a.pow(2).sum(dim=2).sqrt() / b.pow(2).sum(dim=2).sqrt()
(here I assume that cos(x,y) represents the normalized inner product between x and y)
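The broadcasting idea can be wrapped in a small helper that accepts any metric reducing over the last dimension. This is only a sketch: the names pairwise, cosine_distance and manhattan are made up for illustration, and the eps clamp is an assumption added to avoid division by zero.

```python
import torch

def pairwise(d, a, b):
    """Return M with M[i, j] = d(a[i], b[j]) for a of shape (m, n) and b of shape (k, n)."""
    # a[:, None, :] has shape (m, 1, n) and b[None, :, :] has shape (1, k, n),
    # so d sees every (a[i], b[j]) pair after broadcasting.
    return d(a[:, None, :], b[None, :, :])

def cosine_distance(x, y, eps=1e-8):
    # Reduce over the last (feature) dimension only.
    dot = (x * y).sum(dim=-1)
    norms = x.norm(dim=-1).clamp(min=eps) * y.norm(dim=-1).clamp(min=eps)
    return 1 - dot / norms

def manhattan(x, y):
    # A second metric, to show that the helper is not tied to the cosine case.
    return (x - y).abs().sum(dim=-1)

a = torch.randn(4, 3)
b = torch.randn(5, 3)
M = pairwise(cosine_distance, a, b)   # shape (4, 5)
```

Note that broadcasting materializes an intermediate of shape (m, k, n), so for large inputs a metric-specific formulation such as the matrix product in the question's edit will usually use much less memory.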

Related

How to multiply input given by user for metric system calculator [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
I am working on a metric system converter for class using python3 in cs50, but I am having some trouble. Basically, I want the user to input a number (value) and then choose the prefix of measurement (e.g. kilo, milli, micro, etc.); the program then multiplies or divides the value by a number to convert it into the requested unit of measurement. For example, if they wanted to convert centimeters to kilometers, I want the user to input, say, 200 centimeters, and then for a function to divide that by 100,000 to get 0.002 km and print it to the user. But I have no idea how to go about this. Any help would be appreciated.
I'm not trying to do your homework, but this should give you a first idea:
factor = {'km': 3,
'm': 0,
'dm': -1,
'cm': -2,
'mm': -3} # dictionary with powers, e.g. 1 km = 10**3 m
# For your example of 200 cm = 0.002 km, you type...
number = float(input('numerical value: ')) # 200
unit = input('unit: ') # cm
target_unit = input('target unit: ') # km
print(number * 10**factor[unit] / 10**factor[target_unit], target_unit)
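The same exponent table can be wrapped in a function so the conversion is separate from the input handling. This is a minimal sketch; the function name convert and the ValueError on unknown units are assumptions, not part of the answer above.

```python
# Powers of ten relative to the metre, e.g. 1 km = 10**3 m
FACTOR = {'km': 3, 'm': 0, 'dm': -1, 'cm': -2, 'mm': -3}

def convert(value, unit, target_unit):
    """Convert value from unit to target_unit using the exponent table."""
    try:
        shift = FACTOR[unit] - FACTOR[target_unit]
    except KeyError as err:
        raise ValueError(f"unknown unit: {err.args[0]}") from None
    # Multiplying by 10**shift covers both directions (shift may be negative).
    return value * 10.0 ** shift

print(convert(200, 'cm', 'km'), 'km')   # roughly 0.002 km
```

Working with the difference of exponents means adding a new prefix only requires one more dictionary entry.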
The QuantiPhy package will do this for you.
>>> from quantiphy import Quantity
>>> for v in '1MHz 10ug 1ps'.split():
...     q = Quantity(v)
...     print(v, q, 1/q, sep=', ')
1MHz, 1 MHz, 1e-06
10ug, 10 ug, 99999.99999999999
1ps, 1 ps, 1000000000000.0
For each case in this example the original string is printed along with the value of the quantity and its reciprocal. This was done to show that quantities are rendered naturally with SI scale factors and units, and that you can use them anywhere a float can be used.

Finding Shortest Distance Between Two Parallel Lines, With Arbitrary Point [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 9 years ago.
I need to write a reliable method to retrieve the answer to the following scenario...
Given a line segment AB and an arbitrary point C, how would I find the closest point to A on a line parallel to AB that passes through point C? ("Reliable" above refers to the algorithm's ability to find D while allowing the coordinates for A, B, and C to be completely arbitrary and unpredictable. I've run into quite a few solutions that I was not able to adapt to all possible scenarios, sadly...)
In the case of the data displayed in the picture below, how would I reliably find the x,y coordinates of D?
A = <425, 473>
B = <584, 533>
C = <371, 401>
D = <???, ???>
Knowing that AB and CD are parallel, that obviously means the slopes are the same.
I have tried many different formulas to no avail and have been working on this for weeks now. I'm stumped!
It's a minimization problem.
In general, the Euclidean distance between two points (A and B) in N dimensional space is given by
Dist(A,B) = sqrt((A1-B1)^2 + ... + (AN-BN)^2)
If you need to find the minimum distance between a space curve A(t) (a 1-dimensional object embedded in some N dimensional space) and a point B, then you need to solve this equation:
d Dist(A(t),B) / dt = 0 // (this is straightforward calculus: we're at either a minimum or maximum when the rate of change is 0)
and test that set of roots (t1, t2, etc) against the distance function to find which one yields the smallest value of Dist.
Now to find the equation for the parallel line passing through C in y=mx+b form:
m = (Ay - By)/(Ax-Bx)
b = Cy - m*Cx
Let's write this in space-curve form as D(t) = <t, m*t+b> and plug it into our formula from part 1:
Dist(D(t),A) = sqrt((t-Ax)^2 + (m*t+b-Ay)^2)
taking our derivative:
d Dist(D(t),A)/ dt = d sqrt((t-Ax)^2 + (m*t+b-Ay)^2) / dt
= (t + (m^2)*t - Ax + m*b - m*Ay)/sqrt(t^2 + (m^2)t^2 - 2*t*Ax + 2*m*t*b - 2*m*t*Ay + (Ax)^2 + (Ay)^2 + b^2 - 2*b*Ay )
= ((1+m^2)*t - Ax + m*b - m*Ay)/sqrt((1+m^2)*(t^2) + 2*t*(m*b - Ax - m*Ay) + (Ax)^2 + (Ay)^2 + b^2 - 2*b*Ay )
Setting this equal to 0 and solving for t yields:
t = (Ax-m*b+m*Ay)/(1+m^2)
as the only root (you can check this for yourself by substituting back in and verifying that everything cancels as desired).
Plugging this value of t back in to our space curve yields the following:
D=<(Ax-m*b+m*Ay)/(1+m^2),b+m*(Ax-m*b+m*Ay)/(1+m^2)>
You can then plug in your expressions for m and b if you want an explicit solution in terms of A, B, and C, or if you only want the numerical solution you can just compute it as a three-step process:
m = (Ay - By)/(Ax-Bx)
b = Cy - m*Cx
D=<(Ax-m*b+m*Ay)/(1+m^2),b+m*(Ax-m*b+m*Ay)/(1+m^2)>
This will be valid for all cases with parallel straight lines. One caveat when implementing it as a numerical (rather than analytical) code: if the lines are oriented vertically, calculating m = (Ay-By)/(Ax-Bx) will result in division by 0, which will make your code not work. You can throw in a safety valve as follows:
if( Ax == Bx) {
D = <Cx,Ay>
} else {
// normal calculation here
}
For serious numerical work, you probably want to implement that in terms of tolerances rather than a direct comparison due to roundoff errors and all that fun stuff (i.e., abs(Ax-Bx) < epsilon, rather than Ax==Bx)
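An alternative that avoids the slope altogether is to project the vector from C to A onto the direction of AB. This is a swapped-in technique (vector projection), not the calculus derivation above, and the function name is made up for illustration. It gives the same point D and needs no special case for vertical lines.

```python
def closest_point_on_parallel(a, b, c):
    """Point D on the line through c parallel to AB that is closest to a."""
    ux, uy = b[0] - a[0], b[1] - a[1]          # direction of AB
    uu = ux * ux + uy * uy
    if uu == 0:
        raise ValueError("A and B must be distinct points")
    # Parameter of the orthogonal projection of (a - c) onto the direction
    t = ((a[0] - c[0]) * ux + (a[1] - c[1]) * uy) / uu
    return (c[0] + t * ux, c[1] + t * uy)

# Data from the question
d = closest_point_on_parallel((425, 473), (584, 533), (371, 401))
print(d)   # approximately (442.05, 427.81)
```

Because only the direction vector is used, the single degenerate input is A == B, where no direction is defined.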

The best way to map correlation matrix from [-1, 1] space to [0, 1] space [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 10 years ago.
SO is warning me my question is likely to be closed, I hope they're wrong :)
My question: suppose you have a correlation matrix; you would like correlations close to 1 and -1 to go towards 1, while those close to 0 stay there.
The simplest way is to use absolute values, e.g. if Rho is your correlation matrix then you would use abs(Rho).
Is there any way which is theoretically more correct than the one above?
As an example: what if I use Normal p.d.f. instead of absolute value?
Adjusted Rho = N(Rho, mu = 0, sigma = stdev(Rho))
where N is the Normal p.d.f. function.
Have you any better way?
What are strengths and weaknesses of each method?
Thanks,
Try this.
x <- runif(min = -1, max = 1, n = 100)
tr <- (x - min(x))/diff(range(x))
plot(x)
points(tr, col = "red")
You could also use a logit link function (that is, its inverse, the logistic function), which guarantees the values to be between 0 and 1. But given that your inputs are limited to values between -1 and 1, you would only get values in the range of ~[0.27, 0.73].
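To compare the candidates side by side, here is a small sketch of the three mappings discussed. The function names are hypothetical, and the min-max version rescales with the theoretical bounds -1 and 1 rather than the sample minimum and maximum used in the R snippet. Only the absolute value sends both -1 and 1 to 1 while leaving 0 at 0; the other two are monotone, so -1 ends up at the low end.

```python
import math

def abs_map(r):
    # Strong correlations of either sign map towards 1, zero stays at 0.
    return abs(r)

def min_max_map(r, lo=-1.0, hi=1.0):
    # Linear rescaling of [lo, hi] onto [0, 1]; -1 maps to 0 and 0 maps to 0.5.
    return (r - lo) / (hi - lo)

def logistic_map(r):
    # Inverse of the logit; squeezes [-1, 1] into a narrow band around 0.5.
    return 1.0 / (1.0 + math.exp(-r))

for r in (-1.0, -0.5, 0.0, 0.5, 1.0):
    print(r, abs_map(r), min_max_map(r), round(logistic_map(r), 3))
```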

Get the position of the point that lies at 25% of a line? [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 10 years ago.
I have a line with 2 points. I know the distance between the 2 points. I also calculated the angle of the line.
My target is to get a point that lies at 25% of the line.
I calculate the y of this point with (dist/100)*25.
My only problem is calculating the x of the point. I suspect I have all the variables needed; I just can't seem to find how to calculate the x. Does anybody know this?
You have a segment (not a line) with endpoints P0 (coordinates x0,y0) and P1 (x1,y1). A new point P lies on this segment at distance |P0P| = 0.25 * |P0P1| if its coordinates are:
x = x0 + 0.25 * (x1-x0)
y = y0 + 0.25 * (y1-y0)
It's just simple vector maths, no need for any angles or trig here.
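startPos = (0, 0)
endPos = (10, 10)
fraction = 0.25
# Component-wise difference between the two endpoints
distX = endPos[0] - startPos[0]
distY = endPos[1] - startPos[1]
# Move the given fraction of the way from startPos towards endPos
pos = (startPos[0] + fraction * distX, startPos[1] + fraction * distY)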

Heavy tail distribution - Weibull [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 11 years ago.
I know that the Weibull distribution exhibits subexponential heavy-tailed behavior when the shape parameter is < 1. I need to demonstrate this using the limit definition of a heavy tailed distribution:
lim (x -> infinity) exp(t * x) * P(X > x) = infinity, for all t > 0
How do I incorporate the cumulative distribution function (CDF) or any other equation characteristic of the Weibull distribution to prove that this limit holds?
The CDF of the Weibull distribution is 1 - exp(-(x/lambda)^k) = P(X <= x).
So
P(X > x) = 1 - CDF = exp(-(x/lambda)^k),
and, for any t > 0 (writing t for the rate in the definition to keep it distinct from the Weibull scale lambda),
lim exp(t * x) * P(X > x) = lim exp(t * x) * exp( - (x/lambda)^k)
= lim exp(t * x - x^k/lambda^k)
Since k<1, x is large, t>0, and lambda>0, t x grows faster than x^k/lambda^k (the monomial with the greater exponent wins). In other words, the t x term dominates the x^k/lambda^k term, so t x - x^k/lambda^k grows without bound.
Thus, the limit goes to infinity.
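The dominance argument can be checked numerically by looking at the exponent itself, which avoids overflow in exp. In this sketch t is the rate from the limit definition and lam, k are the Weibull scale and shape; the function name and the parameter values are arbitrary choices for illustration.

```python
def log_scaled_tail(x, t, lam, k):
    """Logarithm of exp(t*x) * P(X > x) for a Weibull(scale=lam, shape=k) variable."""
    return t * x - (x / lam) ** k

t, lam, k = 0.01, 1.0, 0.5
values = [log_scaled_tail(x, t, lam, k) for x in (1e2, 1e4, 1e6, 1e8)]
for x, v in zip((1e2, 1e4, 1e6, 1e8), values):
    print(x, v)
```

Even with a small rate t, the exponent can start out negative but eventually turns positive and keeps growing, which is what the limit requires.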
