Solving linear equation systems with tensors in PyTorch

I have three tensors v_1, v_2 and v_3, each of shape n x 3. And three tensors v_1', v_2' and v_3', also each of shape n x 3. I want to compute a tensor which stores n 3 x 3 matrices M_i, each solving the equation system
M_i * v_1_i = v_1_i'
M_i * v_2_i = v_2_i'
M_i * v_3_i = v_3_i'
It is guaranteed that this has one solution by construction. I just need the calculation for the rotation matrices. I tried torch.linalg.solve, but I can't figure out how to reshape the tensors correctly.
Thanks for your help.
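One possible way to set this up, as a minimal sketch rather than a definitive answer: stacking the three vectors as the rows of an n x 3 x 3 tensor gives the transposed system V_i^T M_i^T = V_i'^T, which torch.linalg.solve handles as a batch. The helper name solve_batched_transforms and the random test data are assumptions made for illustration.

```python
import torch

def solve_batched_transforms(v1, v2, v3, v1p, v2p, v3p):
    """Return M of shape (n, 3, 3) with M[i] @ vk[i] == vkp[i] for k = 1, 2, 3."""
    # Rows of A[i] are v1[i], v2[i], v3[i], so A[i] = V_i^T; likewise B[i] = V'_i^T.
    A = torch.stack([v1, v2, v3], dim=1)      # (n, 3, 3)
    B = torch.stack([v1p, v2p, v3p], dim=1)   # (n, 3, 3)
    # M V = V'  <=>  V^T M^T = V'^T, so solve A X = B and transpose X per batch item.
    X = torch.linalg.solve(A, B)
    return X.transpose(1, 2)

# Hypothetical data: well-conditioned input vectors and known matrices M_true.
torch.manual_seed(0)
n = 8
M_true = torch.randn(n, 3, 3, dtype=torch.float64)
V_rows = torch.eye(3, dtype=torch.float64) + 0.1 * torch.randn(n, 3, 3, dtype=torch.float64)
v1, v2, v3 = V_rows.unbind(dim=1)             # each of shape (n, 3)
v1p, v2p, v3p = (torch.einsum('nij,nj->ni', M_true, v) for v in (v1, v2, v3))

M = solve_batched_transforms(v1, v2, v3, v1p, v2p, v3p)
```

The sketch assumes that each triple v_1_i, v_2_i, v_3_i is linearly independent, which the question says is guaranteed by construction.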

Related

Plotting a Line of Best Fit on the Same Plot for Multiple Datasets

I am trying to approximate a line of best fit between multiple datasets, and display everything on one plot. This question addresses a similar notion, but the contents are in MATLAB and, hence, not the same.
I have data from 4 different experiments, each composed of 146 values. The Y values represent changes in distance over time; the X values are integer timesteps (1, 2, 3, ...). The shape of my Y data is (4, 146), as I've decided to keep all of it in a nested list, and the shape of my X data is (146,). I have the following set-up for my subplots:
x = [i for i in range(len(temp[0]))]
fig = plt.figure()
ax1 = fig.add_subplot(111)
ax1.scatter(x,Y[0],c="blue", marker='.',linewidth=1)
ax1.scatter(x,Y[1],c="orange", marker='.',linewidth=1)
ax1.scatter(x,Y[2],c="green", marker='.',linewidth=1)
ax1.scatter(x,Y[3],c="purple", marker='.',linewidth=1)
z = np.polyfit(x,Y,3) # Throws an error because x,Y are not the same length
p = np.poly1d(z)
plt.plot(x, p(x))
I do not know how to fit a line of best fit between the scatter plots. numpy.polyfit documentation suggests that "Several data sets of sample points sharing the same x-coordinates can be fitted at once", but I have been unsuccessful thus far, and can only fit the line to one dataset. Is there a way that I can fit the line to all of the data sets? Should I use a different library entirely, like Seaborn?
Try casting x and Y to NumPy arrays (I assume they are lists), e.g. x = np.asarray(x). To fit all the data collectively, flatten the Y array with Y.flatten(), which changes its shape from (n, N) to (n*N,). Then tile the x array n times with np.tile(x, n), which copies it n times into a new array of shape (n*N,). This way each value in Y is matched to its corresponding value of x.
import numpy as np

N = 10 # no. datapoints
n = 4 # no. experiments
# creating some dummy data
x = np.linspace(0,1, N) # shape (N,)
Y = np.random.normal(0,1,(n, N)) # shape (n, N)
# one pooled fit: x repeated n times against all Y values
np.polyfit(np.tile(x, n), Y.flatten(), deg=3)
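The polyfit function expects the Y array to have shape (146, 4) in your case rather than (4, 146), so you should pass it the transpose of Y (converted to an array first, since a nested list has no .T), e.g.,
z = np.polyfit(x, np.asarray(Y).T, 3)
polyfit then returns one column of coefficients per dataset, so z has shape (4, 4) here: degree + 1 rows and one column per dataset. The poly1d function can only handle one polynomial at a time, so you have to loop over the columns of z, e.g.:
for res in z.T: # one coefficient vector per dataset
    p = np.poly1d(res)
    plt.plot(x, p(x))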
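To compare the two suggestions side by side, here is a small sketch on made-up data (the cubic trend, noise level and variable names are assumptions, and plotting is left out so that it runs with NumPy alone). It builds one pooled fit through all experiments and one fit per experiment.

```python
import numpy as np

rng = np.random.default_rng(0)
n, N = 4, 146                          # experiments, timesteps (as in the question)
x = np.arange(N)

# Hypothetical data: one shared cubic trend plus independent noise per experiment.
true_coeffs = np.array([1e-5, -2e-3, 0.1, 1.0])
Y = np.polyval(true_coeffs, x) + rng.normal(0.0, 0.05, size=(n, N))

# (a) One pooled fit through all experiments at once.
pooled = np.polyfit(np.tile(x, n), Y.ravel(), deg=3)    # shape (deg + 1,)
pooled_curve = np.polyval(pooled, x)

# (b) One fit per experiment; each column of the result belongs to one dataset.
per_set = np.polyfit(x, Y.T, deg=3)                     # shape (deg + 1, n)
curves = [np.poly1d(c)(x) for c in per_set.T]           # n fitted curves
```

Because all experiments share the same x values, the pooled curve coincides with the average of the per-experiment curves, so either route yields a single "line of best fit" for the whole plot.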

PyTorch doubly stochastic normalisation of 3D tensor

I'm trying to implement double stochastic normalisation of an N x N x P tensor as described in Section 3.2 in Gong, CVPR 2019. This can be done easily in the N x N case using matrix operations but I am stuck with the 3D tensor case. What I have so far is
def doubly_stochastic_normalise(E):
    """E: n x n x f"""
    E = E / torch.sum(E, dim=1, keepdim=True) # normalised across rows
    F = E / torch.sum(E, dim=0, keepdim=True) # normalised across cols
    E = torch.einsum('ijp,kjp->ikp', E, F)
    return E
but I'm wondering if there is a method without einsum.
In this setting, you can always fall back to torch.matmul (batched matrix multiplication, to be more precise). However, this requires you to transpose the axes. Recall that matrix multiplication of two 3D inputs is, in einsum notation:
bik,bkj->bij
Notice how the k dimension is reduced. To get to this setting, we need to transpose the inputs of the operator. In your case we have:
ijp ? kjp -> ikp
 ↓     ↓      ↑
pij @ pjk -> pik
This translates to:
>>> (E.permute(2,0,1) @ F.permute(2,1,0)).permute(1,2,0)
#    ijp ➝ pij          kjp ➝ pjk          pik ➝ ikp
Arguably, your method is not only shorter but also a lot more readable, so I would stick with torch.einsum. The einsum operator is so useful here because it performs the axis transpositions on the fly.
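As a quick sanity check of the permute-and-matmul route, the sketch below runs both variants on random positive input and compares them. The function names and tensor sizes are made up for illustration; the einsum version is the one from the question.

```python
import torch

def dsn_einsum(E):
    """Doubly stochastic normalisation of an n x n x p tensor, einsum version."""
    E = E / torch.sum(E, dim=1, keepdim=True)   # normalise across rows
    F = E / torch.sum(E, dim=0, keepdim=True)   # normalise across columns
    return torch.einsum('ijp,kjp->ikp', E, F)

def dsn_matmul(E):
    """Same computation via batched matmul over the last (channel) axis."""
    E = E / torch.sum(E, dim=1, keepdim=True)
    F = E / torch.sum(E, dim=0, keepdim=True)
    # ijp -> pij and kjp -> pjk, multiply per channel, then pik -> ikp
    return (E.permute(2, 0, 1) @ F.permute(2, 1, 0)).permute(1, 2, 0)

torch.manual_seed(0)
E = torch.rand(5, 5, 3, dtype=torch.float64) + 0.1   # strictly positive entries
out_einsum = dsn_einsum(E)
out_matmul = dsn_matmul(E)
row_sums = out_matmul.sum(dim=1)                     # per-channel row sums
```

For strictly positive input, the row sums of each channel should come out as 1, which is a handy check that the axes were permuted correctly.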

To calculate euclidean distance between vectors in a torch tensor with multiple dimensions

There is a random initialized torch tensor of the shape as below.
Inputs
tensor1 = torch.rand((4,2,3,100))
tensor2 = torch.rand((4,2,3,100))
tensor1 and tensor2 each hold 24 100-dimensional vectors (4 x 2 x 3 = 24).
I want to get a tensor of shape torch.Size([4,2,3]) by computing the Euclidean distance between the vectors at the same index in the two tensors.
I tried dist = torch.nn.functional.pairwise_distance(tensor1, tensor2) to get this result.
However, the pairwise_distance function calculates the Euclidean distance along the second dimension of the tensor, so the shape of dist is torch.Size([4,3,100]).
I have performed transpose several times to solve these problems. My code is as follows.
tensor1 = tensor1.transpose(1,3)
tensor2 = tensor2.transpose(1,3)
dist = torch.nn.functional.pairwise_distance(tensor1, tensor2)
dist = dist.transpose(1,2)
Is there a simpler or easier way to get the result I want?
Here ya go
dist = (tensor1 - tensor2).pow(2).sum(3).sqrt()
Basically that's what Euclidean distance is.
Subtract -> power by 2 -> sum along the unfortunate axis you want to eliminate-> square root
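The same recipe can also be written with a built-in norm. The sketch below compares the manual version with torch.linalg.vector_norm taken over the last axis, assuming a PyTorch release recent enough to ship torch.linalg.vector_norm.

```python
import torch

torch.manual_seed(0)
tensor1 = torch.rand((4, 2, 3, 100))
tensor2 = torch.rand((4, 2, 3, 100))

# Manual recipe: subtract, square, sum over the vector axis, square root.
manual = (tensor1 - tensor2).pow(2).sum(3).sqrt()

# Built-in 2-norm of the difference along the last axis.
builtin = torch.linalg.vector_norm(tensor1 - tensor2, dim=-1)
```

Both results have the shape of the leading three axes, so no transposes are needed.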

Vectorized implementation of field-aware factorization

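I would like to implement the field-aware factorization model (FFM) in a vectorized way. In FFM, a prediction is made by the following equation
$$\hat{y}(x) = \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle w_{i, f_j}, w_{j, f_i} \rangle \, x_i x_j$$
where n is the number of features, f_j is the field of feature j, and w are the embeddings that depend on the feature and the field of the other feature. For more info, see equation (4) in FFM.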
To do so, I have defined the following parameter:
import torch
W = torch.nn.Parameter(torch.Tensor(n_features, n_fields, n_factors), requires_grad=True)
Now, given an input x of size (batch_size, n_features), I want to be able to compute the previous equation. Here is my current (non-vectorized) implementation:
total_inter = torch.zeros(x.shape[0])
for i in range(n_features):
    for j in range(i + 1, n_features):
        temp1 = torch.mm(
            x[:, i].unsqueeze(1),
            W[i, feature2field[j], :].unsqueeze(0))
        temp2 = torch.mm(
            x[:, j].unsqueeze(1),
            W[j, feature2field[i], :].unsqueeze(0))
        total_inter += torch.sum(temp1 * temp2, dim=1)
Unsurprisingly, this implementation is horribly slow since n_features can easily be as large as 1000! Note however that most of the entries of x are 0. All inputs are appreciated!
Edit:
In case it helps, here are some implementations of this model in PyTorch:
pytorch-fm
ctr_model_zoo
Unfortunately, I cannot figure out exactly how they have done it.
Additional update:
I can now obtain the product of x and W in a more efficient way by doing:
temp = torch.einsum('ij, jkl -> ijkl', x, W)
Thus, my loop is now:
total_inter = torch.zeros(x.shape[0])
for i in range(n_features):
    for j in range(i + 1, n_features):
        temp1 = temp[:, i, feature2field[j], :]
        temp2 = temp[:, j, feature2field[i], :]
        total_inter += 0.5 * torch.sum(temp1 * temp2, dim=1)
It is, however, still too slow, since the loop runs for about 500,000 iterations.
Something that could potentially speed up the multiplication is using PyTorch sparse tensors.
Something else that might work is the following:
Create n arrays, one for each feature i, each holding that feature's corresponding field factors in its rows, e.g. for feature i = 0:
[ W[0, feature2field[0], :],
  W[0, feature2field[1], :],
  ...
  W[0, feature2field[n], :]]
Then calculate the product of those arrays (let's call them F) with X:
R[i] = F[i] * X
So each element of R holds the result, an array, of multiplying F[i] with X.
Next, multiply each R[i] with its transpose:
R[i] = R[i] * R[i].T
Now you can do the summation in a loop like before:
for i in range(n_features):
    total_inter += torch.sum(R[i], dim=1)
Please take this with a grain of salt, as I haven't tested it. In any case, I think it will point you in the right direction.
One problem that might occur is that in the transpose multiplication each element is also multiplied with itself and then added to the sum. I don't think it will affect the classifier, but in any case you can set the elements on and above the diagonal of the product to 0.
Also, although it is minor, please move the first unsqueeze operation out of the inner for loop.
I hope it helps.
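For reference, here is one possible fully vectorized formulation. It is an untested-at-scale sketch and not the method described in the answer above: it gathers W[i, field(j)] for every feature pair, takes the pairwise dot products, keeps only the pairs with i < j, and contracts with x. The function names, the small sizes, and the assumption that feature2field is a LongTensor are all illustrative. The question's first loop is included only to check the result.

```python
import torch

def ffm_interactions(x, W, feature2field):
    """Sum over i < j of <W[i, f(j)], W[j, f(i)]> * x_i * x_j for each row of x.

    x: (batch, n_features); W: (n_features, n_fields, n_factors);
    feature2field: LongTensor (n_features,) mapping each feature to its field.
    """
    A = W[:, feature2field, :]                # A[i, j] = W[i, field(j)], (F, F, k)
    pair = (A * A.transpose(0, 1)).sum(-1)    # pair[i, j] = <W[i, f(j)], W[j, f(i)]>
    pair = torch.triu(pair, diagonal=1)       # keep only pairs with i < j
    return torch.einsum('bi,ij,bj->b', x, pair, x)

def ffm_interactions_loop(x, W, feature2field):
    """Reference: the double loop from the question (first version)."""
    total = torch.zeros(x.shape[0], dtype=x.dtype)
    n_features = x.shape[1]
    for i in range(n_features):
        for j in range(i + 1, n_features):
            t1 = x[:, i].unsqueeze(1) * W[i, feature2field[j], :].unsqueeze(0)
            t2 = x[:, j].unsqueeze(1) * W[j, feature2field[i], :].unsqueeze(0)
            total += torch.sum(t1 * t2, dim=1)
    return total

# Hypothetical small problem to compare both versions.
torch.manual_seed(0)
batch, n_features, n_fields, n_factors = 3, 6, 3, 4
x = torch.rand(batch, n_features, dtype=torch.float64)
W = torch.randn(n_features, n_fields, n_factors, dtype=torch.float64)
feature2field = torch.randint(0, n_fields, (n_features,))

fast = ffm_interactions(x, W, feature2field)
slow = ffm_interactions_loop(x, W, feature2field)
```

The intermediate tensor A has n_features x n_features x n_factors entries, so for 1000 features memory grows quickly with the number of factors; whether that trade-off is acceptable depends on the setup.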

How to set up the number of input neurons in sklearn MLPClassifier?

Given a dataset of n samples and m features, and using sklearn.neural_network.MLPClassifier, how can I set hidden_layer_sizes to start with m inputs? For instance, I understand that hidden_layer_sizes=(10,10) means there are 2 hidden layers of 10 neurons (i.e., units) each, but I don't know if this also implies 10 inputs.
Thank you
This classifier/regressor, as implemented, does this automatically when fit is called.
This can be seen in its code.
Excerpt:
n_samples, n_features = X.shape

# Ensure y is 2D
if y.ndim == 1:
    y = y.reshape((-1, 1))

self.n_outputs_ = y.shape[1]

layer_units = ([n_features] + hidden_layer_sizes +
               [self.n_outputs_])
As you can see, the hidden_layer_sizes you pass is surrounded by layer dimensions derived from your data within .fit(). This is why the documented length of the parameter is n_layers - 2:
Parameters
hidden_layer_sizes : tuple, length = n_layers - 2, default (100,)
    The ith element represents the number of neurons in the ith hidden layer.
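This can be checked on a fitted model by looking at the shapes of the weight matrices in coefs_. The sketch below uses made-up data with m = 7 features; the data, labels and sizes are assumptions chosen purely for illustration.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
n, m = 200, 7                                # hypothetical sample and feature counts
X = rng.normal(size=(n, m))
y = (X[:, 0] + X[:, 1] > 0).astype(int)      # hypothetical binary labels

clf = MLPClassifier(hidden_layer_sizes=(10, 10), max_iter=500, random_state=0)
clf.fit(X, y)

# One weight matrix per connection between consecutive layers.
weight_shapes = [w.shape for w in clf.coefs_]
n_layers = clf.n_layers_                     # input + hidden layers + output
```

The first entry of weight_shapes has m rows, so the input width is taken from X and never needs to be listed in hidden_layer_sizes.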

Resources