Is there an accumarray equivalent in PyTorch?

Let's assume I have a feature tensor [f1, f2, f3, f4].
I want to pool the features according to an arbitrary index tensor (e.g. [0,2,1,0]).
Then the results would be [f1+f4, f3, f2].
I found that accumarray does what I want, but it is a MATLAB function.
Is there a similar function in PyTorch that preserves gradients for learning?

torch_scatter.scatter_add works perfectly.
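As a minimal sketch of the same pooling without the third-party torch_scatter package, core PyTorch's index_add can be used instead (this is a swapped-in alternative, not the function named above). The feature values and the names feats, index, and pooled are made up for illustration.

```python
import torch

# Four illustrative 2-d features f1..f4 (values are arbitrary).
feats = torch.tensor(
    [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]],
    requires_grad=True,
)
index = torch.tensor([0, 2, 1, 0])  # target group of each feature

# One output row per group; rows of `feats` are summed into row index[i].
num_groups = int(index.max()) + 1
pooled = torch.zeros(num_groups, feats.shape[1]).index_add(0, index, feats)

# The operation is differentiable, so gradients reach `feats`.
pooled.sum().backward()
```

Here pooled holds [f1+f4, f3, f2], and feats.grad is populated after the backward call.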

Related

Merging several convolutional layers into one

How can I compose several convolutional layers into one layer, assuming there are no non-linear activations in between? How do I write the code for it in PyTorch?
I want the code to account for different padding and strides. I thought about running the conv layers on a template image to obtain a single kernel, but I can't come up with a meaningful way to do it.
Here are detailed instructions for collapsing 2 convolution layers into 1.
You can use the code to merge the first two and then to merge the outcome with the third.
Conceptually, you can envision the process more simply by using the 'Toeplitz matrix' representation of each convolution operation: multiply all three matrices together, then convert the product back to a single convolution operation (which is implemented more efficiently, since the Toeplitz representation is very sparse).
The convolution operation can be constructed as a matrix multiplication, where one of the inputs is converted into a Toeplitz matrix.
You can see an example of this approach here.
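A minimal sketch of collapsing two layers, under simplifying assumptions that the question goes beyond: stride 1, no padding, no bias, and square kernels. It does not use the Toeplitz form explicitly; instead it builds the merged kernel by convolving the first kernel with the flipped second kernel, which should be equivalent in this restricted case. The shapes and names (w1, w2, merged) are illustrative.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(1, 3, 16, 16, dtype=torch.double)
w1 = torch.randn(5, 3, 3, 3, dtype=torch.double)  # layer 1: 3 -> 5 channels, 3x3
w2 = torch.randn(4, 5, 5, 5, dtype=torch.double)  # layer 2: 5 -> 4 channels, 5x5

# Treat w1 as a batch of "images" (one per input channel) and slide the
# spatially flipped w2 over it with full padding. The result, after swapping
# the first two axes back, is a single (4, 3, 7, 7) kernel.
k2 = w2.shape[-1]
merged = F.conv2d(
    w1.permute(1, 0, 2, 3),
    torch.flip(w2, dims=[2, 3]),
    padding=k2 - 1,
).permute(1, 0, 2, 3)

two_step = F.conv2d(F.conv2d(x, w1), w2)  # original two layers
one_step = F.conv2d(x, merged)            # merged layer
```

The merged kernel has size k1 + k2 - 1 per spatial dimension. Strides, padding, and biases need extra handling that this sketch leaves out.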

Is there any reason for using the word "column" in the context of a one-dimensional tensor?

Consider the following statements from the chapter named Tensors: Multidimensional arrays from the textbook titled Deep Learning with PyTorch by Eli Stevens et al.
Let’s construct our first PyTorch tensor and see what it looks like.
It won’t be a particularly meaningful tensor for now, just three ones
in a column:
# In[4]:
import torch
a = torch.ones(3)
a
In general, the notion of a column applies when there are at least two dimensions. The tensor initialized here has a single dimension, so I am guessing that it is immaterial whether we call it a row or a column.
Am I right? If not, is there any reason for using the word "column" in this context?
Often, in linear algebra theory, an n-dimensional vector is considered to be an n x 1 matrix, called a column vector.
Indeed, the behavior of a tensor t with shape (n,) is very similar to that of a tensor u of shape (n, 1). In mathematical terms, you can think of a vector t in R^n and a vector u in R^{n x 1}.
In short, the author is perhaps suggesting that the tensor be treated as a mathematical column vector.
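A small sketch of that similarity, using an arbitrary 2 x 3 matrix W chosen for illustration: multiplying W by the 1-D tensor and by its (n, 1) counterpart gives the same numbers, differing only in the shape of the result.

```python
import torch

t = torch.ones(3)      # shape (3,), the tensor from the book
u = t.unsqueeze(1)     # shape (3, 1), an explicit column vector

W = torch.arange(6.0).reshape(2, 3)  # illustrative 2 x 3 matrix

Wt = W @ t  # shape (2,)
Wu = W @ u  # shape (2, 1), same values as Wt
```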

Get exact formula used by Pytorch autograd to compute gradients

I am implementing a custom CNN with some custom modules in it. I have implemented only the forward pass for the custom modules and left their backward pass to autograd.
I have manually derived the backpropagation formulae for the parameters of the custom modules, and I would like to check whether they match the formulae autograd uses internally to compute the gradients.
Is there any way to see this?
Thanks
Edit (to add a test case):
I have a complex affine layer where the weights and inputs are complex-valued matrices, and the operation is a matrix multiplication of the weight and input matrices.
The multiplication of two complex numbers is given by:
(a+ib)(c+id) = (ac-bd)+i(ad+bc)
I computed the backpropagation formula for this layer given we have the incoming gradient from the higher layer.
It comes out to be dL/dI(n) = (hermitian(W(n))).matmul(dL/dI(n+1))
where I(n) and W(n) are the input and weight of nth layer and I(n+1) is input of (n+1)th layer.
So I wished to check whether autograd is also computing dL/dI(n) using the same formula that I derived.
(Since PyTorch doesn't support backpropagation through complex-valued tensors as of now, I have created my own representation of complex numbers using separate real and imaginary tensors.)
I don't believe there is such a feature in PyTorch, partly because the output would be quite unreadable. What you can do is implement a custom backward method for your layer using the formula you derived; then you know by design that the backpropagation is what you want.
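A minimal sketch of that suggestion, assuming the layer is a plain complex matrix product stored as separate real and imaginary tensors, and that the gradient of a real loss is packed as dL/dRe + i dL/dIm. The class name ComplexAffine and the tensor shapes are hypothetical. torch.autograd.gradcheck then compares the hand-written backward against numerical differentiation, which is one indirect way to confirm the derived formula.

```python
import torch

class ComplexAffine(torch.autograd.Function):
    """O = W @ I for complex W and I, each split into real and imaginary parts."""

    @staticmethod
    def forward(ctx, w_re, w_im, x_re, x_im):
        ctx.save_for_backward(w_re, w_im, x_re, x_im)
        # (a + ib)(c + id) = (ac - bd) + i(ad + bc), applied to matrices
        out_re = w_re @ x_re - w_im @ x_im
        out_im = w_re @ x_im + w_im @ x_re
        return out_re, out_im

    @staticmethod
    def backward(ctx, g_re, g_im):
        # g_re, g_im: gradient w.r.t. the output, i.e. dL/dI(n+1)
        w_re, w_im, x_re, x_im = ctx.saved_tensors
        # dL/dI(n) = hermitian(W(n)) @ dL/dI(n+1)
        gx_re = w_re.T @ g_re + w_im.T @ g_im
        gx_im = w_re.T @ g_im - w_im.T @ g_re
        # dL/dW(n) = dL/dI(n+1) @ hermitian(I(n))
        gw_re = g_re @ x_re.T + g_im @ x_im.T
        gw_im = g_im @ x_re.T - g_re @ x_im.T
        return gw_re, gw_im, gx_re, gx_im

torch.manual_seed(0)
shapes = [(3, 4), (3, 4), (4, 2), (4, 2)]  # W_re, W_im, I_re, I_im
inputs = tuple(
    torch.randn(s, dtype=torch.double, requires_grad=True) for s in shapes
)

# gradcheck raises an error if the analytic and numerical gradients disagree.
formulas_match = torch.autograd.gradcheck(ComplexAffine.apply, inputs)
```

Double precision is used because gradcheck's finite-difference comparison is too noisy in single precision.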

Compute a linear combination of tensors in Tensorflow

I am attempting to compute a linear combination of n tensors of the same dimension in Tensorflow. The scalar coefficients are Tensorflow Variables.
Since tf.scalar_mul does not generalise to multiplying a vector of tensors by a vector of scalars, I have thus far used tf.gather and performed each multiplication individually in a python for loop, and then converted the list of results to a tensor and summed them across the zeroth axis. Like so:
coefficients = tf.Variable(tf.constant(initial_value, shape=[n]))
components = []
for i in range(n):
    components.append(tf.scalar_mul(tf.gather(coefficients, i), tensors[i]))
combination = tf.reduce_sum(tf.convert_to_tensor(components), axis=0)
This works fine, but does not scale well at all. My application requires computing n linear combinations, meaning I have n^2 gather and multiply operations. With large values of n the computation time is poor and the memory usage of the program is unreasonably large.
Is there a more natural way of computing a linear combination like this in Tensorflow that would be faster and less resource intensive?
Use broadcasting. Assuming coefficients has shape (n,) and tensors shape (n,...) you can simply use
coefficients[:, tf.newaxis, ...] * tensors
Here, you would need to repeat tf.newaxis as many times as tensors has dimensions besides the one of size n. For example, if tensors has shape (n, a, b), you would use coefficients[:, tf.newaxis, tf.newaxis].
This will turn coefficients into a tensor with the same number of dimensions as tensors, but all dimensions except the first one are of size 1, so they can be broadcast to the shape of tensors. Summing the product over axis 0 with tf.reduce_sum then gives the linear combination.
Some alternatives:
Define coefficients as a variable with the correct number of dimensions in the first place (a little ugly in my opinion).
Use tf.reshape to reshape coefficients to (n, 1, ...) instead if you don't like the indexing syntax.
Use tf.transpose to shift the dimension of size n to the end of tensors. Then the dimensions align for broadcasting without needing to add dimensions to coefficients.
Also see the numpy docs on broadcasting -- it works essentially the same way in Tensorflow.
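A minimal sketch of the broadcasting approach for tensors of shape (n, a, b), compared against a loop-based reference. The sizes and coefficient values are arbitrary, and the name reference is introduced here only for the comparison.

```python
import tensorflow as tf

n, a, b = 3, 2, 4
tensors = tf.random.normal([n, a, b])
coefficients = tf.Variable(tf.constant([0.5, -1.0, 2.0]))  # shape (n,)

# Add two size-1 axes so coefficients has shape (n, 1, 1) and broadcasts
# against tensors; summing over axis 0 yields the linear combination.
weighted = coefficients[:, tf.newaxis, tf.newaxis] * tensors
combination = tf.reduce_sum(weighted, axis=0)  # shape (a, b)

# Loop-based reference, for comparison only.
reference = tf.add_n([coefficients[i] * tensors[i] for i in range(n)])
```

The broadcast version issues one multiply and one reduction instead of n gather and multiply operations.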
There is a new PyPI module called TWIT, Tensor Weighted Interpolative Transfer, that will do this fast. It is written in C for the core operations.

What is the equivalent of theano.tensor.clip in PyTorch?

I want to clip my tensor values (not gradients) to some range. Is there a function in PyTorch like theano.tensor.clip() in Theano?
The function you are looking for is called torch.clamp. You can find the documentation here.
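A short sketch of torch.clamp on an illustrative tensor, clipping values to the range [-1, 1]:

```python
import torch

x = torch.tensor([-2.0, -0.5, 0.0, 0.5, 2.0])

# Values below min are raised to min; values above max are lowered to max.
clipped = torch.clamp(x, min=-1.0, max=1.0)
```

Either bound can be omitted to clip on one side only.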
