Compute a convolution with weights that where computed by another function - pytorch

I would like to perform a conv2d but a special one because I compute the weights of a conv2d layer in a function for each sample. Can I simply assign the weights of a given layer for each sample?
I am implementing the Spatially variant convolution of this paper : https://arxiv.org/abs/1804.00389
Thanks

It is actually very simple : torch.nn.functional.conv2d !

Related

Do I need to apply the Softmax Function ANYWHERE in my multi-class classification Model?

I am currently turning my Binary Classification Model to a multi-class classification Model. Bare with me.. I am very knew to pytorch and Machine Learning.
Most of what I state here, I know from the following video.
https://www.youtube.com/watch?v=7q7E91pHoW4&t=654s
What I read / know is that the CrossEntropyLoss already has the Softmax function implemented, thus my output layer is linear.
What I then read / saw is that I can just choose my Model prediction by taking the torch.max() of my model output (Which comes from my last linear output. This feels weird because I Have some negative outputs and i thought I need to apply the SOftmax function first, but It seems to work right without it.
So know the big confusing question I have is, when would I use the Softmax function? Would I only use it when my loss doesnt have it implemented? BUT then I would choose my prediction based on the outputs of the SOftmax layer which wouldnt be the same as with the linear output layer.
Thank you guys for every answer this gets.
For calculating the loss using CrossEntropy you do not need softmax because CrossEntropy already includes it. However to turn model outputs to probabilities you still need to apply softmax to turn them into probabilities.
Lets say you didnt apply softmax at the end of you model. And trained it with crossentropy. And then you want to evaluate your model with new data and get outputs and use these outputs for classification. At this point you can manually apply softmax to your outputs. And there will be no problem. This is how it is usually done.
Traning()
MODEL ----> FC LAYER --->raw outputs ---> Crossentropy Loss
Eval()
MODEL ----> FC LAYER --->raw outputs --> Softmax -> Probabilites
Yes you need to apply softmax on the output layer. When you are doing binary classification you are free to use relu, sigmoid,tanh etc activation function. But when you are doing multi class classification softmax is required because softmax activation function distributes the probability throughout each output node. So that you can easily conclude that the output node which has the highest probability belongs to a particular class. Thank you. Hope this is useful!

Pytorch how use a linear activation function

In Keras, I can create any network layer with a linear activation function as follows (for example, a fully-connected layer is taken):
model.add(keras.layers.Dense(outs, input_shape=(160,), activation='linear'))
But I can't find the linear activation function in the PyTorch documentation. ReLU is not suitable, because there are negative values in my sample. How do I create a layer with a linear activation function in PyTorch?
If you take a look at the Keras documentation, you will see tf.keras.layers.Dense's activation='linear' corresponds to the a(x) = x function. Which means no non-linearity.
So in PyTorch, you just define the linear function without adding any activation layer:
torch.nn.Linear(160, outs)
activation='linear' is equivavlent to no activation at all.
As can be seen here, it is also called "passthrough", meaning the it does nothing.
So in pytorch you can simply not apply any activation at all, to be in parity.
However, as already told by #Minsky, hidden layer without real activation, i.e. some non-linear activation is useless. It is like changing the weights which is anyway done during the network taining.
As already answered you don't need a linear activation layer in pytorch. But if you need to include it, you can write a custom one, that passes the output as follows.
class linear(torch.nn.Module):
# a linear activation function based on y=x
def forward(self, output):return output
The you can call it like any other activation function.
linear()(torch.tensor([1,2,3])) == nn.ReLU()(torch.tensor([1,2,3]))

Keras custom layer function

I am following the self attention in Keras in the following link: How to add attention layer to a Bi-LSTM
I am new to python , what does the shape=(input_shape[-1],1) in self.add_weight and shape=(input_shape[1],1) in bias means?
The shape argument sets the expected input dimensions which the model will be fed. In your case, it is just going to be whatever the last dimension of the input shape is for the weight layer and the second dimension of the input shape for the bias layer.
Neural networks take in inputs of fixed size so while building a model, it is important that you hard code the input dimensions for each layer.

What is the difference between density of input layer and input_dim in Keras library?

I have a question about the terms of MLP in Keras.
what does the density of a layer mean?
is it the same as the number of neurons? if it is, so what's the role of input_dim?
I have never head of the "density" of a layer in the context of vanilla feed forward networks. I would assume it refers to the number of neurons, but really it depends on context.
Input layer with a certain dimension and the first hidden layer with input_dim argument are both equivalent ways to handle input in Keras.

LSTM with variable sequences & return full sequences

How can I set up a keras model such that the final LSTM layer outputs a prediction for each time step while having variable sequence lengths as input?
I'd then like to provide labels for each of the timesteps after a dense layer with linear activation.
When I try to add a reshape or a dense layer to the LSTM model that is returning the full sequence and has a masking layer to take care of variable sequence lengths, it says:
The reshape and the dense layers do not support masking.
Would this be possible to do?
You can use the TimeDistributed layer wrapper for this. This applies the layer you want to each timestep. In your case, you could also just use TimeDistributedDense.

Resources