Keras custom layer function

I am following the self-attention in Keras example from the following link: How to add attention layer to a Bi-LSTM.
I am new to Python. What do shape=(input_shape[-1], 1) in self.add_weight and shape=(input_shape[1], 1) in the bias mean?

The shape argument sets the dimensions of the weight tensor being created, and those dimensions are derived from the shape of the input the layer will be fed. In your case, the weight uses the last dimension of the input shape (the feature dimension) and the bias uses the second dimension (the time-step dimension).
Neural networks take inputs of fixed size, so while building a model it is important that you hard-code the input dimensions for each layer.
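A minimal sketch of what the layer from the linked example looks like (the names and the tanh/softmax scoring are assumptions based on that answer), so you can see where the two shapes come from:

```python
import tensorflow as tf
from tensorflow.keras.layers import Layer

class Attention(Layer):
    """For input of shape (batch, timesteps, features):
    W has shape (features, 1)  -> shape=(input_shape[-1], 1)
    b has shape (timesteps, 1) -> shape=(input_shape[1], 1)
    """
    def build(self, input_shape):
        self.W = self.add_weight(name="att_weight",
                                 shape=(input_shape[-1], 1),
                                 initializer="normal")
        self.b = self.add_weight(name="att_bias",
                                 shape=(input_shape[1], 1),
                                 initializer="zeros")
        super().build(input_shape)

    def call(self, x):
        e = tf.tanh(tf.matmul(x, self.W) + self.b)   # (batch, timesteps, 1) attention scores
        a = tf.nn.softmax(e, axis=1)                 # attention weights over timesteps
        return tf.reduce_sum(x * a, axis=1)          # weighted sum -> (batch, features)
```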

Related

Keras CNN that takes a matrix (single-channel) input, where the layer between the input and the first convolution is an MLP

Using Keras, I want to model a CNN that takes a matrix (single-channel) input, but the layer between the input and the first convolutional layer is an MLP. The convolution is applied to the outputs of that first hidden layer.
How can I code this?
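A minimal sketch (not from the original thread) of one way to express that architecture; the input size, hidden width, and number of output classes are illustrative assumptions:

```python
from tensorflow.keras import layers, models

inputs = layers.Input(shape=(32, 32))                 # single-channel matrix input
x = layers.Flatten()(inputs)                          # flatten for the MLP
x = layers.Dense(1024, activation="relu")(x)          # first hidden layer (MLP)
x = layers.Reshape((32, 32, 1))(x)                    # back to an image-like tensor
x = layers.Conv2D(16, (3, 3), activation="relu")(x)   # convolution over the MLP output
x = layers.Flatten()(x)
outputs = layers.Dense(10, activation="softmax")(x)
model = models.Model(inputs, outputs)
```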

What is the difference between the density of the input layer and input_dim in the Keras library?

I have a question about MLP terminology in Keras.
What does the density of a layer mean?
Is it the same as the number of neurons? If it is, then what is the role of input_dim?
I have never heard of the "density" of a layer in the context of vanilla feed-forward networks. I would assume it refers to the number of neurons, but it really depends on context.
Declaring an Input layer with a certain dimension and passing the input_dim argument to the first hidden layer are equivalent ways to handle input in Keras.
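A small sketch of the equivalence, assuming 10 input features:

```python
from tensorflow.keras import layers, models

# Option 1: give the first hidden layer an input_dim.
model_a = models.Sequential([
    layers.Dense(64, activation="relu", input_dim=10),
    layers.Dense(1),
])

# Option 2: declare an explicit Input layer with the same shape.
inputs = layers.Input(shape=(10,))
x = layers.Dense(64, activation="relu")(inputs)
outputs = layers.Dense(1)(x)
model_b = models.Model(inputs, outputs)

# Both models expect input of shape (batch_size, 10).
```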

Can I use a 3D input on a Keras Dense Layer?

As an exercise I need to use only dense layers to perform text classification. I want to leverage word embeddings; the issue is that the dataset is then 3D (samples, words per sentence, embedding dimension). Can I feed a 3D dataset into a dense layer?
Thanks
As stated in the Keras documentation, you can use 3D (or higher-rank) data as input for a Dense layer, but the input gets flattened first:
Note: if the input to the layer has a rank greater than 2, then it is flattened prior to the initial dot product with kernel.
This means that if your input has shape (batch_size, sequence_length, dim), the dense layer is applied along the last axis: conceptually the data is flattened to shape (batch_size * sequence_length, dim), passed through the dense layer as usual, and reshaped back, so the output has shape (batch_size, sequence_length, hidden_units). This is the same as applying a Conv1D layer with kernel size 1, and it might be more explicit to use a Conv1D layer instead of a Dense layer.
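A quick shape check illustrating this (the batch size, sequence length, and dimensions are arbitrary examples):

```python
import numpy as np
from tensorflow.keras import layers

x = np.random.rand(8, 20, 50).astype("float32")   # (batch_size, sequence_length, dim)

dense_out = layers.Dense(64)(x)                   # Dense applied per timestep
conv_out = layers.Conv1D(64, kernel_size=1)(x)    # equivalent Conv1D formulation

print(dense_out.shape)   # (8, 20, 64)
print(conv_out.shape)    # (8, 20, 64)
```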

Keras LSTM error: Input from layer reshape is incompatible with layer lstm

Using RapidMiner I want to implement an LSTM to classify patterns in a time series. The input data is a flat table. My first layer in the Keras operator is a core Reshape from exampleset_length x nr_of_attributes to batch x time-steps x features. In the reshape parameter I explicitly enter three figures because I want a specific number of features and time-steps. The only way to achieve this is to also specify the batch size, so three figures in total. But when I add an RNN LSTM layer, an error is returned: Input is incompatible with layer lstm expected ndim=n found ndim=n+1. What's wrong?
When specifying 'input_shape' for the LSTM layer, you do not include the batch size.
So your 'input_shape' value should be (timesteps, input_dim).
Source: Keras RNN Layer, the parent layer for LSTM
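A minimal sketch of how the shapes line up, assuming the flat rows are reshaped to 30 time-steps of 5 features each (both numbers are made-up examples):

```python
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Reshape((30, 5), input_shape=(150,)),   # flat row -> (timesteps, features); no batch size
    layers.LSTM(32),                               # now sees ndim=3 input: (batch, 30, 5)
    layers.Dense(1, activation="sigmoid"),
])
model.summary()
```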

LSTM with variable sequences & return full sequences

How can I set up a keras model such that the final LSTM layer outputs a prediction for each time step while having variable sequence lengths as input?
I'd then like to provide labels for each of the timesteps after a dense layer with linear activation.
When I try to add a Reshape or a Dense layer to the LSTM model that returns the full sequence and has a Masking layer to handle the variable sequence lengths, it says:
The reshape and the dense layers do not support masking.
Would this be possible to do?
You can use the TimeDistributed layer wrapper for this. It applies the wrapped layer to each timestep. In your case, you can wrap your Dense layer as TimeDistributed(Dense(...)) (older Keras versions also shipped a dedicated TimeDistributedDense layer).
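A minimal sketch, assuming zero-padded variable-length sequences with 8 features per step and one linear prediction per timestep:

```python
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Masking(mask_value=0.0, input_shape=(None, 8)),   # variable sequence lengths
    layers.LSTM(32, return_sequences=True),                  # output at every timestep
    layers.TimeDistributed(layers.Dense(1, activation="linear")),
])
model.compile(optimizer="adam", loss="mse")
```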
