I have built a variational autoencoder using 2D convolutions (Conv2D) in the encoder and decoder. I'm using Keras. In total I have 2 layers with 32 and 64 filters each and a a kernel size of 4x4 and stride 2x2 each. My input images are (64, 80, 1). I'm using the MSE loss. Now, I would like to visualize the individual convolutional layers (i.e. what they learn) as done here.
So, first I load my model using load_weights() function and then I call visualize_layer(encoder, 'conv2d_1') from above mentioned code where conv2d_1 is the layer name of the first convolutional layer in my encoder.
When I do so I'm getting the error message
tensorflow.python.framework.errors_impl.UnimplementedError: Fused conv implementation does not support grouped convolutions for now.
[[{{node conv2d_1/BiasAdd}}]]
When I use the VGG16 model as in the example code it works. Does somebody know how I can adapt the code to work for my case?
Related
I am following the self attention in Keras in the following link: How to add attention layer to a Bi-LSTM
I am new to python , what does the shape=(input_shape[-1],1) in self.add_weight and shape=(input_shape[1],1) in bias means?
The shape argument sets the expected input dimensions which the model will be fed. In your case, it is just going to be whatever the last dimension of the input shape is for the weight layer and the second dimension of the input shape for the bias layer.
Neural networks take in inputs of fixed size so while building a model, it is important that you hard code the input dimensions for each layer.
As an exercise I need to use only dense layers to perform text classifications. I want to leverage words embeddings, the issue is that the dataset then is 3D (samples,words of sentence,embedding dimension). Can I input a 3D dataset into a dense layer?
Thanks
As stated in the keras documentation you can use 3D (or higher rank) data as input for a Dense layer but the input gets flattened first:
Note: if the input to the layer has a rank greater than 2, then it is flattened prior to the initial dot product with kernel.
This means that if your input has shape (batch_size, sequence_length, dim), then the dense layer will first flatten your data to shape (batch_size * sequence_length, dim) and then apply a dense layer as usual. The output will have shape (batch_size, sequence_length, hidden_units). This is actually the same as applying a Conv1D layer with kernel size 1, and it might be more explicit to use a Conv1D layer instead of a Dense layer.
Using RapidMiner I want to implement an LSTM to classify patternes in a time series. Input data is a flat table. My first layer in the Keras operator is a core reshape from exampleset_length x nr_of_attributes to batch x time-steps x features. In the reshape parameter I specifically enter three figures because I want a specific amount of features and time-steps. The only way to achieve this is to specify also batch size, so in total three figures. But when I add a RNN LSTM layer an error is returned: Input is incompatible with layer lstm expected ndim=n found ndim=n+1. What’s wrong?
When specifying 'input_shape' for the LSTM layer, you do not include the batch size.
So your 'input_shape' value should be (timesteps, input_dim).
Source: Keras RNN Layer, the parent layer for LSTM
I have created a CNN in Keras with 12 Convolutional layers each followed by BatchNormalization, Activation and MaxPooling. A sample of the layer is:
model.add(Conv2D(256, (3, 3), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=2))
I start with 32 feature maps and end with 512. If I add MaxPooling after every Conv Layer like in the code above, I get an error in the final layer:
ValueError: Negative dimension size caused by subtracting 2 from 1 for 'max_pooling2d_11/MaxPool' (op: 'MaxPool') with input shapes: [?,1,1,512].
If I omit one MaxPooling in any layer the model compiles and starts training. I am using Tensorflow as backend and I have the right input shape of the image in the first layer.
Are there any suggestions why this may happening?
If your spatial dimensions are 256x256, then you cannot have more than 8 Max-Pooling layers in your network. As 2 ** 8 == 256, after downsampling by a factor of two, eight times, your feature maps will be 1x1 in the spatial dimensions, meaning you cannot perform max pooling as you would get a 0x0 or negative dimensions.
Its just an obvious limitation of Max Pooling but not always discussed in papers.
This can also be caused by having your input image in the wrong format
if you're using (3,X,Y) and it expects (X,Y,3) then the down sampling occurs on the colour channels and causes issues.
I wanted to retrain the fully connected layers of VGG 16 for big gray level images (1800x1800), using Keras with Then backend.
So I've:
created a new VGG with a single color channel and loaded the weights from of the original VGG.
add trainable=False to all the convolution layers (the pooling and padding are not trainable by definition)
delete the two first dense layers to keep only the output layer with two neurons
increase drastically the max pooling dimensions and strides because I work with inputs 1800x1800 (no choice). The dimensions drop very quickly to match the original VGG dimensions.
reduce the batch size in order to reduce the memory required.
But when I start the training, I face a CNMEM_STATUS_OUT_OF_MEMORY error. I use NVIDIA K40, so I have 12Go of memory.
Any idea how to fix it?