Weird output for weights/filters in CNN - python-3.x

My task is to visualize the weights of a CNN layer by plotting them. With filters = 32 and kernel_size = (3, 3), I expected .get_weights() (which extracts the weights and biases) to give me 32 matrices, each of size 3x3, but instead I get a very strangely nested output.
The output is as follows:
a = model.layers[0].get_weights()
a[0][0][0]
array([[ 2.87332404e-02, -2.80513391e-02,
**... 32 values ...**,
-1.55516148e-01, -1.26494586e-01, -1.36454999e-01,
1.61165968e-02, 7.63138831e-02],
[-5.21791205e-02, 3.13560963e-02, **... 32 values ...**,
-7.63987377e-02, 7.28923678e-02, 8.98564830e-02,
-3.02852653e-02, 4.07049060e-02],
[-7.04478994e-02, 1.33816227e-02,
**... 32 values ...**, -1.99537817e-02,
-1.67200342e-01, 1.15980692e-02]], dtype=float32)
I want to know why I am getting this kind of output and how I can get the weights in the expected shape. Thanks in advance.

Weights in a neural network are values that represent the connection strength between input nodes and output nodes (or nodes in the next layer).
A Conv2D layer's weights usually have the shape (H, W, I, O), where:
H is kernel height
W is kernel width
I is number of input channels
O is number of output channels
Conv2D weights can be interpreted as the connection strengths between a patch of the input channels and the nodes in an output filter/feature map. This way you have weights of shape (H, W) between each input channel and each output channel. Note that the weights are shared among different patches of the same channel.
Consider a convolution of an (8, 8, 1) input with a (2, 2) kernel that produces an (8, 8, 1) output ('same' padding). The weights of this layer have shape (2, 2, 1, 1). The same input can also produce 2 feature maps using 2 (2, 2) filters; the shape of the weights then becomes (2, 2, 1, 2). A quick check of both cases is sketched below.
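A minimal Keras sketch of those two examples (padding='same' is assumed so the output keeps the (8, 8) spatial size):
from tensorflow.keras.layers import Conv2D

# (8, 8, 1) input, one (2, 2) filter -> weights of shape (2, 2, 1, 1)
one_filter = Conv2D(1, (2, 2), padding='same')
one_filter.build(input_shape=(None, 8, 8, 1))
print(one_filter.kernel.shape)   # (2, 2, 1, 1)

# same input, two (2, 2) filters -> weights of shape (2, 2, 1, 2)
two_filters = Conv2D(2, (2, 2), padding='same')
two_filters.build(input_shape=(None, 8, 8, 1))
print(two_filters.kernel.shape)  # (2, 2, 1, 2)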
Hope this will clarify how to interpret the shape of convolutional layers.

The shape of the kernel weights from a Conv2D layer is (kernel_size[0], kernel_size[1], n_input_channels, filters). So in your case
a = model.layers[0].get_weights()
print(a[0].shape)
# should print (3,3,z,32) if your input has shape (x, y, z)
If you want to print the weights from one of the filters, you can do
a[0][:,:,:,0]
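For instance, to plot all 32 kernels from the first layer, a minimal matplotlib sketch (this assumes a single input channel; for multi-channel inputs pick one channel or average over axis 2):
import matplotlib.pyplot as plt

weights, biases = model.layers[0].get_weights()   # weights.shape == (3, 3, z, 32)

fig, axes = plt.subplots(4, 8, figsize=(12, 6))
for i, ax in enumerate(axes.flat):
    ax.imshow(weights[:, :, 0, i], cmap='gray')   # i-th filter, first input channel
    ax.set_title(f'filter {i}', fontsize=8)
    ax.axis('off')
plt.show()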

Related

Convolutional layer: does the filter also convolve through nlayers_in, or does it take all the dimensions?

In the leading deep learning libraries, does the filter (a.k.a. kernel or weights) in a convolutional layer also convolve across the "channel" dimension, or does it take all the channels at once?
For example, if the input dimensions are (60, 60, 10) (where the last dimension is often referred to as "channels") and the desired number of output channels is 5, can the filter have shape (5, 5, 5, 5), or should it be (5, 5, 10, 5) instead?
It should be (5, 5, 10, 5). The Conv2d operation is just like Linear if you ignore the spatial dimensions.
From TensorFlow documentation [link]:
Given an input tensor of shape batch_shape + [in_height, in_width, in_channels] and a filter / kernel tensor of shape [filter_height, filter_width, in_channels, out_channels], this op performs the following:
Flattens the filter to a 2-D matrix with shape [filter_height * filter_width * in_channels, output_channels].
Extracts image patches from the input tensor to form a virtual tensor of shape [batch, out_height, out_width, filter_height * filter_width * in_channels].
For each patch, right-multiplies the filter matrix and the image patch vector.
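For concreteness, a minimal NumPy sketch of those three steps (stride 1, no padding; the (60, 60, 10) input and 5 output channels from the question, filled with random values purely for illustration):
import numpy as np

H, W, C_in, C_out, k = 60, 60, 10, 5, 5
x = np.random.randn(H, W, C_in).astype(np.float32)
w = np.random.randn(k, k, C_in, C_out).astype(np.float32)   # (5, 5, 10, 5)

# 1. Flatten the filter to a 2-D matrix of shape (k*k*C_in, C_out).
w_mat = w.reshape(k * k * C_in, C_out)

# 2. Extract image patches into shape (out_h, out_w, k*k*C_in).
out_h, out_w = H - k + 1, W - k + 1
patches = np.empty((out_h, out_w, k * k * C_in), dtype=np.float32)
for i in range(out_h):
    for j in range(out_w):
        patches[i, j] = x[i:i + k, j:j + k, :].ravel()

# 3. Right-multiply each patch vector by the filter matrix.
out = patches @ w_mat
print(out.shape)  # (56, 56, 5)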
It takes all channels at once, so 5×5×10×5 should be right.
julia> using Flux
julia> c = Conv((5,5), 10 => 5); # make a layer, 10 channels to 5
julia> c.weight |> summary
"5×5×10×5 Array{Float32, 4}"
julia> c(randn(Float32, 60, 60, 10, 1)) |> summary # check it works
"56×56×5×1 Array{Float32, 4}"
julia> Conv(rand(Float32, (5,5,5,5))) # different weight size
Conv((5, 5), 5 => 5) # 630 parameters

What does Conv1D with kernel size equal to 1 do?

I read an example of using LSTM with Conv1D.
(Took it from: CNN LSTM)
Conv1D(filters=64, kernel_size=1, activation='relu')
I understand that the dimension of the convolution is 1 (one dimension with size 1).
What is the value of the convolution? (What is the value of the 1x1 matrix?)
I can't figure out what filters=64 is. What does it mean?
Does the relu activation function work on the output of the convolution? (From what I read it seems like that, but I'm not sure.)
What is the motivation for using a convolution with kernel_size = 1, as we do here?
filters
filters = 64 means that 64 separate filters are used.
Each filter outputs 1 channel, i.e. here 64 filters operate on the input to produce 64 different channels (or vectors). Hence the filters parameter determines the number of output channels.
kernel_size
kernel_size determines the size of the convolution window. With kernel_size = 1, each kernel has dimension in_channels x 1, i.e. each kernel's weights form an in_channels x 1 tensor.
activation = relu
That means the relu activation will be applied to the output of the convolution operation.
kernel_size = 1 convolution
Used to reduce the number of depth channels while applying a non-linearity. It does something like a weighted average across the channels while keeping the receptive field unchanged.
In your example: filters = 64, kernel_size = 1, activation = relu.
Suppose the input feature map has size 100 x 10 (100 channels, 10 steps). Then the layer weights will be of dimension 64 x 100 x 1, and the output size will be 64 x 10.
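A minimal Keras sketch of the same example (note that Keras is channels-last, so the 100 x 10 feature map above becomes 10 steps with 100 channels, and the printed kernel shape is a permuted view of the 64 x 100 x 1 stated above):
import numpy as np
from tensorflow.keras.layers import Conv1D

x = np.random.randn(1, 10, 100).astype('float32')        # (batch, steps, channels)
layer = Conv1D(filters=64, kernel_size=1, activation='relu')
y = layer(x)

print(layer.kernel.shape)  # (1, 100, 64): kernel_size x in_channels x filters
print(y.shape)             # (1, 10, 64): same 10 steps, depth reduced from 100 to 64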

PyTorch: Convolving a single channel image using torch.nn.Conv2d

I am trying to use a convolution layer to convolve a grayscale (single channel) image (stored as a numpy array). Here is the code:
conv1 = torch.nn.Conv2d(in_channels = 1, out_channels = 1, kernel_size = 33)
tensor1 = torch.from_numpy(img_gray)
out_2d_np = conv1(tensor1)
out_2d_np = np.asarray(out_2d_np)
I want my kernel to be 33x33 and the number of output layers should be equal to the number of input layers, which is 1 since the image's RGB channels are summed. When out_2d_np = conv1(tensor1) is run it yields the following runtime error:
RuntimeError: Expected 4-dimensional input for 4-dimensional weight 1 1 33 33, but got 2-dimensional input of size [246, 248] instead
Any idea on how I can solve this? I specifically want to use the torch.nn.Conv2d() class/function.
Thanks in advance for any help!
PyTorch's Conv2d expects its 2D inputs to actually have 4 dimensions: a mini-batch dim, a channel dim, and the two spatial dimensions.
Your input tensor has only the two spatial dimensions and lacks the mini-batch and channel dimensions. In your case these two dimensions are actually singleton dimensions (dimensions with size = 1).
try:
conv1(tensor1[None, None, ...])
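Putting it together, a minimal sketch of the fix (the random img_gray here is just a stand-in for your grayscale array; it also needs to be float32 for the layer):
import numpy as np
import torch

img_gray = np.random.rand(246, 248).astype(np.float32)   # stand-in grayscale image

conv1 = torch.nn.Conv2d(in_channels=1, out_channels=1, kernel_size=33)
tensor1 = torch.from_numpy(img_gray)                      # shape (246, 248)
out = conv1(tensor1[None, None, ...])                     # shape (1, 1, 214, 216)

out_2d_np = out.detach().numpy().squeeze()                # back to 2-D: (214, 216)
print(out_2d_np.shape)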

Output shape of a convolutional layer with multiple filters

I built a ConvNet in Keras and these are 2 of the layers:
model.add(Conv2D(8 , 3 , input_shape = (28,28,1)))
model.add(Activation(act))
model.add(Conv2D(16 , 3))
model.add(Activation(act))
The output of the first layer is of size 26x26x8, which I completely understand, since there are 8 filters of size 3x3 and each of them is applied to produce a separate feature map, hence 26x26x8.
The output of the second layer is of size 24x24x16, which I do not understand. Shouldn't the output be of size 24x24x128, since each of the filters of the second layer acts on each feature map of the output of the first layer?
Basically, I do not understand how the output of a layer is fed as input to the next layer.
No, it's a convolution over volume. Each filter is applied for all channels.
I would have loved it if someone had taken the time to actually write out the mathematics. But I'm guessing no one knew what the actual operations were. The ambiguous language "applied on all channels" was the same thing the OP thought was going on. A commenter above used this language to mean they were summed over all channels. Not clear.
I had the same question as the OP and found the answer: in Keras, each filter created by a Conv2D layer has the same final dimension (depth) as its input.
Say you have an input, X, of shape (6, 6, 3), a tensor of size 6×6 in 3 channels (colors or whatever). Then creating a 2D convolution layer with
conv = Conv2D(2, 3, input_shape=(6, 6, 3))
will create 2 filters of size (3, 3, 3), f1 and f2. Applying each filter the correct way to the input then looks like sum_{i,j,k} f1[i, j, k] * X[i, j, k], where i and j run over all relevant indexes for the location and k, the channel index, is summed over all its values, i.e. 1, 2, and 3 here. This produces an output of size (4, 4, 1) for each filter. Together the two filters produce an output of size (4, 4, 2).
If we had assumed, as the OP seems to have, that each filter of 3-channel tensors was only of the shape (3, 3, 1) then you'd be confused as to how to handle its application to a 3-dimensional tensor, which might cause someone who cares about the actual operations to think that the filters would be applied as a tensor product, creating a significantly higher dimension of output from the layer.
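A small sketch checking the shapes discussed above for the OP's two-layer model (act is assumed to be 'relu' here):
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Conv2D, Activation

model = Sequential([
    Conv2D(8, 3, input_shape=(28, 28, 1)),
    Activation('relu'),
    Conv2D(16, 3),
    Activation('relu'),
])

print(model.layers[0].kernel.shape)  # (3, 3, 1, 8)
print(model.layers[2].kernel.shape)  # (3, 3, 8, 16): each filter spans all 8 input channels
print(model.output_shape)            # (None, 24, 24, 16), not 24 x 24 x 128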

Does 1D Convolutional layer support variable sequence lengths?

I have a series of processed audio files I am using as input into a CNN using Keras. Does the Keras 1D Convolutional layer support variable sequence lengths? The Keras documentation makes this unclear.
https://keras.io/layers/convolutional/
At the top of the documentation it mentions you can use (None, 128) for variable-length sequences of 128-dimensional vectors. Yet at the bottom it declares that the input shape must be a
3D tensor with shape: (batch_size, steps, input_dim)
Given the following example, how should I input sequences of variable length into the network?
Let's say I have two examples (a and b) containing X 1-dimensional vectors of length 100 that I want to feed into the Conv1D layer as input:
a.shape = (100, 100)
b.shape = (200, 100)
Can I use an input shape of (2, None, 100)? Or do I need to concatenate these tensors into c, where
c.shape = (300, 100)
Then reshape it to be something like
c_reshape.shape = (3, 100, 100)
Where 3 is the batch size, 100 is the number of steps, and the second 100 is the input size? The documentation on the input shape is not very clear.
Keras supports variable lengths by using None in the respective dimension when defining the model.
Notice that often input_shape refers to the shape without the batch size.
So, the 3D tensor with shape (batch_size, steps, input_dim) suits perfectly a model with input_shape=(steps, input_dim).
All you need to make this model accept variable lengths is use None in the steps dimension:
input_shape=(None, input_dim)
Numpy limitation
Now, there is a numpy limitation regarding variable lengths: you cannot create a single numpy array whose shape accommodates variable lengths.
A few solutions are available:
Pad your sequences with dummy values until they all reach the same size, so you can put them into a numpy array of shape (batch_size, length, input_dim). Then use Masking layers to ignore the dummy values.
Train with separate numpy arrays of shape (1, length, input_dim), each array having its own length.
Group your samples by size into smaller arrays.
Be careful with layers that don't support variable sizes
In convolutional models using variable sizes you can't, for instance, use Flatten: if it were allowed, the result of the flatten would have a variable size, and the following Dense layers would not be able to have a fixed number of weights. This is impossible.
So, instead of Flatten, you should use GlobalMaxPooling1D or GlobalAveragePooling1D layers.
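A minimal sketch of the second option from the list above, with a hypothetical Conv1D + GlobalMaxPooling1D model (the filter count 32, kernel size 3, and binary target are arbitrary choices for illustration):
import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Conv1D, GlobalMaxPooling1D, Dense

input_dim = 100  # each step is a 100-dimensional vector, as in the question

model = Sequential([
    Conv1D(32, 3, activation='relu', input_shape=(None, input_dim)),  # None = variable steps
    GlobalMaxPooling1D(),   # collapses the variable-length steps axis to a fixed size
    Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy')

# train on one sequence at a time, each as its own (1, length, input_dim) array
a = np.random.randn(1, 100, input_dim).astype('float32')
b = np.random.randn(1, 200, input_dim).astype('float32')
for x, y in [(a, np.array([[0.]])), (b, np.array([[1.]]))]:
    model.train_on_batch(x, y)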
