I read an example that uses an LSTM together with a Conv1D layer.
(Taken from: CNN LSTM)
Conv1D(filters=64, kernel_size=1, activation='relu')
I understand that the convolution is one-dimensional and that its kernel has size 1.
What are the values of the kernel? (What is the value of the 1x1 matrix?)
What does filters=64 mean?
Is the relu activation function applied to the output of the convolution? (From what I have read it seems so, but I'm not sure.)
What is the motivation for using a convolution with kernel_size = 1, as is done here?
filters
filters = 64 means that 64 separate filters are used.
Each filter outputs 1 channel, i.e. here 64 filters operate on the input to produce 64 different channels (or vectors). Hence the filters parameter determines the number of output channels.
kernel_size
kernel_size determines the length of the convolution window. With kernel_size = 1, each kernel has dimension in_channels x 1, so each filter's weight is an in_channels x 1 tensor.
activation = relu
This means the relu activation is applied to the output of the convolution operation.
kernel_size = 1 convolution
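A convolution with kernel_size = 1 is typically used to change (usually reduce) the number of channels while applying a non-linearity. At each position it computes a learned weighted sum across the input channels, and because the window covers a single position, the receptive field stays the same.
In your example: filters = 64, kernel_size = 1, activation = relu
Suppose the input feature map has size 100 x 10 (100 channels, length 10). Then the layer weight has dimension 64 x 100 x 1 (out_channels x in_channels x kernel_size; Keras stores the same kernel as kernel_size x in_channels x filters, i.e. 1 x 100 x 64). The output size will be 64 x 10.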
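To make the shapes above concrete, here is a minimal NumPy sketch of a kernel_size = 1 convolution followed by relu. It is not the Keras implementation; the random weights and the (channels, length) layout are assumptions chosen to match the numbers in this answer.

```python
import numpy as np

rng = np.random.default_rng(0)

in_channels, length, filters = 100, 10, 64

# Input feature map laid out as (channels, length), as in the text above.
x = rng.standard_normal((in_channels, length))

# Layer weights: one (in_channels x 1) kernel per filter, plus one bias per filter.
# Random values stand in for trained weights.
weight = rng.standard_normal((filters, in_channels, 1))
bias = np.zeros(filters)

# With kernel_size = 1 the window covers a single position, so the convolution
# reduces to a matrix product over the channel axis at every position.
pre_activation = weight[:, :, 0] @ x + bias[:, None]

# relu is applied element-wise to the convolution output.
out = np.maximum(pre_activation, 0.0)

print(weight.shape)  # (64, 100, 1)
print(out.shape)     # (64, 10)
```

Each of the 64 output rows is a different weighted sum of the 100 input channels, computed independently at each of the 10 positions.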
Related
How can we calculate the output shape of a Conv1d layer in PyTorch? Is there any command to calculate the size and shape of these layers in PyTorch?
nn.Conv1d(depth_1, depth_2, kernel_size=kernel_size_2, stride=stride_size),
nn.ReLU(),
nn.MaxPool1d(kernel_size=2, stride=stride_size),
nn.Dropout(0.25)
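The output size can be calculated as shown in the nn.Conv1d documentation under "Shape". For an input of shape (N, C_in, L_in) the output has shape (N, C_out, L_out), where
L_out = floor((L_in + 2 * padding - dilation * (kernel_size - 1) - 1) / stride + 1)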
The batch size remains unchanged and you already know the number of channels, since you specified them when creating the convolution (depth_2 in this example).
Only the length needs to be calculated and you can do that with a simple function analogous to the formula above:
def calculate_output_length(length_in, kernel_size, stride=1, padding=0, dilation=1):
    return (length_in + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1
The default values of the function are the same as the defaults of nn.Conv1d, so you only need to pass the arguments that you also pass when creating the convolution. It uses integer division //, because the numerator might not be divisible by stride, in which case the result is rounded down (indicated in the documentation's formula by the floor brackets, which are closed only at the bottom).
The same formula also applies to nn.MaxPool1d, but keep in mind that it automatically sets stride = kernel_size if stride is not specified.
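As a small worked example, the length can be tracked through a convolution and a pooling layer. The input length of 100 and the layer settings are made-up values for illustration, and conv1d_out_len is just a local helper name for the same formula.

```python
def conv1d_out_len(length_in, kernel_size, stride=1, padding=0, dilation=1):
    """Output length of a 1D convolution or pooling window (floor division)."""
    return (length_in + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

length_in = 100  # assumed input length

# Corresponds to nn.Conv1d(..., kernel_size=5, stride=2)
after_conv = conv1d_out_len(length_in, kernel_size=5, stride=2)

# Corresponds to nn.MaxPool1d(kernel_size=2): its stride defaults to kernel_size,
# so it has to be passed explicitly to this helper.
after_pool = conv1d_out_len(after_conv, kernel_size=2, stride=2)

print(after_conv, after_pool)  # 48 24
```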
I have tested the speed of separable_conv2d and the normal conv2d as implemented in TF. It seems that only depthwise_conv2d is faster than the normal conv2d, but the performance of depthwise_conv2d on its own is obviously poor.
For the separable_conv2d described in MobileNet, the FLOPs are about 1/9 of those of the normal convolution when kernel_size=3. Considering the memory access cost, the separable version cannot be 9 times faster than the normal one, but in my experiment the separable one is actually much slower.
I modeled my experiment on the one in "separable_conv2d is too slow". In that experiment, separable_conv2d seems faster than the normal one when depth_multiply=1, but I get different results when I implement it with tf.nn as follows:
import os

import tensorflow as tf  # TF 1.x API (tf.variable_scope, tf.get_variable)

IMAGE_SIZE = 512
REPEAT = 100
KERNEL_SIZE = 3
data_format = 'NCHW'
# CHANNELS_BATCH_SIZE = 2048  # channels * batch_size
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"

def normal_layers(inputs, nfilter, name=''):
    with tf.variable_scope(name, reuse=tf.AUTO_REUSE):
        shape = inputs.shape.as_list()
        in_channels = shape[1]  # NCHW: channels are on axis 1
        filter = tf.get_variable(initializer=tf.initializers.random_normal,
                                 shape=[KERNEL_SIZE, KERNEL_SIZE,
                                        in_channels, nfilter], name='weight')
        conv = tf.nn.conv2d(input=inputs, filter=filter, strides=[1, 1, 1, 1],
                            padding='SAME', data_format=data_format,
                            name='conv')
    return conv

def sep_layers(inputs, nfilter, name=''):
    with tf.variable_scope(name, reuse=tf.AUTO_REUSE):
        shape = inputs.shape.as_list()
        in_channels = shape[1]
        # Depthwise filter: one KERNEL_SIZE x KERNEL_SIZE kernel per input channel.
        dw_filter = tf.get_variable(initializer=tf.initializers.random_normal,
                                    shape=[KERNEL_SIZE, KERNEL_SIZE,
                                           in_channels, 1], name='dw_weight')
        # Pointwise filter: 1x1 convolution that mixes the channels.
        pw_filter = tf.get_variable(initializer=tf.initializers.random_normal,
                                    shape=[1, 1, in_channels, nfilter],
                                    name='pw_weight')
        conv = tf.nn.depthwise_conv2d_native(input=inputs,
                                             filter=dw_filter,
                                             strides=[1, 1, 1, 1],
                                             padding='SAME',
                                             data_format=data_format)
        conv = tf.nn.conv2d(input=conv,
                            filter=pw_filter,
                            strides=[1, 1, 1, 1],
                            padding='SAME',
                            data_format=data_format)
    return conv
Each layer is run 100 times.
Unlike the linked experiment, I set batch_size to a constant 10,
with channels in [32, 64, 128]
and inputs of shape [batch_size, channels, img_size, img_size].
The durations are as follows:
Channels: 32
Normal Conv 0.7769527435302734s, Sep Conv 1.4197885990142822s
Channels: 64
Normal Conv 0.8963277339935303s, Sep Conv 1.5703468322753906s
Channels: 128
Normal Conv 0.9741833209991455s, Sep Conv 1.665834665298462s
It seems that when batch_size is constant and only the number of channels changes, the time cost of both the normal and the separable convolution grows gradually.
When batch_size * channels is held constant instead,
with inputs of shape [CHANNELS_BATCH_SIZE // channels, channels, imgsize, imgsize]:
Channels: 32
Normal Conv 0.871959924697876s, Sep Conv 1.569300651550293s
Channels: 64
Normal Conv 0.909860372543335s, Sep Conv 1.604109525680542s
Channels: 128
Normal Conv 0.9196009635925293s, Sep Conv 1.6144189834594727s
What confuses me is that this differs from the result in the link above: the time cost of sep_conv2d does not change noticeably.
My questions are:
What causes the difference between the linked experiment and my own?
I am a newbie, so is there something wrong in my implementation of separable_conv2d?
How can separable_conv2d be implemented so that it is faster than the normal one in TF or in PyTorch?
Any help would be appreciated. Thanks in advance.
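As a rough check of the 1/9 figure mentioned above, the multiply-accumulate counts of the two variants can be computed directly. This is only an arithmetic sketch: it assumes stride 1, 'SAME' padding and nfilter equal to the number of input channels (the value of nfilter is not stated above), and it says nothing about memory access cost or the actual kernels TF dispatches to.

```python
def conv_macs(h, w, c_in, c_out, k):
    """Multiply-accumulates of a standard k x k convolution (stride 1, 'SAME')."""
    return h * w * c_in * c_out * k * k

def separable_macs(h, w, c_in, c_out, k):
    """Depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = h * w * c_in * k * k
    pointwise = h * w * c_in * c_out
    return depthwise + pointwise

for channels in (32, 64, 128):
    normal = conv_macs(512, 512, channels, channels, 3)
    separable = separable_macs(512, 512, channels, channels, 3)
    # The ratio equals 1/c_out + 1/k**2, so it approaches 1/9 as channels grow.
    print(channels, separable / normal)
```

Under these assumptions the ratio is 1/channels + 1/9, so the arithmetic saving is somewhat smaller than 1/9 for the channel counts tested here.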
My task is to visualize the weights of a CNN layer. When I pass the parameters filters = 32 and kernel_size = (3, 3), I expect the .get_weights() function (which extracts the weights and biases) to give me 32 matrices, each of size 3x3, but I am getting a strangely nested output.
The output is as follows:
a = model.layers[0].get_weights()
a[0][0][0]
array([[ 2.87332404e-02, -2.80513391e-02,
**... 32 values ...**,
-1.55516148e-01, -1.26494586e-01, -1.36454999e-01,
1.61165968e-02, 7.63138831e-02],
[-5.21791205e-02, 3.13560963e-02, **... 32 values ...**,
-7.63987377e-02, 7.28923678e-02, 8.98564830e-02,
-3.02852653e-02, 4.07049060e-02],
[-7.04478994e-02, 1.33816227e-02,
**... 32 values ...**, -1.99537817e-02,
-1.67200342e-01, 1.15980692e-02]], dtype=float32)
I want to know why I am getting this kind of output and how I can get the weights in the expected shape. Thanks in advance.
Weights in a neural network are values that represent the connection strength between input nodes and output nodes (or nodes in the next layer).
A Conv2D layer's weights usually have shape (H, W, I, O), where:
H is kernel height
W is kernel width
I is number of input channels
O is number of output channels
Conv2D weights can be interpreted as the connection strength between a patch of the input channels and the nodes in an output filter/feature map. This way you have weights of shape (H, W) between each input channel and each output channel. Note that the weights are shared among the different patches of the same channel.
Consider the following convolution of an (8, 8, 1) input with a (2, 2) kernel and an (8, 8, 1) output. The weights of this layer have shape (2, 2, 1, 1).
The same input can be used to produce 2 feature maps using 2 (2, 2) filters as follows. Now the shape of the weights is (2, 2, 1, 2).
I hope this clarifies how to interpret the shape of a convolutional layer's weights.
The shape of the kernel weights from a Conv2D layer is (kernel_size[0], kernel_size[1], n_input_channels, filters). So in your case
a = model.layers[0].get_weights()
print(a[0].shape)
# should print (3,3,z,32) if your input has shape (x, y, z)
If you want to print the weights from one of the filters, you can do
a[0][:,:,:,0]
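If what you want is literally a stack of 3x3 matrices, one option is to move the filter axis to the front. The sketch below uses a random NumPy array as a stand-in for a[0] and assumes a 3-channel input, which would be consistent with the 3 rows of 32 values shown in the question: indexing a[0][0][0] fixes the kernel row and column first and leaves a (channels, filters) slice.

```python
import numpy as np

# Stand-in for a[0] = model.layers[0].get_weights()[0]; random values, assuming
# a 3-channel input, so the kernel has shape (3, 3, 3, 32) = (H, W, I, O).
kernel = np.random.default_rng(0).standard_normal((3, 3, 3, 32)).astype(np.float32)

# What the question printed: kernel row 0, kernel column 0 -> (channels, filters).
print(kernel[0][0].shape)      # (3, 32)

# Move the filter axis to the front: (filters, in_channels, H, W).
per_filter = np.transpose(kernel, (3, 2, 0, 1))
print(per_filter.shape)        # (32, 3, 3, 3)

# One 3x3 matrix: filter 0, input channel 0.
first = per_filter[0, 0]
print(first.shape)             # (3, 3)
```

With more than one input channel there are I x O such 3x3 matrices (here 3 x 32), not just 32.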
I have a time-series of data and am running some very basic tests to get a feel for TensorFlow, Keras, Python, etc.
To set up the problem: I have a large number of images, where 7 images of data (with Cartesian dimensions 33 x 33), when accumulated, should yield a single value. Therefore, the amount of 'x' data should be y*7, where y is the 'truth' data being trained with.
All of the training data is in a large matrix named 'alldatax': [420420 x 33 x 33 x 7 x 1], where the dimensions are the total number of single images, x-dimension, y-dimension, number of images to be accumulated for a single 'truth' value, and a final dimension necessary for 3D convolving.
The 'truth' matrix, alldatay, is a 1D matrix of length 420420 / 7 = 60060.
When running a simple convnet:
model = models.Sequential()
model.add(layers.InputLayer(input_shape=(33,33,7,1)))
model.add(layers.Conv3D(16,(3,3,1), activation = 'relu', input_shape = (33,33,7,1)))
model.add(layers.LeakyReLU(alpha=0.3))
model.add(layers.MaxPooling3D((2,2,1)))
model.add(layers.Conv3D(32,(3,3,1), activation = 'relu'))
model.add(layers.LeakyReLU(alpha=0.3))
model.add(layers.MaxPooling3D((2,2,1)))
model.add(layers.Flatten())
model.add(layers.Dense(512, activation = 'relu'))
model.add(layers.LeakyReLU(alpha=0.3))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(32, activation = 'relu'))
model.add(layers.LeakyReLU(alpha=0.3))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(1, activation = 'relu'))
model.compile(optimizer = 'adam', loss = 'mse')
model.fit(x = alldatax, y = alldatay, batch_size = 1000, epochs = 50, verbose = 1, shuffle = False)
I get an error: ValueError: Input arrays should have the same number of samples as target arrays. Found 420420 input samples and 60060 target samples.
What needs to change to get the convnet to realize it needs 7*x for every y value?
Something seems to be wrong in your calculations.
You state that the neural net should take seven 33x33 images as one input example, so you set the input shape of the first layer to (33,33,7,1) which is right. This means for every 33x33x7x1 input there should be exactly one y value.
Since your data comprises 420420 examples of shape 33x33x7x1, there should be 420420 y values, not 60060.
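If the raw data is in fact a stack of 420420 single 33x33 frames (this is an assumption, the question does not show how alldatax was built), then the frames need to be grouped into 60060 examples of 7 frames each before training. A minimal NumPy sketch with tiny stand-in sizes; the names frames and targets are hypothetical:

```python
import numpy as np

n_samples, frames_per_sample = 3, 7   # tiny stand-ins for 60060 and 7
n_frames = n_samples * frames_per_sample

# Hypothetical raw layout: one 33x33 image per row, in acquisition order.
frames = np.arange(n_frames * 33 * 33, dtype=np.float32).reshape(n_frames, 33, 33)
targets = np.zeros(n_samples, dtype=np.float32)

# Group every 7 consecutive frames, move the frame axis behind the spatial axes,
# and append the channel axis that Conv3D expects.
x = frames.reshape(n_samples, frames_per_sample, 33, 33)
x = np.transpose(x, (0, 2, 3, 1))[..., np.newaxis]

print(x.shape, targets.shape)  # (3, 33, 33, 7, 1) (3,)
```

After this regrouping the first dimension of x equals the number of target values, which is what model.fit checks.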
I am very confused by these two parameters in the conv1d layer from keras:
https://keras.io/layers/convolutional/#conv1d
the documentation says:
filters: Integer, the dimensionality of the output space (i.e. the number output of filters in the convolution).
kernel_size: An integer or tuple/list of a single integer, specifying the length of the 1D convolution window.
But that does not seem to relate to the standard terminologies I see on many tutorials such as https://adeshpande3.github.io/adeshpande3.github.io/A-Beginner's-Guide-To-Understanding-Convolutional-Neural-Networks/ and https://machinelearningmastery.com/sequence-classification-lstm-recurrent-neural-networks-python-keras/
Going by the second tutorial link, which uses Keras, I'd imagine that 'kernel_size' in fact corresponds to the conventional 'filter' concept, which defines the sliding window on the input feature space. But what about the 'filters' parameter in conv1d? What does it do?
For example, in the following code snippet:
model.add(embedding_layer)
model.add(Dropout(0.2))
model.add(Conv1D(filters=100, kernel_size=4, padding='same', activation='relu'))
Suppose the embedding layer outputs a matrix of dimension 50 (rows, each row is a word in a sentence) x 300 (columns, the word vector dimension). How does the conv1d layer transform that matrix?
Many thanks
You're right to say that kernel_size defines the size of the sliding window.
The filters parameter is just how many different windows you will have (all of them with the same length, which is kernel_size), i.e. how many different results or channels you want to produce.
When you use filters=100 and kernel_size=4, you are creating 100 different filters, each of them with length 4. The result is 100 different convolutions.
Also, each filter has enough parameters to consider all input channels.
The Conv1D layer expects these dimensions:
(batchSize, length, channels)
I suppose the best way to use it is to have the number of words in the length dimension (as if the words in order formed a sentence), and the channels be the output dimension of the embedding (numbers that define one word).
So:
batchSize = number of sentences
length = number of words in each sentence
channels = dimension of the embedding's output.
The convolutional layer will pass 100 different filters. Each filter slides along the length dimension (word by word, in groups of 4), considering all the channels that define each word.
The outputs are shaped as:
(number of sentences, 50 words, 100 output dimension or filters)
The filters are shaped as:
(4 = length, 300 = word vector dimension, 100 output dimension of the convolution)
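The shapes above can be reproduced with a small NumPy sketch of what the layer computes for one sentence. This is not the Keras implementation: the weights are random stand-ins, and the way the 'same' padding is split between the start and the end of the sentence is an assumption of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
length, channels, filters, kernel_size = 50, 300, 100, 4

sentence = rng.standard_normal((length, channels))               # one sentence: (50, 300)
kernel = rng.standard_normal((kernel_size, channels, filters))   # (4, 300, 100)
bias = np.zeros(filters)

# 'same' padding: add kernel_size - 1 zero rows in total so the output keeps length 50.
# The split used here (1 row before, 2 rows after) is an assumption of this sketch.
pad_total = kernel_size - 1
pad_left = pad_total // 2
padded = np.pad(sentence, ((pad_left, pad_total - pad_left), (0, 0)))

out = np.empty((length, filters))
for i in range(length):
    window = padded[i:i + kernel_size]   # 4 consecutive words, all 300 channels
    # Each filter multiplies the whole (4, 300) window element-wise and sums it.
    out[i] = np.tensordot(window, kernel, axes=([0, 1], [0, 1])) + bias

out = np.maximum(out, 0.0)               # activation='relu'

print(out.shape)                 # (50, 100)
print(kernel.size + bias.size)   # 120100 trainable parameters
```

So each of the 100 filters holds 4 x 300 weights, and every output row is a 100-dimensional summary of a 4-word window.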
The code below, based on the explanation above, may help. I had a similar question and answered it myself.
from tensorflow.keras.layers import MaxPool1D
import tensorflow.keras.backend as K
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Conv1D
tf.random.set_seed(1) # nowadays instead of tf.set_random_seed(1)
batch,rows,cols = 3,8,3
m, n, k = batch, rows, cols
input_shape = (batch,rows,cols)
np.random.seed(132) # nowadays instead of np.set_random_seed = 132
data = np.random.randint(low=1,high=6,size=input_shape,dtype='int32')
data = np.float32(data)
data = tf.constant(data)
print("Data:")
print(K.eval(data))
print()
print(f'm,n,k:{input_shape}')
#############################
# Understanding filters and kernel_size
##############################
num_filters=5
kernel_size= 3
'''
A few notes about kernel_size:
1. max_kernel_size == max_rows
2. Since this is Conv1D, each filter is a matrix of 1's with kernel_size rows
   (and one column per input channel):
if kernel_size = 1, [[1,1,1..]]
if kernel_size = 2, [[1,1,1..][1,1,1,..]]
if kernel_size = 3, [[1,1,1..][1,1,1,..][1,1,1,..]]
I have chosen tf.keras.initializers.constant(1) to create a matrix of ones.
The number of rows of the matrix is kernel_size.
'''
y = Conv1D(filters=num_filters, kernel_size=kernel_size,
           kernel_initializer=tf.keras.initializers.constant(1),
           # glorot_uniform(seed=12)
           input_shape=(n, k)  # (rows, cols) per sample, without the batch axis
           )(data)
#########################
# Checking the outcome
#########################
print(K.eval(y))
print(f' Resulting output_shape == (batch_size, num_rows-kernel_size+1,num_filters): {y.shape}')
# Verification: sum along axis=2, and then along axis=1.
# This total matches the layer output only when kernel_size == num_rows.
print(K.eval(tf.math.reduce_sum(data, axis=(2, 1), keepdims=True)))
###########################################
# Understanding MaxPool and strides
##########################################
pool = MaxPool1D(pool_size=3,strides=3)(y)
print(K.eval(pool))
print(f'Shape of Pool: {pool.shape}')
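The same computation can be cross-checked without TensorFlow. The NumPy sketch below assumes an all-ones kernel, zero bias, no padding and stride 1, as in the code above; the random data is a stand-in and will not reproduce the exact numbers printed by the TensorFlow version.

```python
import numpy as np

rng = np.random.default_rng(132)
batch, rows, cols = 3, 8, 3
num_filters, kernel_size = 5, 3

data = rng.integers(low=1, high=6, size=(batch, rows, cols)).astype(np.float32)

# Conv1D with an all-ones kernel and zero bias, no padding, stride 1:
# every output value is the sum of a (kernel_size, cols) window.
out_len = rows - kernel_size + 1
window_sums = np.stack(
    [data[:, i:i + kernel_size, :].sum(axis=(1, 2)) for i in range(out_len)], axis=1
)  # (batch, out_len)

# All filters are identical here, so the window sums are repeated per filter.
y = np.repeat(window_sums[:, :, np.newaxis], num_filters, axis=2)

# MaxPool1D(pool_size=3, strides=3): maximum over non-overlapping groups of 3 steps.
pool_size = 3
n_pools = (out_len - pool_size) // pool_size + 1
pool = y[:, :n_pools * pool_size, :].reshape(
    batch, n_pools, pool_size, num_filters
).max(axis=2)

print(y.shape, pool.shape)  # (3, 6, 5) (3, 2, 5)
```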