The number of MAC operations in a CONV layer is given by filter_height * filter_width * in_channels * out_height * out_width * out_channels.
I understand that multiplication and accumulation happen together every cycle. Does this formula also include the bias term addition?
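As a rough sketch of the arithmetic: the product above has no bias term in it. If a bias is used, it is usually counted separately as one extra addition per output element (how an accelerator actually schedules that addition is hardware-specific, so treat the second function as an assumption). The function names here are hypothetical.

```python
def conv_mac_count(filter_height, filter_width, in_channels,
                   out_height, out_width, out_channels):
    """MACs for one convolution layer, exactly as in the formula above."""
    macs_per_output = filter_height * filter_width * in_channels
    num_outputs = out_height * out_width * out_channels
    return macs_per_output * num_outputs


def conv_bias_add_count(out_height, out_width, out_channels):
    """Assumed accounting: one bias addition per output element."""
    return out_height * out_width * out_channels


# Example: 5x5 filter, 10 input channels, 56x56x5 output.
macs = conv_mac_count(5, 5, 10, 56, 56, 5)
bias_adds = conv_bias_add_count(56, 56, 5)
print(macs, bias_adds)
```

The bias additions are smaller than the MAC count by a factor of filter_height * filter_width * in_channels, which is why they are often ignored.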
In Keras VAE implementation:
class Sampling(layers.Layer):
    """Uses (z_mean, z_log_var) to sample z, the vector encoding a digit."""

    def call(self, inputs):
        z_mean, z_log_var = inputs
        batch = tf.shape(z_mean)[0]
        dim = tf.shape(z_mean)[1]
        epsilon = tf.keras.backend.random_normal(shape=(batch, dim))
        return z_mean + tf.exp(0.5 * z_log_var) * epsilon
My question is about the tf.exp(0.5 * z_log_var) part: why do we use the exponential instead of using the value as it is? That is, why not just return z_mean + z_log_var * epsilon?
I am using it for tabular data and not images. I mean, I am using dense layers and not Conv layers.
First, as the name suggests, the encoder outputs the log variance. This ensures that the variance is always positive (exp(z_log_var) > 0), because the exponential function is always positive.
What this snippet does is called the reparameterization trick (as discussed here). The key idea is that a sample from any normal distribution can be obtained by sampling from a standard Gaussian and then scaling and shifting it: z = mu + sigma * epsilon with epsilon ~ N(0, 1). This requires the standard deviation, and exp(0.5 * z_log_var) converts the log variance to the standard deviation, since sigma = sqrt(exp(z_log_var)) = exp(0.5 * z_log_var).
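The conversion can be checked numerically. This is a minimal plain-Python sketch of the same computation for a single scalar, not the Keras layer itself; the function names are hypothetical.

```python
import math
import random


def log_var_to_std(z_log_var):
    """sigma = sqrt(exp(log_var)) = exp(0.5 * log_var); always positive."""
    return math.exp(0.5 * z_log_var)


def sample_z(z_mean, z_log_var, rng=random):
    """Reparameterization: z = mu + sigma * epsilon, epsilon ~ N(0, 1)."""
    epsilon = rng.gauss(0.0, 1.0)
    return z_mean + log_var_to_std(z_log_var) * epsilon


# A variance of 4 (log variance log(4)) corresponds to a standard deviation of 2.
rng = random.Random(0)
samples = [sample_z(1.0, math.log(4.0), rng) for _ in range(20000)]
sample_mean = sum(samples) / len(samples)
sample_std = math.sqrt(sum((s - sample_mean) ** 2 for s in samples) / len(samples))
print(sample_mean, sample_std)
```

Using z_log_var directly as the scale would allow negative or zero "standard deviations", because the encoder output is unconstrained.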
In the leading deep learning libraries, does the filter (aka kernel or weight) in the convolutional layer also convolve across the "channel" dimension, or does it take all the channels at once?
For example, if the input dimension is (60,60,10) (where the last dimension is often referred to as "channels") and the desired number of output channels is 5, can the filter be (5,5,5,5) or should it be (5,5,10,5) instead?
It should be (5, 5, 10, 5). Conv2d operation is just like Linear if you ignore the spatial dimensions.
From TensorFlow documentation [link]:
Given an input tensor of shape batch_shape + [in_height, in_width, in_channels] and a filter / kernel tensor of shape [filter_height, filter_width, in_channels, out_channels], this op performs the following:
Flattens the filter to a 2-D matrix with shape [filter_height * filter_width * in_channels, output_channels].
Extracts image patches from the input tensor to form a virtual tensor of shape [batch, out_height, out_width, filter_height * filter_width * in_channels].
For each patch, right-multiplies the filter matrix and the image patch vector.
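The three steps quoted above can be sketched in plain Python for a single image with stride 1 and no padding. This is only an illustration of the idea using nested lists (the function name is hypothetical), not how TensorFlow implements it.

```python
def conv2d_as_matmul(image, kernel):
    """image: [in_height][in_width][in_channels]
    kernel: [filter_height][filter_width][in_channels][out_channels]
    Returns output of shape [out_height][out_width][out_channels]."""
    in_h, in_w, in_c = len(image), len(image[0]), len(image[0][0])
    k_h, k_w, k_c = len(kernel), len(kernel[0]), len(kernel[0][0])
    out_c = len(kernel[0][0][0])
    if k_c != in_c:
        raise ValueError("filter in_channels must equal input in_channels")

    # Step 1: flatten the filter to [k_h * k_w * in_c, out_c].
    flat_kernel = [kernel[i][j][c]
                   for i in range(k_h) for j in range(k_w) for c in range(in_c)]

    out_h, out_w = in_h - k_h + 1, in_w - k_w + 1
    output = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            # Step 2: extract the patch as a vector of length k_h * k_w * in_c.
            patch = [image[y + i][x + j][c]
                     for i in range(k_h) for j in range(k_w) for c in range(in_c)]
            # Step 3: multiply the patch vector by the flattened filter matrix.
            row.append([sum(p * w[o] for p, w in zip(patch, flat_kernel))
                        for o in range(out_c)])
        output.append(row)
    return output


# A 6x6 image with 10 channels and a (5, 5, 10, 5) filter, all ones.
image = [[[1.0] * 10 for _ in range(6)] for _ in range(6)]
kernel = [[[[1.0] * 5 for _ in range(10)] for _ in range(5)] for _ in range(5)]
result = conv2d_as_matmul(image, kernel)
print(len(result), len(result[0]), len(result[0][0]), result[0][0][0])
```

Each output value sums over all 5 * 5 * 10 entries of the patch, which is why the filter's third dimension has to match the number of input channels.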
It takes all channels at once, so 5×5×10×5 should be right.
julia> using Flux
julia> c = Conv((5,5), 10 => 5); # make a layer, 10 channels to 5
julia> c.weight |> summary
"5×5×10×5 Array{Float32, 4}"
julia> c(randn(Float32, 60, 60, 10, 1)) |> summary # check it works
"56×56×5×1 Array{Float32, 4}"
julia> Conv(rand(Float32, (5,5,5,5))) # different weight size
Conv((5, 5), 5 => 5) # 630 parameters
How can we calculate the output shape of a conv1d layer in PyTorch? Is there any command to calculate the size and shape of these layers in PyTorch?
nn.Conv1d(depth_1, depth_2, kernel_size=kernel_size_2, stride=stride_size),
nn.ReLU(),
nn.MaxPool1d(kernel_size=2, stride=stride_size),
nn.Dropout(0.25)
The output size can be calculated as shown in the documentation (nn.Conv1d - Shape):
L_out = floor((L_in + 2 * padding - dilation * (kernel_size - 1) - 1) / stride + 1)
The batch size remains unchanged and you already know the number of channels, since you specified them when creating the convolution (depth_2 in this example).
Only the length needs to be calculated and you can do that with a simple function analogous to the formula above:
def calculate_output_length(length_in, kernel_size, stride=1, padding=0, dilation=1):
    return (length_in + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1
The default values specified are also the default values of nn.Conv1d, therefore you only need to specify what you also specify to create the convolution. It uses integer division //, because the numerator might not be divisible by stride, in which case the result is rounded down (indicated by the floor brackets in the formula, which are closed only at the bottom).
The same formula also applies to nn.MaxPool1d, but keep in mind that it automatically sets stride = kernel_size if stride is not specified.
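As a quick illustration, the function can be chained through a convolution followed by a pooling layer. The concrete numbers below (input length 100, kernel size 5, stride 2) are made up for the example and are not taken from the question.

```python
def calculate_output_length(length_in, kernel_size, stride=1, padding=0, dilation=1):
    # Same formula as in the nn.Conv1d documentation, with floor via integer division.
    return (length_in + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


length_in = 100  # hypothetical input length

# nn.Conv1d(..., kernel_size=5, stride=2)
after_conv = calculate_output_length(length_in, kernel_size=5, stride=2)

# nn.MaxPool1d(kernel_size=2, stride=2); ReLU and Dropout do not change the length.
after_pool = calculate_output_length(after_conv, kernel_size=2, stride=2)

print(after_conv, after_pool)
```

The alternative, if PyTorch is at hand, is to pass a dummy tensor of the right shape through the layers and read the shape of the result.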
I am working on a multi-class semantic segmentation task, and would like to define a custom, weighted metric for calculating how well my NN is performing.
I am using U-net to segment my image into one of 8 classes, of which 1-7 are the particular classes and 0 is background. How do I use the standard custom metric template defined on the Keras metrics page, so that I get the IoU of channels 1-7 only, multiplied by a (1,7) weights array? I tried removing the background channel in the custom metric by using
y_true, y_pred = y_true[1:,:,:], y_pred[1:, :,:]
but it does not look like that's what I want. Any help will be appreciated.
The change that was necessary:
def dice_coef_multilabel(y_true, y_pred, numLabels=CLASSES):
    # Accumulate the negative Dice coefficient per channel, so lower is better.
    dice = 0
    for index in range(numLabels):
        dice -= dice_coef(y_true[:,:,index], y_pred[:,:,index])
    return dice
If needed, the dice coefficient can be calculated across channels by using two nested loops to loop over all the channel combinations. I'm also including the dice coefficient calculation.
def dice_coef(y_true, y_pred, smooth=1.0):
    # smooth avoids division by zero; the default of 1.0 is an assumption,
    # since the original snippet leaves smooth undefined.
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    return (2. * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)
FWIW, this github link has various types of metrics implemented in a channel-wise manner.
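For the weighted IoU asked about in the question, the underlying arithmetic can be sketched in plain Python on flat lists of class labels. This is not a Keras metric; the function name, the weights, and the choice to skip classes that appear in neither list are all assumptions made for illustration. In a Keras metric the background would be dropped on the channel (last) axis, whereas y_true[1:,:,:] slices the first axis.

```python
def weighted_iou(true_labels, pred_labels, class_weights):
    """Weighted mean IoU over the classes listed in class_weights.

    class_weights maps class id -> weight. Classes that are not in the
    mapping (for example background, class 0) are ignored entirely.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for cls, weight in class_weights.items():
        pairs = list(zip(true_labels, pred_labels))
        intersection = sum(1 for t, p in pairs if t == cls and p == cls)
        union = sum(1 for t, p in pairs if t == cls or p == cls)
        if union == 0:
            # Assumed convention: a class absent from both lists is skipped.
            continue
        weighted_sum += weight * intersection / union
        weight_total += weight
    return weighted_sum / weight_total if weight_total else 0.0


true_labels = [0, 1, 1, 2, 2, 0]
pred_labels = [0, 1, 2, 2, 2, 1]
# Class 1 has IoU 1/3, class 2 has IoU 2/3; background is not in the weights.
score = weighted_iou(true_labels, pred_labels, {1: 1.0, 2: 3.0})
print(score)
```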
It's known that sparse_categorical_crossentropy in Keras averages the loss over all categories. But what if I care most about one particular category? For example, if I want to use the precision (= TP / (TP + FP)) of this category as the loss function, how can I write it? Thanks!
My code looks like this:
from keras import backend as K
def my_loss(y_true,y_pred):
    y_true = K.cast(y_true,"float32")
    y_pred = K.cast(K.argmax(y_pred),"float32")
    nominator = K.sum(K.cast(K.equal(y_true,y_pred) & K.equal(y_true, 0),"float32"))
    denominator = K.sum(K.cast(K.equal(y_pred,0),"float32"))
    return -(nominator + K.epsilon()) / (denominator + K.epsilon())
And the error is like:
argmax is not differentiable
I don't recommend using precision as the loss function.
It is not differentiable, so it cannot be used as a loss function for a neural network.
It can also be gamed by predicting almost every instance as negative, which makes no sense.
One alternative is to use F1 as the loss function, then tune the probability cut-off manually to obtain the desired level of precision while making sure recall is not too low.
You can pass to the fit method a parameter class_weight where you determine which classes are more important.
It should be a dictionary:
{
    0: 1,    # class 0 has weight 1
    1: 0.5,  # class 1 has half the importance of class 0
    2: 0.7,  # ....
    ...
}
Custom loss
If that is not exactly what you need, you can create loss functions like:
import keras.backend as K
def customLoss(yTrue, yPred):
    # Create operations with yTrue and yPred:
    # - yTrue = the true output data (equal to y_train in most examples)
    # - yPred = the model's calculated output
    # - yTrue and yPred have exactly the same shape: (batch_size,output_dimensions,....)
    #   - according to the output shape of the last layer
    #   - also according to the shape of y_train
    # All operations must be like +, -, *, / or operations from K (backend).
    return someResultingTensor
You cannot use argmax, as it is not differentiable. That means backprop will not work, because the loss function can't be differentiated.
Instead of using argmax, multiply y_true * y_pred (with y_true one-hot encoded), which keeps the expression differentiable.
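One way to read this suggestion is a "soft" precision that uses the predicted probabilities directly instead of hard argmax decisions. The sketch below shows only the arithmetic in plain Python; the function name and the one-hot input format are assumptions. In Keras the same sums would be written with backend operations such as K.sum so that gradients can flow.

```python
def soft_precision_loss(y_true_one_hot, y_pred_probs, target_class, epsilon=1e-7):
    """Negative soft precision for one class.

    y_true_one_hot: list of one-hot rows, one per sample.
    y_pred_probs:   list of probability rows (e.g. softmax output).
    """
    # Soft true positives: predicted probability mass on samples that truly
    # belong to the target class (this is the y_true * y_pred product).
    soft_tp = sum(t[target_class] * p[target_class]
                  for t, p in zip(y_true_one_hot, y_pred_probs))
    # Soft predicted positives: all probability mass assigned to the class.
    soft_predicted = sum(p[target_class] for p in y_pred_probs)
    return -(soft_tp + epsilon) / (soft_predicted + epsilon)


y_true = [[1, 0], [0, 1], [1, 0]]
y_pred = [[0.9, 0.1], [0.4, 0.6], [0.5, 0.5]]
loss = soft_precision_loss(y_true, y_pred, target_class=0)
print(loss)
```

As noted above, optimizing precision alone can still be gamed, so this would normally be combined with a recall-sensitive term or used only as a monitoring metric.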