Total Number of Parameters in CNN - conv-neural-network

I want to know how to calculate the number of parameters in a CNN.
For example, for a 1D convolutional layer with an input size of 10, a filter size of 3, and a stride of 1, how many parameters should there be?
Thanks in advance!
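A minimal sketch of the arithmetic, assuming Keras and a single input channel and a single filter (neither is stated in the question): a Conv1D layer has kernel_size * in_channels * filters weights plus filters biases, and neither the input length (10) nor the stride affects that count.

import tensorflow as tf

inputs = tf.keras.Input(shape=(10, 1))                                  # length 10, 1 channel
outputs = tf.keras.layers.Conv1D(filters=1, kernel_size=3, strides=1)(inputs)
model = tf.keras.Model(inputs, outputs)
model.summary()                                                         # 3 * 1 * 1 + 1 = 4 trainable parameters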

Related

How to set up a validation set such that each minibatch contains enough labels for validation in pytorch?

I have a dataset as follows: features X, categorical features D (we assume D can take only three values: 0, 1, 2), and a response Y. I want to split the dataset into a training set, a validation set and a test set according to a specific ratio. However, the sample size may not be large enough, so after splitting, the validation set can be small. During minibatch validation it may then happen that a validation batch does not contain all the categorical values (say the batch contains only treatment 0). I would like every minibatch to contain all three categorical values; otherwise our customized loss function is not computable. How can I ensure that each minibatch contains all three categories?
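A hedged sketch of one way to do this; the AllCategoriesBatchSampler class below is my own construction, not an existing PyTorch API. It builds validation batches that are guaranteed to contain every value of D at least once.

import numpy as np
import torch
from torch.utils.data import DataLoader, Sampler, TensorDataset

class AllCategoriesBatchSampler(Sampler):
    """Yield batches of indices such that every batch holds each category in D."""
    def __init__(self, categories, batch_size):
        self.categories = np.asarray(categories)
        self.batch_size = batch_size
        self.values = np.unique(self.categories)
        if batch_size < len(self.values):
            raise ValueError("batch_size must be at least the number of categories")

    def __iter__(self):
        # One shuffled index pool per category value.
        pools = {v: np.random.permutation(np.flatnonzero(self.categories == v)).tolist()
                 for v in self.values}
        while all(pools[v] for v in self.values):
            # Seed the batch with one sample of every category ...
            batch = [pools[v].pop() for v in self.values]
            # ... then top it up round-robin from whichever pools still have samples.
            while len(batch) < self.batch_size and any(pools[v] for v in self.values):
                for v in self.values:
                    if pools[v] and len(batch) < self.batch_size:
                        batch.append(pools[v].pop())
            yield batch

# Toy validation split: features X_val, categorical D_val (0/1/2), response Y_val.
X_val = torch.randn(60, 4)
D_val = torch.randint(0, 3, (60,))
Y_val = torch.randn(60)

val_ds = TensorDataset(X_val, D_val, Y_val)
sampler = AllCategoriesBatchSampler(D_val.numpy(), batch_size=8)
val_loader = DataLoader(val_ds, batch_sampler=sampler)

for xb, db, yb in val_loader:
    assert set(db.tolist()) == {0, 1, 2}   # every batch sees all three categories

A stratified split (e.g. train_test_split with stratify=D) can also be used first so the validation set keeps the category proportions, with a sampler like the one above handling the batch-level guarantee.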

Gaussian Mixture model log-likelihood to likelihood-Sklearn

I want to calculate the likelihoods instead of the log-likelihoods. I know that score gives the per-sample average log-likelihood, so I need to multiply score by the sample size to get the total. But the log-likelihoods are very large negative numbers, such as -38567258.1157, and when I take np.exp(scores) I get zero. Any help is appreciated.
from sklearn.mixture import GaussianMixture

gmm = GaussianMixture(covariance_type="diag", n_components=2)  # two diagonal-covariance components
y_pred = gmm.fit_predict(X_test)
scores = gmm.score(X_test)   # average log-likelihood per sample
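A small self-contained check of the relationship the question relies on, using synthetic placeholder data (the X_test below is not your data): sklearn's score() returns the average per-sample log-likelihood and score_samples() the per-sample values, so the total log-likelihood is score(X) * len(X), which equals score_samples(X).sum().

import numpy as np
from sklearn.mixture import GaussianMixture

X_test = np.random.randn(500, 3)   # placeholder data for illustration
gmm = GaussianMixture(covariance_type="diag", n_components=2).fit(X_test)

total_loglik = gmm.score(X_test) * len(X_test)
assert np.isclose(total_loglik, gmm.score_samples(X_test).sum())

# np.exp() of a very large negative total underflows to 0.0 in float64, so the raw
# likelihood either has to be evaluated at higher precision (e.g. with an
# arbitrary-precision library) or, better, kept in log space.
print(total_loglik)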

Scaling new data LSTM [duplicate]

While applying min max scaling to normalize your features, do you apply min max scaling on the entire dataset before splitting it into training, validation and test data?
Or do you split first and then apply min max on each set, using the min and max values from that specific set?
Lastly, when making a prediction on a new input, should the features of that input be normalized using the min/max values from the training data before being fed into the network?
Split it, then scale. Imagine it this way: you have no idea what real-world data looks like, so you couldn't scale the training data to it. Your test data is the surrogate for real-world data, so you should treat it the same way.
To reiterate: Split, scale your training data, then use the scaling from your training data on the testing data.
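A minimal sketch of that recipe, assuming scikit-learn and plain NumPy arrays (the variable names are placeholders):

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

X = np.random.rand(200, 4)      # placeholder feature matrix
X_new = np.random.rand(5, 4)    # placeholder "future" inputs seen at prediction time

# Split first (shuffle=False keeps time order, which usually matters for an LSTM).
X_train, X_test = train_test_split(X, test_size=0.2, shuffle=False)

scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)   # learn min/max from the training data only
X_test_scaled = scaler.transform(X_test)         # reuse the training min/max on the test data
X_new_scaled = scaler.transform(X_new)           # and on any new input before prediction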

Arbitrarily chosen values as std/mean for normalization. Why?

I have a question regarding the z-score normalization method.
This method uses the z-score to normalize the values of the dataset and needs a mean/std.
I know that you are normally supposed to use the mean/std of the dataset.
But I have seen multiple tutorials on pytorch.org and elsewhere on the net that just use 0.5 for the mean/std, which seems completely arbitrary to me.
And I was wondering why they didn't use the mean/std of the dataset?
Example tutorials that just use 0.5 as mean/std:
https://pytorch.org/tutorials/beginner/dcgan_faces_tutorial.html
https://medium.com/ai-society/gans-from-scratch-1-a-deep-introduction-with-code-in-pytorch-and-tensorflow-cb03cdcdba0f
https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html#sphx-glr-beginner-blitz-cifar10-tutorial-py
If you use the mean/std of your dataset to normalize that same dataset, the normalized data will have a mean of 0 and a std of 1,
but its min/max values are not confined to any fixed range.
If you instead use 0.5 for both mean and std, and the input values lie in [0, 1] (as they do after ToTensor), the normalized dataset will lie in the range -1 to 1.
The mean of the normalized dataset will then be close to zero, and since dividing by 0.5 doubles the original spread, a dataset whose original std is around 0.25 (typical for image data) ends up with a std close to 0.5.
So, to answer my own question: you use 0.5 as mean/std when you want the dataset to lie in the range -1 to 1,
which is beneficial when using, for example, a tanh activation function in a neural network, whose outputs also lie in that range.
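A small sketch of that mapping, assuming torchvision and an input already scaled to [0, 1] (as transforms.ToTensor() produces): Normalize(mean=0.5, std=0.5) computes (x - 0.5) / 0.5 and therefore maps the data to [-1, 1].

import torch
from torchvision import transforms

x = torch.rand(3, 8, 8)   # fake RGB "image" with values in [0, 1]
norm = transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
y = norm(x)

print(y.min().item(), y.max().item())   # both lie within [-1, 1]
print(y.mean().item())                  # close to 0; the std depends on the original spread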

2DPooling in Keras doesn't pool last column

When performing 2D pooling in Keras over an input whose spatial dimensions are even, say 8x24x128, the output is 4x12x128 if 2x2 pooling is used, as expected. When the input has an odd dimension, say 8x25x128, the output is still 4x12x128: the pooling does NOT operate on the last (25th) column of the input. I would like to zero-pad the input to 8x26x128 with an extra column of zeros. Is this possible?
In general terms: what is the proper etiquette for pooling over odd-dimensional inputs?
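A hedged sketch, assuming the tf.keras API: two ways to keep the last column when the spatial size is odd are to pad explicitly with ZeroPadding2D before pooling, or to let the pooling layer zero-pad itself with padding="same", which makes the output size ceil(25 / 2) = 13.

import tensorflow as tf
from tensorflow.keras import layers

inp = tf.keras.Input(shape=(8, 25, 128))
padded = layers.ZeroPadding2D(padding=((0, 0), (0, 1)))(inp)              # 8x25 -> 8x26
pooled_explicit = layers.MaxPooling2D(pool_size=(2, 2))(padded)           # 8x26 -> 4x13
pooled_same = layers.MaxPooling2D(pool_size=(2, 2), padding="same")(inp)  # 8x25 -> 4x13

print(pooled_explicit.shape, pooled_same.shape)   # both (None, 4, 13, 128)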
