keras prediction output's decimal point without scientific notation? - keras

I have a question from studying CNNs with Keras.
How do I output the results of the model's predictions with a fixed decimal point?
The results of my current model are array([[6.527474e-05, 5.269228e-05, 9.998820e-01]], dtype=float32)
I want my model to output:
(9.998820e-01) -> 0.9998820

Both are actually the same number:
(9.998820e-01) and 0.9998820
You can enable this option to display the numbers without scientific notation when printing NumPy arrays:
np.set_printoptions(suppress=True)
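For illustration, a minimal sketch of the effect (the array values are the ones from the question):

import numpy as np

preds = np.array([[6.527474e-05, 5.269228e-05, 9.998820e-01]], dtype=np.float32)
print(preds)                        # scientific notation, e.g. 9.998820e-01

np.set_printoptions(suppress=True)  # affects all subsequent array printing
print(preds)                        # plain decimals, e.g. 0.999882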

Related

Why does my variational autoencoder only produce positive values?

I copied this example to build a variational autoencoder (VAE). The example uses images, but I use it for a signal that contains negative values. After training, the autoencoder only reconstructs the positive part of the signal; it does not produce negative values. Can anyone spot where the problem is or explain why this is the case?
If you used the exact code shown in the example you linked, then at the end of the decoder you have x = torch.sigmoid(self.decConv2(x)), which takes the real number line and outputs numbers in [0, 1]. This is why the network is unable to output negative numbers.
If you want the model to output negative numbers as well, remove the sigmoid function.
This of course means you also have to change the loss function you train the model with, since the BCE loss is only suitable for outputs in the range [0, 1].
As a recommendation, whenever your targets are in [0, 1], I would suggest using the BCE-with-logits loss and avoiding the sigmoid in the decoder, since it incorporates the sigmoid and the BCE loss in a more numerically stable manner.
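A minimal, self-contained sketch of both changes (the Linear layer stands in for the decConv2 layer from the example, and MSE is one common reconstruction loss for real-valued signals; both are my assumptions, not code from the linked example):

import torch
import torch.nn as nn

decoder_out = nn.Linear(16, 1)        # stand-in for the example's decConv2 layer

z = torch.randn(4, 16)
recon = decoder_out(z)                # no torch.sigmoid: outputs can be negative

target = torch.randn(4, 1)            # a signal containing negative values
loss = nn.MSELoss()(recon, target)    # reconstruction loss defined on all reals
loss.backward()

# If the targets were instead in [0, 1], nn.BCEWithLogitsLoss()(recon, target)
# would fold the sigmoid into the loss in a numerically stable way.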

TensorFlow Keras: problems handling variable-length input using a generator?

We want to train our model on inputs of varying dimensions. Every input within a batch and across batches has different dimensions.
We cannot resize our input (since we'd lose our microscopic features). Since we cannot resize the input, converting it into batches as a NumPy array becomes impossible. To handle this, I have put the inputs in a list, where each element has shape (height, width, 1). Height varies; width is constant.
Sometimes my input is excessively large, so I plan to use model.fit_generator(). In the generator, we find the max height and width of the inputs in a batch and pad every other input with zeros so that every input in the batch has equal dimensions. Now we can easily convert the batch to a NumPy array or a tensor and pass it to fit_generator(). The model automatically learns to ignore the zeros and learns features from the intended portion of the padded input. This way every batch has equal input dimensions internally, but every batch has a different shape (due to the difference in max height and width across batches). A sketch of such a generator follows the questions below.
So far I have described what I have learned and what I plan to do with variable-size input data. But I am stuck on the following confusions:
1- I plan to use a CNN first and then an LSTM on top. I am using TensorFlow Keras, which provides padding and masking facilities. As far as I know, an LSTM can work with masking and ignore 0-padded values. However, I am concerned about the CNN (does a CNN ignore 0-padded values?), because my padded input will first be fed to the CNN. I have seen some discussion in the following links:
How to apply masking layer to sequential CNN model in Keras?
https://github.com/keras-team/keras/issues/411
In these links, they mention that unfortunately masking is not yet supported by the Keras Conv layers. However, there has since been a lot of development and advancement, specifically in TensorFlow Keras. So I am wondering: does TensorFlow Keras now support masking for Conv layers?
2- To use a generator, we can write a custom Keras generator. For that I went through a very good tutorial and decided to use it. But I am wondering: is there any built-in facility in TensorFlow Keras for generators that would save me from writing a custom one?
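A minimal sketch of the batch-padding generator described above, using tf.keras.utils.Sequence (the class name PaddedBatchGenerator and the assumption that the labels fit in a flat array are mine):

import numpy as np
import tensorflow as tf

class PaddedBatchGenerator(tf.keras.utils.Sequence):
    """Yields batches zero-padded to the max height within each batch."""
    def __init__(self, samples, labels, batch_size=8):
        self.samples = samples      # list of arrays, each of shape (height, width, 1)
        self.labels = labels
        self.batch_size = batch_size

    def __len__(self):
        return int(np.ceil(len(self.samples) / self.batch_size))

    def __getitem__(self, idx):
        sl = slice(idx * self.batch_size, (idx + 1) * self.batch_size)
        batch, y = self.samples[sl], np.asarray(self.labels[sl])
        max_h = max(s.shape[0] for s in batch)   # width is constant, height varies
        x = np.zeros((len(batch), max_h, batch[0].shape[1], 1), dtype=np.float32)
        for i, s in enumerate(batch):
            x[i, :s.shape[0]] = s                # zero-pad at the bottom
        return x, y

# model.fit(PaddedBatchGenerator(samples, labels)) then trains on batches
# whose height differs from batch to batch.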

How does keras obtain a floating point value for loss from a vector of loss values?

I am training an ML model in TensorFlow 1.15 using a custom training loop and I want to print out the loss; however, it is a vector with a dimension equal to the batch size. What does Keras do when you call model.fit() to print the loss as a float? Does it reduce it to the mean of the vector, or does it perform some other reduction?
The reason I am asking is that I want to make sure the loss I am logging is consistent across my models, and the others did not require a custom training loop.
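For reference, a minimal sketch of the reduction I believe tf.keras applies by default (its built-in losses use Reduction.SUM_OVER_BATCH_SIZE, which for a plain per-sample loss vector amounts to the mean):

import tensorflow as tf
tf.enable_eager_execution()                     # TF 1.x only; eager is the default in TF 2

per_sample = tf.constant([0.3, 0.7, 0.5, 0.1])  # per-sample losses, shape (batch_size,)
scalar_loss = tf.reduce_mean(per_sample)        # the single float that fit() would log
print(float(scalar_loss))                       # 0.4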

What steps should I take next to improve my accuracy? Can the data be the problem?

I built various ML models using sklearn for a binary classification problem. The data-set was provided to me by my professor for this comparative study.
My Jupyter notebook and dataset can be found here
As I am getting very low accuracy, I fear that I must be doing something wrong while building the model. So I tested my decision tree on the built-in breast cancer data-set in sklearn, which is very similar to my data-set as both are binary classification problems. There I get a mean accuracy of 95%. So I now think that the problem might be my data-set. Can I get some help on how to pre-process my data, or any other steps I might look into to improve accuracy?
Encode labels
Categorical data are variables that contain label values rather than numeric values. The number of possible values is often limited to a fixed set.
For example, users are typically described by country, gender, age group, etc. We will use LabelEncoder to encode the categorical data. LabelEncoder is part of the scikit-learn library in Python and is used to convert categorical (text) data into numbers, which our predictive models can better understand.
# Encoding categorical data values
from sklearn.preprocessing import LabelEncoder
labelencoder_Y = LabelEncoder()
Y = labelencoder_Y.fit_transform(Y)
Feature scaling
Most of the time, your dataset will contain features that vary highly in magnitude, units, and range. Since most machine learning algorithms use the Euclidean distance between two data points in their computations, we need to bring all features to the same level of magnitude. This can be achieved by scaling, which means transforming your data so that it fits within a specific range, like 0–100 or 0–1. We will use the StandardScaler class from the scikit-learn library.
# Feature scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
Choosing the right model
You might also want to choose an appropriate model. You can't just use neural nets for every problem; that's the no free lunch theorem. To compare candidate models you could use k-fold cross-validation, AIC, and BIC.
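A minimal sketch of such a comparison with k-fold cross-validation (using sklearn's built-in breast cancer data-set, the same one the asker used as a sanity check; the choice of candidate models is mine):

from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# Mean accuracy over 5 folds for each candidate model
for model in (DecisionTreeClassifier(), LogisticRegression(max_iter=5000)):
    scores = cross_val_score(model, X, y, cv=5)
    print(type(model).__name__, scores.mean())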

sklearn NB classifier: How to get the actual probabilities of individual samples?

I am making a machine learning program that classifies words into one of the following categories: Hardware, Software, None_of_these. I use the Multinomial Naive Bayes classifier from sklearn.
The predict() function gives me the prediction for every word; however, I can't see the actual probability (a float ranging from 0 to 1.0) that the word matches the predicted category. I didn't find this on sklearn's site either.
Is there a function which gives me the probability of every sample?
Never mind, I found the solution:
predict_proba(X) returns probability estimates for the test vector X.
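A minimal, self-contained sketch of predict_proba with MultinomialNB (the toy training words and the CountVectorizer feature step are my assumptions, not from the question):

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

words = ["ssd", "motherboard", "compiler", "python", "banana", "holiday"]
labels = ["Hardware", "Hardware", "Software", "Software", "None_of_these", "None_of_these"]

vec = CountVectorizer(analyzer="char_wb", ngram_range=(2, 3))  # character n-grams suit single words
X = vec.fit_transform(words)
clf = MultinomialNB().fit(X, labels)

X_new = vec.transform(["pythonic"])
print(clf.predict(X_new))        # predicted category
print(clf.classes_)              # column order of the probabilities
print(clf.predict_proba(X_new))  # one probability per class, each row sums to 1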
