When training deep semantic segmentation models, it is often convenient to visualize a sample of predictions on the validation set during the training. Right now I'm simply saving some predictions to disk on my training server. I'm looking to migrate this task to TensorBoard. Simply put, I wan't to visualize a set of predictions (say 5) over each epoch.
I know there is a simple way to do it in pure TensorFlow like tf.summary.image(..) but I don't see any easy way to incorporate this into the Keras TensorBoard callback.
Any guidance would be much appreciated.
Fabio Perez provided an answer which should do exactly what you're looking for here How to display custom images in TensorBoard using Keras?
Related
I'm using Windows 10 machine. Libraries: Keras with Tensorflow 2.0 Embeddings: Glove(100 dimensions).
I am trying to implement an LSTM architecture for multi-label text classification.
I am using different types of fine-tuning to achieve better results but with no luck so far.
The main problem I believe is the difference in class distributions of my dataset but after a lot of tries and errors, I couldn't implement stratified-k-split in Keras.
I am also experimenting with dropout layers, batch sizes, # of layers, learning rates, clip values, validation splits but I get a minimum boost or worst performance sometimes.
For metrics, I use mainly ROC and F1.
I also followed the suggestion from a StackOverflow member who said to delete some of my examples so I can balance my dataset but if I do that I will have a very low number of examples.
What would you suggest to me?
If someone can provide code based on my implementation for
stratified-k-split I would be grateful cause I have checked all the
online resources but can't implement it.
Any tips, suggestions will be really helpful.
Metrics Plots
Dataset form+Embedings form+train-test-split form
Dataset's labels distribution
My LSTM implementation
I'm using Windows 10 machine.
Libraries: Keras with Tensorflow 2.0
Embeddings:Glove(100 dimensions)
I am trying to implement an LSTM architecture for multi-label text classification.
My problem is that no matter how much fine-tuning I do, the results are really bad.
I am not experienced in DL practical implementations that's why I ask for your advice.
Below I will state basic information about my dataset and my model so far.
I can't embed images since I am a new member so they appear as links.
Dataset form+Embedings form+train-test-split form
Dataset's labels distribution
My Implementation of LSTM
Model's Summary
Model's Accuracy plot
Model's Loss plot
As you can see my dataset is really small (~6.000 examples) and maybe that's one reason why I cannot achieve better results. Still, I chose it because it's unbiased.
I'd like to know if there is any fundamental mistake in my code regarding the dimensions, shape, activation functions, and loss functions for multi-label text classification?
What would you recommend to achieve better results on my model? Also any general advice regarding optimizing, methods,# of nodes, layers, dropouts, etc is very welcome.
Model's best val accuracy that I achieved so far is ~0.54 and even if I tried to raise it, it seems stuck there.
There are many ways to get this wrong but the most common mistake is to get your model overfit the training data.
I suspect that 0.54 accuracy means that your model selects the most common label (offensive) for almost all cases.
So, consider one of these simple solutions:
Create balanced training data: like 400 samples from each class.
or sample balanced batches for training (exactly the same number of labels on each training batch)
In addition to tracking accuracy and loss, look at precision-recall-f1 or even better try plotting area under curve, maybe different classes need different thresholds of activation. (If you are using Sigmoid on last layer maybe one class could perform better with 0.2 activations and another class with 0.7)
first try simple model. embedding 1 layer LSTM than classify
how to tokenize text , is vocab size enough ?
try dice loss
I'm trying to do domain adversarial training using gradient-reversal procedure. I have a deep-learning model architecture consisting of 5 dense layers. Now I want to extract the gradient, reverse them and then update the weights. I am not sure how to extract the gradients and finally how to add them to update the weights. I have gone through some example codes but I am still pretty doubtful regarding using Keras backend. Any help with some toy code or example with explanation is really appreciated.
Thank you,
How can I test my pytorch model on validation data during training?
I know that there is the function myNet.eval() which apparantly switches of any dropout layers, but is it also preventing the gradients from being accumulated?
Also how would I undo the myNet.eval() command in order to continue with the training?
If anyone has some code snippet / toy example I would be grateful!
How can I test my pytorch model on validation data during training?
There are plenty examples where there are train and test steps for every epoch during training. An easy one would be the official MNIST example. Since pytorch does not offer any high-level training, validation or scoring framework you have to write it yourself. Commonly this consists of
a data loader (commonly based on torch.utils.dataloader.Dataloader)
a main loop over the total number of epochs
a train() function that uses training data to optimize the model
a test() or valid() function to measure the effectiveness of the model given validation data and a metric
This is also what you will find in the linked example.
Alternatively you can use a framework that provides basic looping and validation facilities so you don't have to implement everything by yourself all the time.
tnt is torchnet for pytorch, supplying you with different metrics (such as accuracy) and abstraction of the train loop. See this MNIST example.
inferno and torchsample attempt to model things very similar to Keras and provide some tools for validation
skorch is a scikit-learn wrapper for pytorch that lets you use all the tools and metrics from sklearn
Also how would I undo the myNet.eval() command in order to continue with the training?
myNet.train() or, alternatively, supply a boolean to switch between eval and training: myNet.train(True) for train mode.
I know that there is the function myNet.eval() which apparantly switches of any dropout layers, but is it also preventing the gradients from being accumulated?
It doesn't prevent gradients from accumulating.
But I think during testing, you do want to ignore gradients. In that case, you should mark the variable input to the network as volatile=True, and it will save some time and space used in forward calculation.
Also how would I undo the myNet.eval() command in order to continue with the training?
myNet.train()
after training since it cost a lot of time is there a way for me to continue my training and add samples using nusvc() and nearestneighbor() in scikitlearn?
For the SVM, you might be able to use the online learning abilities of the SGDClassifier class. To do so, you would need to use the partial_fit() function.