How to write evaluation code for image classification?

I'm having problems writing the code for the model to evaluate image data it has not seen and classify it by itself. The model is trained, but I'm stuck on writing the evaluation part. The model predicts the same class for every image, where there should be two classes. This is the code I put down for the evaluation

Related

Hugging face transformer: model bio_ClinicalBERT not trained for any of the task?

This may be the most beginner question of all :sweat:.
I just started learning about NLP and Hugging Face. The first thing I'm trying to do is to apply one of the bioBERT models to some clinical note data and see what I get, before moving on to fine-tuning the model. "emilyalsentzer/Bio_ClinicalBERT" looks like the closest model for my data.
But whenever I try to use it for any of the analyses, I get this warning:
Some weights of the model checkpoint at emilyalsentzer/Bio_ClinicalBERT were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.dense.bias', 'cls.seq_relationship.bias', 'cls.predictions.transform.dense.weight', 'cls.seq_relationship.weight', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight']
From chapter 2 of the Hugging Face course, I understand this to mean:
This is because BERT has not been pretrained on classifying pairs of sentences, so the head of the pretrained model has been discarded and a new head suitable for sequence classification has been added instead. The warnings indicate that some weights were not used (the ones corresponding to the dropped pretraining head) and that some others were randomly initialized (the ones for the new head). It concludes by encouraging you to train the model, which is exactly what we are going to do now.
So I went on to test which NLP task I can use "emilyalsentzer/Bio_ClinicalBERT" for, out of the box.
from transformers import pipeline, AutoModel

checkpoint = "emilyalsentzer/Bio_ClinicalBERT"
nlp_task = ['conversational', 'feature-extraction', 'fill-mask', 'ner',
            'question-answering', 'sentiment-analysis', 'text-classification',
            'token-classification',
            'zero-shot-classification']
for task in nlp_task:
    print(task)
    process = pipeline(task=task, model=checkpoint)
And I got the same warning message for all the NLP tasks, so it appears that I am advised not to use the model for any of them. This really confuses me. The original Bio_ClinicalBERT paper stated that the authors had good results on a few different tasks, so surely the model was trained for those tasks. I have a similar issue with other models as well: a blog post or research paper says a model obtained good results on a specific task, but when I try to apply it with pipeline it gives the warning message. Is there any reason why the head layers were not included in the model?
I only have a few hundred clinical notes (also unannotated :frowning_face:), so the dataset doesn't look big enough for training. Is there any way I could use the model on my data without training?
Thank you for your time.
This Bio_ClinicalBERT model is trained for the masked language modeling (MLM) task. This task is basically used for learning the semantic relations between tokens in a language/domain, so the checkpoint ships without a head for any downstream task. For downstream tasks, you can fine-tune the model's head with your small dataset, or you can use a fine-tuned model like Bio_ClinicalBERT-finetuned-medicalcondition, which is a fine-tuned version of the same model. You can find all the fine-tuned models on Hugging Face by searching 'bio-clinicalBERT' as in the link.

How do I evaluate documents on a trained model in Document Understanding

I am trying to extract data from pdf. I have trained a document understanding model (invoices) on my dataset and retrained it multiple times. Now I want to evaluate the model on some unseen documents. What is the procedure for that? This is what I think but I am not sure:
First import the evaluation docs into data manager as a separate batch and make sure to check the "make this an evaluation set" option.
Then label the docs. I use the predict option present in the data manager to make the labeling faster.
Export the evaluation data. This exported data will contain all the training and evaluation docs.
Run the evaluation pipeline on this dataset.
This is my understanding of the process, and I may be wrong. Kindly let me know if I am going in the right direction!
What I Tried: I tried the above procedure.
What happened: I think along with evaluating the evaluation docs it also evaluated the train docs as the evaluation metrics file contained all the train and evaluation docs.

Cascading two or more LSTM models

I am working on a case study where I want to compare the performance of a standard LSTM model with that of cascaded LSTM models, as shown in the picture (see the block diagram). I would like to know what function could be useful to stack these models. It is worth mentioning that each output sequence is the input to the next block, i.e. the LSTM-1hr models are cascaded with each other, and the output block was trained separately in a supervised manner while freezing the weights of the input block. The secondary block is initialized with the weights from the basic 1hr model.
The image shows the block diagram of the models that I want to build.

Is it possible to customise tf.estimator for Unsupervised Learning with self-defined evaluation graph not having loss?

I am implementing an item2vec model, using the idea of word2vec, with the tf.estimator API for product recommendation.
There's no problem implementing the training part with tf.estimator. The process is the same as word2vec, and I treat each transaction as a sentence. The only difference is how to generate the training input: (target_item, context_item) pairs. After training the pseudo-classification problem, I can use the trained embedding vector of each item to measure the relationship between items.
The problem is the evaluation part: it is not a typical supervised-learning evaluation, i.e. one where the eval data goes through the same graph as the training data and we obtain predictions and accuracy.
The evaluation input data I would like to use, is in a totally different format from training input data.
Format of eval input data: (target_item, {context_item1, context_item2, ...}). With this, I could obtain the top_k nearest items for each context item and then check whether the target_item is in the collection of these nearest items, which gives me a hit ratio.
However, tf.estimator.EstimatorSpec() for mode = MODE.EVAL requires a loss as input. So, does it mean evaluation can only reuse part of the training graph? What could I do if I don't have a loss function for evaluation in my case, as the evaluation does not go through the classification anymore?
Many thanks.
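The hit-ratio check described in the question can be sketched outside the estimator, which sidesteps the loss requirement entirely. This is only a minimal illustration, assuming the trained embeddings have already been read out into a plain dict of item to vector; the item names, the toy vectors, and the helper names (`cosine`, `top_k_neighbours`, `hit_ratio`) are all hypothetical and not part of any tf.estimator API.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k_neighbours(item, embeddings, k):
    """Return the k items most similar to `item`, excluding the item itself."""
    query = embeddings[item]
    scored = [(cosine(query, vec), other)
              for other, vec in embeddings.items() if other != item]
    scored.sort(reverse=True)
    return [other for _, other in scored[:k]]


def hit_ratio(eval_records, embeddings, k):
    """Fraction of records whose target appears among the top-k
    neighbours of at least one of its context items."""
    hits = 0
    for target, context_items in eval_records:
        candidates = set()
        for ctx in context_items:
            candidates.update(top_k_neighbours(ctx, embeddings, k))
        if target in candidates:
            hits += 1
    return hits / len(eval_records) if eval_records else 0.0


# Toy embeddings (assumed values, for illustration only).
embeddings = {
    "milk": [1.0, 0.0],
    "bread": [0.9, 0.1],
    "butter": [0.8, 0.3],
    "hammer": [0.0, 1.0],
    "nails": [0.1, 0.9],
}

# Eval records in the (target_item, {context_items}) format from the question.
eval_records = [
    ("bread", {"milk"}),
    ("nails", {"hammer"}),
    ("hammer", {"milk"}),
]

ratio = hit_ratio(eval_records, embeddings, k=1)
print(f"hit ratio @1: {ratio:.3f}")
```

With real embeddings, the brute-force neighbour search would likely need to be replaced by a vectorized or approximate nearest-neighbour lookup; the metric itself stays the same.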

Using a pytorch model for inference

I am using the fastai library (fast.ai) to train an image classifier. The model created by fastai is actually a pytorch model.
type(model)
<class 'torch.nn.modules.container.Sequential'>
Now, I want to use this model from pytorch for inference. Here is my code so far:
torch.save(model,"./torch_model_v1")
the_model = torch.load("./torch_model_v1")
the_model.eval() # shows the entire network architecture
Based on the example shown here: http://pytorch.org/tutorials/beginner/data_loading_tutorial.html#sphx-glr-beginner-data-loading-tutorial-py, I understand that I need to write my own data loading class which will override some of the functions in the Dataset class. But what is not clear to me is the transformations that I need to apply at test time? In particular, how do I normalize the images at test time?
Another question: is my approach of saving and loading the model in pytorch fine? I read in the tutorial here: http://pytorch.org/docs/master/notes/serialization.html that the approach that I have used is not recommended. The reason is not clear though.
Just to clarify: the_model.eval() not only prints the architecture, but sets the model to evaluation mode.
In particular, how do I normalize the images at test time?
It depends on the model you have. For instance, for torchvision modules, you have to normalize the inputs this way.
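The snippet that "this way" pointed to is not preserved here. As a rough sketch of the arithmetic involved: torchvision's ImageNet-pretrained models expect inputs scaled to [0, 1] and then normalized per channel with the mean and standard deviation below. The dependency-free helper `normalize_pixel` is hypothetical and only illustrates the computation; in practice the same step is usually done with torchvision's `transforms.Normalize`, and the same statistics must be used at test time as were used during training.

```python
# Per-channel statistics documented for torchvision's ImageNet-pretrained models.
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb_uint8, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Scale an 8-bit RGB pixel to [0, 1], then apply (x - mean) / std per channel."""
    scaled = [c / 255.0 for c in rgb_uint8]
    return [(x - m) / s for x, m, s in zip(scaled, mean, std)]


# A pixel close to the dataset mean ends up near zero in every channel.
near_mean = normalize_pixel((124, 116, 104))
white = normalize_pixel((255, 255, 255))
print(near_mean, white)
```

If the model was trained with different statistics (for example, values computed from your own dataset), those should replace the constants above.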
Regarding how to save / load models, torch.save/torch.load "saves/loads an object to a disk file."
So, if you save the_model, it will save the entire model object, including its architecture definition and some other internal aspects. If you save the_model.state_dict(), it will save a dictionary containing the model state (i.e. parameters and buffers) only. Saving the whole model pickles it together with references to the exact classes and file layout used at save time, so it can break in various ways when the code is moved or refactored; the preferred method is therefore to save and load only the model state. However, I'm not sure if the fast.ai "model file" is actually a full model or the state of a model. You have to check this so you can correctly load it.
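A minimal sketch of the state-only approach, assuming PyTorch is installed. The small `nn.Sequential` returned by `build_model` is a hypothetical stand-in for the real architecture, which has to be rebuilt in code before the saved state can be loaded into it; the file name is likewise made up.

```python
import os
import tempfile

import torch
import torch.nn as nn


def build_model():
    # Hypothetical stand-in architecture; the real one must match the trained model.
    return nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))


model = build_model()

# Save only the parameters and buffers, not the pickled model object.
path = os.path.join(tempfile.mkdtemp(), "model_state.pt")
torch.save(model.state_dict(), path)

# To load: rebuild the same architecture, then restore the saved state into it.
restored = build_model()
restored.load_state_dict(torch.load(path))
restored.eval()  # switch dropout / batch norm layers to inference behaviour

# Both models should now produce identical outputs for the same input.
with torch.no_grad():
    x = torch.randn(3, 4)
    same = torch.allclose(model.eval()(x), restored(x))
print("outputs match:", same)
```

Because only tensors are stored, the file stays loadable as long as the rebuilt architecture has the same layer names and shapes.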
