Does training a TFLite model require annotated images? - python-3.x

I am trying to implement a TFLite model for food detection and segmentation. This is the model I chose as suitable for my food image dataset: [https://tfhub.dev/s?deployment-format=lite&q=inception%20resnet%20v2].
I searched Google to understand how the images need to be annotated, but only ended up confused. I understand the dataset is converted to TFRecords and then fed to the pretrained model. But doesn't training the model on a custom dataset require an annotation file? I don't see any information about this on TF Hub either.
Can anyone please help me with this?

The answer to your question depends on what model you plan to train.
In the case of a model for food detection and segmentation, you do need annotations when training. Since it is a supervised learning model, it cannot learn without labeled training data.
If you were to train an autoencoder, the data would not need to be annotated. I hope the keywords used in this answer help you search for more information about the topic.
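For illustration, here is a minimal sketch of how annotated data is typically supplied when training an object detection model for TFLite. It assumes TFLite Model Maker with an EfficientDet-Lite spec (a different route than the Inception ResNet v2 module linked above, shown purely to illustrate the annotation flow); the directory names and label map are hypothetical:

from tflite_model_maker import model_spec, object_detector

# Each image needs a matching Pascal-VOC-style XML file with bounding boxes and class labels.
train_data = object_detector.DataLoader.from_pascal_voc(
    images_dir='food_images/train',            # hypothetical path
    annotations_dir='food_annotations/train',  # hypothetical path
    label_map={1: 'rice', 2: 'salad'}          # hypothetical classes
)

spec = model_spec.get('efficientdet_lite0')
model = object_detector.create(train_data, model_spec=spec, epochs=20)
model.export(export_dir='.')  # produces a .tflite file

For segmentation rather than detection, the annotations would instead be per-pixel masks, but the principle is the same: every training image needs a labeled counterpart.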

Related

Hugging Face transformer: model bio_ClinicalBERT not trained for any of the tasks?

This may be the most beginner question of all :sweat:.
I just started learning about NLP and Hugging Face. The first thing I'm trying to do is apply one of the bioBERT models to some clinical note data and see what I get, before moving on to fine-tuning the model. It looks like "emilyalsentzer/Bio_ClinicalBERT" is the closest model for my data.
But whenever I try to use it for any of the analyses, I always get this warning:
Some weights of the model checkpoint at emilyalsentzer/Bio_ClinicalBERT were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.dense.bias', 'cls.seq_relationship.bias', 'cls.predictions.transform.dense.weight', 'cls.seq_relationship.weight', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight']
From the Hugging Face course, chapter 2, I understand what this means:
This is because BERT has not been pretrained on classifying pairs of sentences, so the head of the pretrained model has been discarded and a new head suitable for sequence classification has been added instead. The warnings indicate that some weights were not used (the ones corresponding to the dropped pretraining head) and that some others were randomly initialized (the ones for the new head). It concludes by encouraging you to train the model, which is exactly what we are going to do now.
So I went on to test which NLP tasks I can use "emilyalsentzer/Bio_ClinicalBERT" for out of the box.
from transformers import pipeline, AutoModel

checkpoint = "emilyalsentzer/Bio_ClinicalBERT"
nlp_task = ['conversational', 'feature-extraction', 'fill-mask', 'ner',
            'question-answering', 'sentiment-analysis', 'text-classification',
            'token-classification',
            'zero-shot-classification']

for task in nlp_task:
    print(task)
    process = pipeline(task=task, model=checkpoint)
And I got the same warning message for all of the NLP tasks, so it appears that I am advised not to use the model for any of them. This really confuses me. The original bio_ClinicalBERT paper stated that they had good results on a few different tasks, so the model was certainly trained for those tasks. I have a similar issue with other models as well, i.e. a blog post or research paper says a model obtained good results on a specific task, but when I try to apply it with pipeline it gives the warning message. Is there any reason why the head layers were not included in the model?
I only have a few hundred clinical notes (also unannotated :frowning_face:), so it doesn't look like that's big enough for training. Is there any way I could use the model on my data without training?
Thank you for your time.
This Bio_ClinicalBERT model is trained for the masked language modeling (MLM) task. This task is basically used for learning the semantic relations between tokens in the language/domain. For downstream tasks, you can fine-tune the model's head with your small dataset, or you can use an already fine-tuned model such as Bio_ClinicalBERT-finetuned-medicalcondition, which is a fine-tuned version of the same model. You can find all the fine-tuned models on Hugging Face by searching for 'bio-clinicalBERT'.
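For instance, here is a minimal sketch (my own, not from the answer) of using the checkpoint out of the box for the task it was actually pretrained on, fill-mask, plus feature extraction, which needs no task-specific head:

from transformers import pipeline

checkpoint = "emilyalsentzer/Bio_ClinicalBERT"

# Fill-mask works without fine-tuning because the MLM head ships with the checkpoint.
fill_mask = pipeline("fill-mask", model=checkpoint)
print(fill_mask("The patient was treated with [MASK] for the infection."))

# Feature extraction returns hidden states, usable as embeddings without any training.
features = pipeline("feature-extraction", model=checkpoint)
embeddings = features("Patient presents with chest pain and shortness of breath.")

Any classification-style task, on the other hand, attaches a randomly initialized head and does need fine-tuning, which is exactly what the warning is pointing out.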

PyTorch-based BERT NER for transfer learning/retraining

I trained a BERT-based NER model using the PyTorch framework by referring to the article below.
https://www.depends-on-the-definition.com/named-entity-recognition-with-bert/.
After training the model with this approach, I saved it using the torch.save() method. Now I want to retrain the model on a new dataset.
Can someone please help me with how to perform retraining/transfer learning, as I'm new to NLP and transformers?
Thanks in advance.
First, you can read the PyTorch documentation on loading models, which is very helpful for retraining a saved model on a new dataset. Below are the original docs and an example of loading a saved model.
Original doc: https://pytorch.org/tutorials/beginner/saving_loading_models.html
Example code: https://pythonguides.com/pytorch-load-model/
These two links are very helpful for training a saved model on a new dataset.
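A minimal sketch of the usual pattern, assuming the weights were saved with torch.save(model.state_dict(), ...) and the new dataset uses the same label set; the file name, label count, and new_dataloader are hypothetical placeholders:

import torch
from transformers import BertForTokenClassification

# Recreate the same architecture, then load the weights saved earlier with torch.save().
model = BertForTokenClassification.from_pretrained("bert-base-cased", num_labels=9)
model.load_state_dict(torch.load("ner_model.pt", map_location="cpu"))

# Continue training ("retraining") on the new data with a fresh optimizer.
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)
model.train()
for batch in new_dataloader:  # assumed to yield tokenized, labeled NER batches
    optimizer.zero_grad()
    loss = model(input_ids=batch["input_ids"],
                 attention_mask=batch["attention_mask"],
                 labels=batch["labels"]).loss
    loss.backward()
    optimizer.step()

If the new dataset has a different label set, the token classification head has to be rebuilt with the new number of labels, while the encoder weights can still be reused (transfer learning).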

Why are some weights of GPT2Model not initialized?

I am using the pre-trained GPT-2 model for a research project, and when I load it with the following code,
from transformers.models.gpt2.modeling_gpt2 import GPT2Model
gpt2 = GPT2Model.from_pretrained('gpt2')
I get the following warning message:
Some weights of GPT2Model were not initialized from the model checkpoint at gpt2 and are newly initialized: ['h.0.attn.masked_bias', 'h.1.attn.masked_bias', 'h.2.attn.masked_bias', 'h.3.attn.masked_bias', 'h.4.attn.masked_bias', 'h.5.attn.masked_bias', 'h.6.attn.masked_bias', 'h.7.attn.masked_bias', 'h.8.attn.masked_bias', 'h.9.attn.masked_bias', 'h.10.attn.masked_bias', 'h.11.attn.masked_bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
From my understanding, it says that the weights of the above layers were not initialized from the pre-trained model. But we all know that the attention layers ('attn') are very important in GPT-2, and if we cannot get their actual weights from the pre-trained model, then what is the point of using a pre-trained model?
I really appreciate it if someone could explain this to me and tell me how I can fix this.
The masked_bias was added by the Hugging Face community as a speed improvement compared to the original implementation. It should not negatively impact performance, as the original weights are loaded properly. Check this PR for further information.
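As a quick sanity check (my own sketch, not part of the answer), you can confirm that the learned attention weights do come from the checkpoint, while masked_bias is just a constant buffer; this assumes a transformers version that still registers that buffer, like the one emitting the warning above:

from transformers import GPT2Model

gpt2 = GPT2Model.from_pretrained('gpt2')

# The learned attention projection weights are loaded from the checkpoint.
print(gpt2.h[0].attn.c_attn.weight.shape)  # torch.Size([768, 2304])

# masked_bias is only a large negative constant used to mask future positions efficiently.
print(gpt2.h[0].attn.masked_bias)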

Is there a pretrained model that can detect and classify if a human is in a photo?

I am trying to find a pre-trained model that will classify images based on whether or not there is a human present in the photo.
You can use the models trained on the COCO dataset for this.
For example, for Pytorch you can have a look at the official documentation concerning the provided models here.
There is a wider variety of models available with a simple search, both for PyTorch and for other frameworks.
You can check out the COCO homepage if you need more information concerning the dataset and the tasks it supports.
You may also find these useful:
Detecting people using Yolo-OpenCV
Yolo object detection in pytorch
Another Yolo implementation in Pytorch
Similar question on ai.stackexchange
You can also utilize frameworks such as Detectron2 or MMDetection for these tasks (or TensorFlow's Object Detection API, etc.).
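For illustration, a minimal sketch using a COCO-pretrained detector from torchvision and keeping only 'person' detections; the image path and the 0.8 confidence threshold are placeholder choices:

import torch
from PIL import Image
from torchvision import transforms
from torchvision.models.detection import fasterrcnn_resnet50_fpn

# COCO-pretrained detector; in torchvision's COCO label map, class id 1 is 'person'.
model = fasterrcnn_resnet50_fpn(pretrained=True).eval()

img = Image.open("photo.jpg").convert("RGB")  # placeholder path
tensor = transforms.ToTensor()(img)

with torch.no_grad():
    pred = model([tensor])[0]

person_found = any(
    label == 1 and score > 0.8  # arbitrary confidence threshold
    for label, score in zip(pred["labels"].tolist(), pred["scores"].tolist())
)
print("Human detected" if person_found else "No human detected")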

BERT weight calculation

I am trying to understand how the BERT weights are calculated. Please suggest some articles that can help me understand the internal workings of BERT. I have read these articles on Medium:
https://towardsdatascience.com/deconstructing-bert-distilling-6-patterns-from-100-million-parameters-b49113672f77
https://towardsdatascience.com/deconstructing-bert-part-2-visualizing-the-inner-workings-of-attention-60a16d86b5c1
I am doing a small project to understand BERT pretraining and fine-tuning on different sources. My idea is to calculate the weights for each token on its own source and average all the weights to get a global model. This global model can then be fine-tuned on the different sources.
How can I find these weights, and how can I average them across multiple sources?
Can I visualise them? If so, how?
Also, note that I am using the TensorFlow version of the BERT implementation and planning to fine-tune it for the NER task.
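As a rough sketch of the averaging idea only (my own illustration, assuming the Hugging Face TensorFlow implementation; in practice every model must share exactly the same architecture, and the per-source checkpoints are hypothetical):

import numpy as np
from transformers import TFBertModel

# Hypothetical checkpoints trained/fine-tuned on two different sources.
model_a = TFBertModel.from_pretrained("bert-base-uncased")
model_b = TFBertModel.from_pretrained("bert-base-uncased")

# get_weights() returns one NumPy array per layer variable; average them element-wise.
avg_weights = [(wa + wb) / 2.0
               for wa, wb in zip(model_a.get_weights(), model_b.get_weights())]

# Build a "global" model and load the averaged weights into it.
global_model = TFBertModel.from_pretrained("bert-base-uncased")
global_model.set_weights(avg_weights)

The individual arrays can also be inspected (shapes, histograms, attention matrices) for visualisation.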
