How to post-train a BERT model on a custom dataset

I want to get BERT word embeddings that will be used in another downstream task later. I have a corpus for my custom dataset and want to further pre-train the pre-trained Hugging Face BERT base model on it. I think this is called post-training. How can I do this using Hugging Face transformers? Can I use transformers.BertForMaskedLM?
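A minimal sketch of what continued pre-training with transformers.BertForMaskedLM might look like, assuming a plain-text corpus with one document per line (the file name, output directory, and hyperparameters are placeholders, not from the question):

```python
# Rough sketch of continued (domain-adaptive) pre-training with the MLM objective.
from transformers import (
    BertTokenizerFast,
    BertForMaskedLM,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from datasets import load_dataset

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

# One document/sentence per line in a plain-text file (placeholder path).
dataset = load_dataset("text", data_files={"train": "my_corpus.txt"})["train"]

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

# The collator randomly masks 15% of the tokens for the MLM objective.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=True, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="bert-domain-adapted",   # placeholder output directory
    num_train_epochs=3,
    per_device_train_batch_size=16,
    save_steps=10_000,
)

Trainer(model=model, args=args, train_dataset=tokenized, data_collator=collator).train()
model.save_pretrained("bert-domain-adapted")
tokenizer.save_pretrained("bert-domain-adapted")
```

Afterwards the saved checkpoint can be loaded as a plain encoder (AutoModel) to extract embeddings for the downstream task.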

Related

How to convert the output of a pre-trained Hugging Face transformer model from classification to regression for fine-tuning on my data?

I am using a transformer model that was extended from a Hugging Face model (DNABERT). It is a pre-trained classification model whose output I would like to convert to regression, and then fine-tune that model on my own data. I imagine this process would be roughly the same for any BERT-based Hugging Face classification model. How would I go about doing this?
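A minimal sketch of one way this is often done with AutoModelForSequenceClassification, assuming the checkpoint name below is replaced with your actual DNABERT checkpoint (the path shown is a placeholder):

```python
# Rough sketch: re-use the pre-trained encoder but swap the classification
# head for a single-output regression head.
from transformers import AutoTokenizer, AutoModelForSequenceClassification

checkpoint = "path/or/hub-id/of/dnabert-checkpoint"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

# num_labels=1 with problem_type="regression" makes the head output a single
# value and use MSE loss; ignore_mismatched_sizes drops the old classification
# head whose shape no longer matches.
model = AutoModelForSequenceClassification.from_pretrained(
    checkpoint,
    num_labels=1,
    problem_type="regression",
    ignore_mismatched_sizes=True,
)

inputs = tokenizer("ACGTACGTACGT", return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.shape)  # (1, 1): a single regression value per sequence
```

From there you fine-tune as usual, passing float labels so the regression loss is used.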

How to use pre-trained fastText embeddings with an existing Seq2Seq model?

I'm new to NLP and I am trying to understand how to use pre-trained word embeddings such as fastText with an existing Seq2Seq model. The Seq2Seq model I'm working with is the following: the encoder is simple and the decoder is a Pointer-Generator Network with a CRF on top. Both of them use an embedding layer.
The question: if I have my own dataset and vocab, how do I use both my own vocab and the one from fastText? Do I have to use the fastText weights in both the encoder and the decoder?
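A minimal sketch of one common approach in PyTorch, assuming your own vocabulary list and a fastText .vec file (both placeholders here): build the embedding matrix over your vocab, copy fastText vectors for the words fastText knows, and randomly initialize the rest. The same matrix can initialize both the encoder's and the decoder's embedding layers, or a single shared layer.

```python
# Rough sketch: initialize an embedding layer over *your own* vocab with
# fastText vectors where available.
import numpy as np
import torch
import torch.nn as nn
from gensim.models import KeyedVectors

my_vocab = ["<pad>", "<unk>", "the", "network", "pointer"]   # your vocab (placeholder)
ft = KeyedVectors.load_word2vec_format("cc.en.300.vec")       # fastText .vec file (placeholder)

dim = ft.vector_size
matrix = np.random.normal(scale=0.1, size=(len(my_vocab), dim)).astype("float32")
for idx, word in enumerate(my_vocab):
    if word in ft:                 # keep the fastText weights where available
        matrix[idx] = ft[word]

embedding = nn.Embedding.from_pretrained(
    torch.from_numpy(matrix),
    freeze=False,                  # allow fine-tuning on your task
    padding_idx=my_vocab.index("<pad>"),
)
# encoder.embedding = embedding; decoder.embedding = embedding  # share or copy as needed
```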

Backpropagation in BERT

I would like to know: when people say "pre-trained BERT model", is it only the final classification neural network that is trained, or are the weights inside the transformer also updated through backpropagation along with the classification network?
During pre-training, the entire model is trained (all weights are updated). Moreover, BERT is pre-trained on the masked language model objective, not a classification objective.
In pre-training, you usually train a model on a huge amount of generic data. It then has to be fine-tuned with task-specific data and a task-specific objective.
So, if your task is classification on a dataset X, you fine-tune BERT accordingly by adding a task-specific layer (a classification layer; in BERT, a dense layer over the [CLS] token). During fine-tuning, you update the pre-trained model weights as well as the new task-specific layer.
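A small sketch that illustrates this with BertForSequenceClassification: after one backward pass, gradients exist for the pre-trained encoder weights as well as for the newly added classifier head (the input text and label are just illustrative):

```python
# Illustration: in standard fine-tuning, both the pre-trained encoder and the
# new classification head receive gradients and are updated.
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

inputs = tokenizer("fine-tuning updates the whole model", return_tensors="pt")
loss = model(**inputs, labels=torch.tensor([1])).loss
loss.backward()

# Gradients exist for encoder weights, not just the classifier head:
print(model.bert.encoder.layer[0].attention.self.query.weight.grad is not None)  # True
print(model.classifier.weight.grad is not None)                                  # True

# (To train only the head instead, you would freeze the encoder:
#  for p in model.bert.parameters(): p.requires_grad = False)
```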

Unsupervised fine-tuning of BERT for embeddings only?

I would like to fine-tune BERT for a specific domain on unlabeled data and use the output layer to check the similarity between texts. How can I do it? Do I need to first fine-tune on a classification task (or question answering, etc.) and then get the embeddings? Or can I just take a pre-trained BERT model without a task head and fine-tune it with my own data?
There is no need to fine-tune for classification, especially if you do not have any supervised classification dataset.
You should continue training BERT the same unsupervised way it was originally trained, i.e., continue "pre-training" using the masked-language-model objective and next sentence prediction. Hugging Face's implementation contains the class BertForPreTraining for this.
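Once the continued pre-training is done (with BertForMaskedLM or BertForPreTraining), a minimal sketch for the similarity check could look like the following; the checkpoint directory is a placeholder, and mean pooling over the last hidden states is just one common choice:

```python
# Rough sketch: load the domain-adapted checkpoint as a plain encoder and
# compare texts by cosine similarity of mean-pooled token embeddings.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-domain-adapted")  # placeholder path
model = AutoModel.from_pretrained("bert-domain-adapted")
model.eval()

def embed(text):
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state       # (1, seq_len, hidden)
    mask = inputs["attention_mask"].unsqueeze(-1)         # ignore padding positions
    return (hidden * mask).sum(1) / mask.sum(1)           # mean pooling

a, b = embed("first domain sentence"), embed("second domain sentence")
print(torch.nn.functional.cosine_similarity(a, b).item())
```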

Use pre-trained models to further train on the current corpus

Is it possible to leverage a pre-trained embedding model, e.g. GloVe, and use it to further train on a corpus?
Any example would be very helpful.
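A minimal sketch with gensim, assuming a tokenized corpus and a GloVe text file (both placeholders): seed a Word2Vec model's vocabulary with the GloVe vectors and then continue training on your corpus. Note that gensim then optimizes the word2vec objective rather than GloVe's original objective.

```python
# Rough sketch: start from GloVe vectors, continue training on your own corpus.
from gensim.models import Word2Vec, KeyedVectors

corpus = [["my", "domain", "specific", "sentence"],
          ["another", "tokenized", "sentence"]]           # placeholder corpus

# GloVe .txt files have no header line; gensim >= 4.0 can read them with no_header=True.
glove = KeyedVectors.load_word2vec_format("glove.6B.100d.txt", binary=False, no_header=True)

model = Word2Vec(vector_size=glove.vector_size, min_count=1)
model.build_vocab(corpus)

# Copy GloVe vectors for every word that both vocabularies share.
for word, idx in model.wv.key_to_index.items():
    if word in glove:
        model.wv.vectors[idx] = glove[word]

model.train(corpus, total_examples=model.corpus_count, epochs=5)
print(model.wv.most_similar("sentence", topn=3))
```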
