StanfordNLP training iteration for CRF classifier - nlp

I know it's a simple question, but I just want to make sure: are all the samples from the training dataset used in each iteration of the CRF classifier?

Yes. During training, all training examples are used in each iteration.

Related

Accuracy of fine-tuning BERT varied significantly based on epochs for intent classification task

I used BERT base uncased for embeddings and simple cosine similarity for intent classification on my dataset (around 400 classes and 2200 utterances, train:test = 80:20). The base BERT model reaches about 60% accuracy on the test dataset, but fine-tuning for different numbers of epochs gave me quite unpredictable results.
This is my setting:
max_seq_length=150
train_batch_size=16
learning_rate=2e-5
These are my experiments:
base model accuracy=0.61
epochs=2.0 accuracy=0.30
epochs=5.0 accuracy=0.26
epochs=10.0 accuracy=0.15
epochs=50.0 accuracy=0.20
epochs=75.0 accuracy=0.92
epochs=100.0 accuracy=0.93
I don't understand why it behaved like this. I expected that fine-tuning for any number of epochs would not be worse than the base model, because I fine-tuned and ran inference on the same dataset. Is there anything I have misunderstood or should watch out for?
Well, generally you will not be able to feed in all the data in your training set at once (I am assuming you have a huge dataset, so you will have to use mini-batches). Hence, you split it into mini-batches. So the accuracy that is displayed is strongly influenced by the last mini-batch, or the last training step of the epoch.
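If the reported figure really does come from a single mini-batch, as this answer suggests, one way to check is to accumulate correct predictions over the whole held-out set and compare. The following is only a minimal sketch: the labels, predictions, and batch size are made-up toy values, not numbers from the question.

```python
def batches(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

def accuracy(predictions, labels):
    """Fraction of positions where the prediction equals the label."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

# Hypothetical toy data, chosen only to illustrate the effect.
labels      = [0, 1, 1, 0, 1, 0, 1, 1, 0, 0]
predictions = [0, 1, 0, 0, 1, 0, 1, 1, 1, 1]

# Accuracy of each mini-batch on its own (batch size 4 is an assumption).
per_batch = [
    accuracy(p, y)
    for p, y in zip(batches(predictions, 4), batches(labels, 4))
]

# Accuracy over the full set, which is the number worth comparing across epochs.
overall = accuracy(predictions, labels)

print(per_batch, overall)
```

In this toy case the last mini-batch on its own looks far worse than the overall figure, so comparing epochs by the final batch alone would be misleading.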

How to adopt multiple different loss functions in each steps of LSTM in Keras

I have a set of sentences and their scores. I would like to train a marking system that can predict the score for a given sentence. One example looks like this:
(X =Tomorrow is a good day, Y = 0.9)
I would like to use an LSTM to build such a marking system and also take into account the sequential relationship between the words in the sentence, so the training example shown above is transformed as follows:
(x1=Tomorrow, y1=is) (x2=is, y2=a) (x3=a, y3=good) (x4=day, y4=0.9)
When training this LSTM, I would like the first three time steps to use a softmax classifier and the final step to use MSE. The loss function used in this LSTM is therefore composed of two different loss functions. Keras does not seem to provide a way to address this directly. In addition, I am not sure whether my method of building the marking system is correct.
Keras supports multiple loss functions as well:
from keras.models import Model

# inputs, lang_model and sent_model are the input and output tensors defined earlier
model = Model(inputs=inputs,
              outputs=[lang_model, sent_model])
# one loss per output, in the same order as the outputs list
model.compile(optimizer='sgd',
              loss=['categorical_crossentropy', 'mse'],
              metrics=['accuracy'], loss_weights=[1., 1.])
Based on your explanation, I think you need a model that first predicts a token based on the previous tokens (in NLP this is usually called a language model) and then computes a score, which I assume is a sentiment score (the approach is applicable to other domains as well).
To do so, you can train your language model with an LSTM and use the last output of the LSTM for your ranking task. For this you need to define two loss functions: categorical_crossentropy for the language model and MSE for the ranking task.
This tutorial would be helpful: https://www.pyimagesearch.com/2018/06/04/keras-multiple-outputs-and-multiple-losses/
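To make the two-loss idea concrete, here is a small plain-Python sketch of how a total loss can be formed as a weighted sum of a cross-entropy term (next-token prediction) and an MSE term (sentence score), in the spirit of the loss_weights argument above. It is not Keras code: the function names, the toy probabilities, and the choice to average the cross-entropy over time steps are assumptions made for illustration.

```python
import math

def categorical_crossentropy(true_index, probs):
    """Negative log-probability assigned to the correct token."""
    return -math.log(probs[true_index])

def mse(y_true, y_pred):
    """Squared error for a single scalar score."""
    return (y_true - y_pred) ** 2

def combined_loss(step_probs, step_targets, score_pred, score_true,
                  loss_weights=(1.0, 1.0)):
    """Weighted sum of the language-model loss and the score loss."""
    # Average cross-entropy over the next-token prediction steps.
    lm_loss = sum(
        categorical_crossentropy(t, p)
        for t, p in zip(step_targets, step_probs)
    ) / len(step_targets)
    score_loss = mse(score_true, score_pred)
    w_lm, w_score = loss_weights
    return w_lm * lm_loss + w_score * score_loss

# Hypothetical softmax outputs over a 4-token vocabulary for three time steps.
step_probs = [
    [0.50, 0.20, 0.20, 0.10],
    [0.25, 0.25, 0.25, 0.25],
    [0.10, 0.20, 0.20, 0.50],
]
step_targets = [0, 1, 3]  # index of the correct next token at each step

total = combined_loss(step_probs, step_targets, score_pred=0.7, score_true=0.9)
print(total)
```

Changing loss_weights shifts the balance between the two objectives, which is worth tuning if one loss dominates the other in magnitude.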

Anomaly detection in Text Classification

I have built a text classifier using OneClassSVM.
I have a training set that corresponds to only one label ("Yes"), and I don't have data for the other label ("No"). My task is to build a classifier that labels a new, unseen sentence (test data) as 1 if it is very similar to the training data; otherwise it labels it as -1 (an anomaly).
I have used Word2Vec to build the word embeddings for my training data. Then I use word-vector averaging with OneClassSVM to build an anomaly detector.
This classifier currently gives an accuracy of about 50%-55%. I have to improve this further to build a robust classifier.
Any suggestions for this problem would be helpful.
I'd suggest a very different approach since you have no training examples for the negative class at all.
You could train a language model on your training data. At inference time, you score the input with the language model, and classify it according to some threshold on the perplexity of the input sentence according to the LM.
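As a rough illustration of the perplexity-threshold idea, the sketch below uses a tiny add-one-smoothed unigram model written in plain Python. The training sentences, the unigram simplification, and the rule for picking the threshold (1.5 times the worst training perplexity) are all assumptions for the example; in practice you would likely use an n-gram or neural language model and tune the threshold on held-out data.

```python
import math
from collections import Counter

def train_unigram(sentences):
    """Count tokens; reserve one extra vocabulary slot for unseen words."""
    counts = Counter(tok for s in sentences for tok in s.lower().split())
    total = sum(counts.values())
    vocab_size = len(counts) + 1
    return counts, total, vocab_size

def perplexity(sentence, model):
    """Per-token perplexity under the add-one-smoothed unigram model."""
    counts, total, vocab_size = model
    tokens = sentence.lower().split()
    if not tokens:
        return float("inf")
    log_prob = sum(
        math.log((counts[t] + 1) / (total + vocab_size)) for t in tokens
    )
    return math.exp(-log_prob / len(tokens))

def classify(sentence, model, threshold):
    """Return 1 if the sentence looks like the training data, else -1."""
    return 1 if perplexity(sentence, model) <= threshold else -1

# Hypothetical "Yes"-class training sentences.
train = [
    "the order was shipped today",
    "the order was delivered today",
    "the package was shipped yesterday",
]
model = train_unigram(train)

# Assumed heuristic: allow some slack above the worst training perplexity.
threshold = 1.5 * max(perplexity(s, model) for s in train)

print(classify("the order was shipped yesterday", model, threshold))
print(classify("purple elephants dance quietly", model, threshold))
```

A sentence made of familiar words scores a low perplexity and is accepted, while one made entirely of unseen words scores high and is flagged as an anomaly.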

How to continue training svm and knn in scikit-learn?

Training takes a lot of time. After training, is there a way to continue training and add samples when using nusvc() and nearestneighbor() in scikit-learn?
For the SVM, you might be able to use the online learning abilities of the SGDClassifier class. To do so, you would need to use the partial_fit() function.
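A minimal sketch of that suggestion follows, assuming toy two-dimensional data generated on the spot. Note that SGDClassifier with hinge loss is a linear SVM trained by stochastic gradient descent, so it is not a drop-in replacement for a kernel NuSVC; the data, seeds, and batch sizes here are illustrative only.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)

def make_batch(n):
    """Draw n points per class from two well-separated 2-D blobs (toy data)."""
    neg = rng.normal(loc=-3.0, scale=1.0, size=(n, 2))
    pos = rng.normal(loc=3.0, scale=1.0, size=(n, 2))
    X = np.vstack([neg, pos])
    y = np.array([0] * n + [1] * n)
    order = rng.permutation(len(y))
    return X[order], y[order]

# Hinge loss makes this a linear SVM trained incrementally.
clf = SGDClassifier(loss="hinge", random_state=0)

# First call: the full set of class labels must be given up front.
X_first, y_first = make_batch(100)
clf.partial_fit(X_first, y_first, classes=np.array([0, 1]))

# Later: continue training on newly arrived samples without starting over.
X_new, y_new = make_batch(100)
clf.partial_fit(X_new, y_new)

X_test, y_test = make_batch(50)
accuracy = clf.score(X_test, y_test)
print(accuracy)
```

For nearest neighbours, scikit-learn's k-NN estimators do not offer partial_fit, so adding samples there means calling fit again on the combined data.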

How to obtain the importance of each variable for an SVM after classification?

I have two classes and several variables. After training, the SVM gives me good accuracy when predicting the classes of the test data. Does anybody know how I can find out which of my variables are less important for the predictions made by the SVM? I am fairly new to SVMs and am only familiar with the console interface and the MATLAB interface. Is there an option to obtain the importance of the variables after the training or prediction phase?
