fastText Supervised Tutorial: Precision Very Low

Hi, I am following the tutorial for the supervised module of fastText at
https://fasttext.cc/docs/en/supervised-tutorial.html
In the tutorial we get the below results:
model = fasttext.train_supervised(input="cooking.train", lr=0.5, epoch=25, wordNgrams=2, bucket=200000, dim=50, loss='ova')
model.test("cooking.valid", k=-1)
(3000L, 0.702, 0.2)
Here the precision is 0.702 and the recall is 0.2.
But when I run the same tutorial on my machine, I get the results below:
(3000, 0.003146031746031746, 1.0)
Has something changed in a newer version of fastText that is not documented yet? I am using fastText version 0.9.2.

Related

How do I convert this code from Tensorflow version 1 to version 2?

I am trying to run code on the MNIST dataset that was originally written for TensorFlow version 1, but I keep getting errors because some commands are not compatible with TensorFlow version 2, which is what I am currently running. My question is: how do I rewrite this code for TensorFlow version 2 so that I get the same output as below?
Some of the options I have tried were installing Keras and an older version of TensorFlow, and I also tried different commands to download the MNIST dataset.
This is one of the things I tried, but it didn't work:
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
The TensorFlow official site has a detailed article on this migration.
In TensorFlow 2.x, you can get most of the above operations done with minimal code, using built-in APIs for each task:
importing the built-in TensorFlow dataset (tf.keras.datasets),
building the model from predefined layers (tf.keras.layers),
compiling the model with the built-in losses (BinaryCrossentropy, SparseCategoricalCrossentropy, etc.), optimizers (adam, sgd, etc.), and the available accuracy metrics.
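As a minimal sketch of those steps (the layer sizes and epoch count here are illustrative, not taken from the original TF 1.x code):

import tensorflow as tf

# Built-in dataset API
(train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.mnist.load_data()
train_images, test_images = train_images / 255.0, test_images / 255.0

# Model built from predefined layers
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10)
])

# Compilation with a built-in loss, optimizer and metric
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

model.fit(train_images, train_labels, epochs=5)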
Please have a look at this link for the same model in TensorFlow 2.x, where you will get a detailed understanding of each operation; just replace the fashion_mnist dataset with the mnist dataset.
Also, you can see a significant improvement in the accuracy of the model using TF 2.x, as shown here:
test_loss, test_acc = model.evaluate(test_images, test_labels, verbose=2)
print('\nTest accuracy:', test_acc)
Output (attaching the gist of the same code in TF 2.x as a reference):
313/313 - 1s - loss: 0.0922 - accuracy: 0.9779 - 646ms/epoch - 2ms/step
Test accuracy: 0.9779000282287598
You can use this migration guide for help converting other similar APIs from TF 1.x code to TF 2.x.

FastText 0.9.2 - why is recall 'nan'?

I trained a supervised model in FastText using the Python interface and I'm getting weird results for precision and recall.
First, I trained a model:
model = fasttext.train_supervised("train.txt", wordNgrams=3, epoch=100, pretrainedVectors=pretrained_model)
Then I get results for the test data:
def print_results(N, p, r):
    print("N\t" + str(N))
    print("P#{}\t{:.3f}".format(1, p))
    print("R#{}\t{:.3f}".format(1, r))
print_results(*model.test('test.txt'))
But the results are always odd, because they show P#1 and R#1 as identical, even for different datasets; e.g. one output is:
N 46425
P#1 0.917
R#1 0.917
Then when I look at the precision and recall for each label, I always get recall as 'nan':
print(model.test_label('test.txt'))
And the output is:
{'__label__1': {'precision': 0.9202150724134941, 'recall': nan, 'f1score': 1.8404301448269882}, '__label__5': {'precision': 0.9134956983264135, 'recall': nan, 'f1score': 1.826991396652827}}
Does anyone know why this might be happening?
P.S.: To try a reproducible example of this behavior, please refer to https://github.com/facebookresearch/fastText/issues/1072 and run it with FastText 0.9.2
It looks like FastText 0.9.2 has a bug in the computation of recall, and that should be fixed with this commit. (The identical P#1 and R#1 values are expected, by the way, whenever every test example has exactly one true label: at k=1 the number of predicted labels equals the number of true labels, so precision and recall coincide.)
Installing a "bleeding edge" version of FastText, e.g. with
pip install git+https://github.com/facebookresearch/fastText.git@b64e359d5485dda4b4b5074494155d18e25c8d13 --quiet
and rerunning your code should allow you to get rid of the nan values in the recall computation.

SVM qp solver in sklearn

I am studying SVMs and will implement an SVM using Python's sklearn.svm.SVC.
As I know, the SVM problem can be represented as a QP (Quadratic Programming) problem,
so I was wondering which QP solver is used to solve the SVM QP problem in sklearn svm.
I think it may be SMO or a coordinate descent algorithm.
Please let me know which exact algorithm is used in sklearn svm.
Off-the-shelf QP solvers have been used in the past, but for many years now dedicated code has been used instead (much faster and more robust). Those solvers are not (general) QP solvers anymore and are built just for this one use case.
sklearn's SVC is a wrapper for libsvm (proof).
As the link says:
Since version 2.8, it implements an SMO-type algorithm proposed in this paper:
R.-E. Fan, P.-H. Chen, and C.-J. Lin. Working set selection using second order information for training SVM. Journal of Machine Learning Research 6, 1889-1918, 2005.
(link to paper)
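If you want to see that solver at work yourself, here is a minimal sketch (the synthetic dataset is purely illustrative): passing verbose=True surfaces libsvm's own optimization log, including the iteration count of its SMO-type solver.

from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Synthetic data, just to have something to fit
X, y = make_classification(n_samples=200, random_state=0)

# verbose=True passes through libsvm's solver output
clf = SVC(kernel='rbf', verbose=True)
clf.fit(X, y)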

KNearestNeighbour is not running in multithread in Scikit-Learn

I'm successfully using scikit-learn on my machine. I'm experimenting with an Anaconda implementation (that relies on MKL for multithreading) and an OpenBLAS implementation.
I'd really like to use a parallel version of the k-nearest neighbour classifier, and according to https://github.com/scikit-learn/scikit-learn/pull/4009, sklearn should have merged these changes a year ago, in version 0.17.
Multithreading works successfully for PCA and all numpy operations; I can tell it is working from the high number of threads I see when I do dot products and PCA. But when I launch KNN, it takes around 10 minutes.
I'm classifying a high-dimensional dataset of MNIST (image digits). So I'm doing PCA to get vectors of dimension 35-50, and then a nonlinear expansion, which gives me vectors of dimension 600-100. That's why I need parallelism so badly.
My version of sklearn is:
print('The scikit-learn version is {}.'.format(sklearn.__version__))
The scikit-learn version is 0.18.1.
I'm using python3 and this is a sample of the code:
from sklearn.neighbors import KNeighborsClassifier

def classify_knn(train, test, train_labels):
    clf = KNeighborsClassifier(algorithm='ball_tree')
    clf = clf.fit(train, train_labels)
    return clf.predict(test)
I've tried with and without 'ball_tree'. No one should be using Python 2.7 in 2017, and neither do I.
Just passing n_jobs=-1 as a parameter solved the issue.
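Applied to the snippet from the question, that looks like this (a minimal sketch):

from sklearn.neighbors import KNeighborsClassifier

def classify_knn(train, test, train_labels):
    # n_jobs=-1 spreads the neighbour search over all available CPU cores
    clf = KNeighborsClassifier(algorithm='ball_tree', n_jobs=-1)
    clf = clf.fit(train, train_labels)
    return clf.predict(test)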

Caffe vs Theano MNIST example

I'm trying to learn (and compare) different deep learning frameworks; at the moment they are Caffe and Theano.
http://caffe.berkeleyvision.org/gathered/examples/mnist.html
and
http://deeplearning.net/tutorial/lenet.html
I followed the tutorials to run those frameworks on the MNIST dataset. However, I notice quite a difference in terms of accuracy and performance.
For Caffe, it's extremely fast for the accuracy to build up to ~97%. In fact, it only takes 5 minutes to finish the program (using a GPU), with a final accuracy on the test set of over 99%. How impressive!
On Theano, however, it is much poorer. It took me more than 46 minutes (using the same GPU) just to achieve 92% test performance.
I'm confused, as there should not be so much difference between frameworks running relatively similar architectures on the same dataset.
So my question is: is the accuracy number reported by Caffe the percentage of correct predictions on the test set? If so, is there any explanation for the discrepancy?
Thanks.
The examples for Theano and Caffe are not exactly the same network. Two key differences which I can think of are that the Theano example uses sigmoid/tanh activation functions, while the Caffe tutorial uses the ReLU activation function, and that the Theano code uses normal minibatch gradient descent while Caffe uses a momentum optimiser. Both differences will significantly affect the training time of your network. And using the ReLU unit will likely also affect the accuracy.
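To make the optimiser difference concrete, here is a sketch of the two update rules in plain numpy (the learning-rate and momentum values are illustrative, not taken from either tutorial):

import numpy as np

lr, mu = 0.01, 0.9  # illustrative learning rate and momentum coefficient

def sgd_step(w, grad):
    # plain minibatch gradient descent, as in the Theano tutorial
    return w - lr * grad

def momentum_step(w, v, grad):
    # momentum update, as in Caffe's solver: the velocity v accumulates
    # past gradients, which typically speeds up convergence considerably
    v = mu * v - lr * grad
    return w + v, v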
Note that Caffe is a deep learning framework which already has ready-to-use functions for many commonly used things like the momentum optimiser. Theano, on the other hand, is a symbolic maths library which can be used to build neural networks. However, it is not a deep learning framework.
The Theano tutorial you mentioned is an excellent resource to understand how exactly convolutional and other neural networks work on a basic level. However, it will be cumbersome to implement all the state-of-the-art tweaks. If you want to get state-of-the-art results quickly you are better off using one of the existing deep learning frameworks. Apart from Caffe, there are a number of frameworks based on Theano. I know of keras, blocks, pylearn2, and my personal favourite lasagne.

Resources