Understand the difference between pyspark.ml.regression 'IsotonicRegression' vs 'IsotonicRegressionModel' - apache-spark

I am trying to calibrate the output of a PySpark GradientBoostingClassifier model to probabilities and want to try this option.
I have run an IsotonicRegression like this:
from pyspark.ml.regression import IsotonicRegression, IsotonicRegressionModel
model = IsotonicRegression().fit(train_data)
predictions_train=model.transform(test_data)
But I am unable to perform fit using IsotonicRegressionModel because when I try this:
irm = IsotonicRegressionModel()
model_irm =irm.fit(train_data)
I'm getting the following error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[70], line 4
1 # Trains an isotonic regression model.
2 irm = IsotonicRegressionModel()
----> 3 model_irm=irm.fit(train_data)
AttributeError: 'IsotonicRegressionModel' object has no attribute 'fit'
I would like to run the second option to identify the difference between IsotonicRegression and IsotonicRegressionModel.
Thanks in advance if anyone can help me understand this difference.
I'm using Spark version 3.1.3.
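For what it's worth, the error is consistent with the estimator/model split in Spark ML: IsotonicRegression is the estimator that exposes fit(), and the object that fit() returns is an IsotonicRegressionModel, which exposes transform() but not fit(). Below is a minimal pure-Python sketch of that pattern. ToyIsotonicRegression and ToyIsotonicRegressionModel are hypothetical stand-ins, not Spark classes, and the pool-adjacent-violators fit is a simplification; Spark's real model differs in details such as how it predicts between boundaries.

```python
from bisect import bisect_right


class ToyIsotonicRegressionModel:
    """Hypothetical stand-in for a fitted model: holds results, has no fit()."""

    def __init__(self, boundaries, predictions):
        self.boundaries = boundaries    # sorted feature values seen in training
        self.predictions = predictions  # non-decreasing fitted values

    def transform(self, xs):
        # Step-function lookup: use the fitted value of the nearest
        # boundary at or below x (an assumption made for this sketch).
        out = []
        for x in xs:
            i = max(bisect_right(self.boundaries, x) - 1, 0)
            out.append(self.predictions[i])
        return out


class ToyIsotonicRegression:
    """Hypothetical stand-in for the estimator: fit() returns a model."""

    def fit(self, points):
        # points: iterable of (feature, label) pairs.
        pts = sorted(points)
        blocks = []  # each block is [sum_of_labels, count]
        for _, y in pts:
            blocks.append([y, 1])
            # Pool adjacent violators: merge while block means decrease.
            while (len(blocks) > 1 and
                   blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
                s, n = blocks.pop()
                blocks[-1][0] += s
                blocks[-1][1] += n
        fitted = []
        for s, n in blocks:
            fitted.extend([s / n] * n)
        return ToyIsotonicRegressionModel([x for x, _ in pts], fitted)


model = ToyIsotonicRegression().fit(
    [(0.1, 0.0), (0.2, 1.0), (0.3, 0.0), (0.4, 1.0)])
print(type(model).__name__)   # ToyIsotonicRegressionModel
print(hasattr(model, "fit"))  # False
print(model.transform([0.25, 0.4]))
```

Applied to the snippet in the question, this would mean that model = IsotonicRegression().fit(train_data) already is the IsotonicRegressionModel, so there is no separate second way to fit.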

Related

NameError: name 'KE' is not defined

I am following this tutorial: https://blog.paperspace.com/mask-r-cnn-in-tensorflow-2-0/ in order to train a custom dataset for object detection. When I run the code for training (under paragraph: "Train Mask R-CNN in TensorFlow 1.0"), I get this error on colab:
NameError Traceback (most recent call last)
<ipython-input-31-794112aa6465> in <module>()
6 import mrcnn.config
7
----> 8 import mrcnn.model
9
10 class KangarooDataset(mrcnn.utils.Dataset):
/content/drive/MyDrive/How_to_Train_an_Object_Detection_Model_with_Keras/Mask_RCNN/mrcnn/model.py in <module>()
255
256
--> 257 class ProposalLayer(KE.Layer):
258 """Receives anchor scores and selects a subset to pass as proposals
259 to the second stage. Filtering is done based on anchor scores and
NameError: name 'KE' is not defined
After searching, I checked that the Mask R-CNN install was OK using the solution suggested at the end of this question: Import Matterport's Mask-RCNN model from github - error:ZipImportError: bad local file header. I also found this one: NameError: name 'K' is not defined, so I tried this command:
from keras import backend as KE
(using KE instead of K), but it didn't work.
Do you have any idea how to fix this error?
OK, I tried this GitHub repository instead of the original Mask R-CNN: https://github.com/akTwelve/Mask_RCNN with the latest TensorFlow (2.7.0) and Keras (2.7.0) installed on Colab. It seems to overcome the problem I described above, though I do not know why.

Runtime error during execution of CNN model for image recognition with fastai library

I am training a cnn model to recognise images. However, I get an error when running this code:
from fastai.vision.all import *
path = untar_data(URLs.PETS)/'images'
def is_cat(x): return x[0].isupper()
dls = ImageDataLoaders.from_name_func(
path, get_image_files(path), valid_pct=0.2, seed=42,
label_func=is_cat, item_tfms=Resize(224))
learn = cnn_learner(dls, resnet34, metrics=error_rate)
learn.fine_tune(1)
error:
During handling of the above exception, another exception occurred:
RuntimeError Traceback (most recent call last)
in
----> 1 learn.fine_tune(1)
RuntimeError: DataLoader worker (pid(s) 12456, 4440, 3268, 448) exited unexpectedly
The error happens at the last line (the full error was longer, but SO does not let me submit all of it).
I am not running on a GPU (as suggested online) because I haven't really worked out how to tell Jupyter Notebook to do that.
Can you help?
Thanks, Luigi
You can add num_workers=0.
Example:
ImageDataLoaders.from_name_func(path, files, label_func, item_tfms=Resize(224), num_workers=0)

Not able to use fastai's pretrained_model=URLs.WT103

Trying to use fastai's language_model_learner:
learn = language_model_learner(data_lm, pretrained_model=URLs.WT103, drop_mult=0.7)
Error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-17-811dec5cedeb> in <module>
----> 1 learn = language_model_learner(data_lm,
pretrained_model=URLs.WT103, drop_mult=0.7)
AttributeError: type object 'URLs' has no attribute 'WT103'
Try dropping the pretrained_model parameter, like this:
learn = language_model_learner(data_lm, arch = AWD_LSTM, pretrained = True, drop_mult=0.7)
It works fine for me.
I faced a similar issue while trying to fine-tune the pretrained language model today. It looks like they have changed the data link, and instead of URLs.WT103 you can use URLs.WT103_FWD or URLs.WT103_BWD.
Also set the 'arch' parameter to AWD_LSTM and pretrained to True, which will by default use the pretrained WT103_FWD weights.
It seems the API has changed. Try
learn = language_model_learner(data_lm, AWD_LSTM, drop_mult=0.7)
as suggested in the official guide.
More details on language_model_learner() are in the fastai documentation.

tf.contrib.metrics.f1_score can not be imported

I'm trying to calculate the F1 score using tf.contrib.metrics.f1_score, but it gives me an error. I know how to calculate it using precision and recall, but I want to use this function.
I have tried it on Ubuntu 16.04 LTS with TensorFlow version 1.9.0, both with and without GPU support:
from tensorflow.contrib.metrics import f1_score as ms
I get this error:
ImportError Traceback (most recent call last)
<ipython-input-6-627f14191ea2> in <module>()
----> 1 from tensorflow.contrib.metrics import f1_score as ms
ImportError: cannot import name 'f1_score'
And when I try:
from tensorflow.contrib import metrics as ms
ms.f1_score
I get this error:
AttributeError Traceback (most recent call last)
<ipython-input-8-c19f57465581> in <module>()
1 from tensorflow.contrib import metrics as ms
----> 2 ms.f1_score
AttributeError: module 'tensorflow.contrib.metrics' has no attribute 'f1_score'
I expected ms.f1_score to load.
If you are sure that tf.contrib is available and this still doesn't work, you may need to reinstall TensorFlow with pip install -U tensorflow (or pip install -U tensorflow-gpu if you are using the GPU version).
If that fails, go to the directory where TensorFlow is installed and manually check whether it is available. If it is, make sure that you don't have a file named tensorflow.py or tf.py in the same directory as your script (the current working directory).
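The check for a shadowing file can be done by hand, but here is a small sketch of it using only the standard library. find_shadowing_files is a hypothetical helper name, and the module names it looks for are just the two mentioned above.

```python
import os
import tempfile


def find_shadowing_files(directory, module_names=("tensorflow", "tf")):
    """Return paths of .py files in `directory` that would shadow the given modules."""
    hits = []
    for name in module_names:
        candidate = os.path.join(directory, name + ".py")
        if os.path.isfile(candidate):
            hits.append(candidate)
    return hits


# Demo in a throwaway directory so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "tensorflow.py"), "w").close()
    demo_hits = [os.path.basename(p) for p in find_shadowing_files(tmp)]

print(demo_hits)  # ['tensorflow.py']
```

In practice you would call find_shadowing_files(os.getcwd()) from the directory where the import fails; an empty list means no such file is present there.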
After that you should get
Update: As pointed out by user @grwlf:
Since TensorFlow 2.0, tf.contrib modules were moved to the Addons repo. See github.com/tensorflow/addons. There, the F1 measure is available as F1Score:
from tensorflow_addons.metrics import F1Score
You can find the documentation of f1_score here
Since it is a function, maybe you can try out:
from tensorflow.contrib import metrics as ms
ms.f1_score(labels,predictions)
This returns a scalar tensor holding the best F1 score across different thresholds.
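To make "best F1 across thresholds" concrete, here is a rough pure-Python sketch of the idea. It is not TensorFlow's implementation: the evenly spaced threshold grid, the default num_thresholds value, and the >= comparison are assumptions made for illustration, and f1_at_threshold and best_f1 are hypothetical names.

```python
def f1_at_threshold(labels, scores, threshold):
    """F1 for binary labels when scores >= threshold count as positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    if tp == 0:
        return 0.0  # precision or recall is zero (or undefined)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_f1(labels, scores, num_thresholds=200):
    """Maximum F1 over an evenly spaced grid of thresholds in [0, 1]."""
    thresholds = [i / (num_thresholds - 1) for i in range(num_thresholds)]
    return max(f1_at_threshold(labels, scores, t) for t in thresholds)


labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
best = best_f1(labels, scores)
print(round(best, 3))  # 0.8
```

Here the best threshold lies between 0.1 and 0.35: both positives are caught at the cost of one false positive, giving precision 2/3, recall 1, and F1 of 0.8.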
Example from tensorflow docs:
def model_fn(features, labels, mode):
    predictions = make_predictions(features)
    loss = make_loss(predictions, labels)
    train_op = tf.contrib.training.create_train_op(
        total_loss=loss, optimizer='Adam')
    eval_metric_ops = {'f1': f1_score(labels, predictions)}
    return tf.estimator.EstimatorSpec(
        mode=mode, predictions=predictions, loss=loss, train_op=train_op,
        eval_metric_ops=eval_metric_ops, export_outputs=export_outputs)

estimator = tf.estimator.Estimator(model_fn=model_fn)
Hope this answers your question.

PySpark - Word2Vec load model, can't use findSynonyms to get words

I have trained a Word2Vec model with PySpark and saved it. When I load the model, the .findSynonyms method does not work.
model = word2vec.fit(text)
model.save(sc, 'w2v_model')
new_model = Word2VecModel.load(sc, 'w2v_model')
new_model.findSynonyms('word', 4)
Getting the following error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/spark/python/pyspark/mllib/feature.py", line 487, in findSynonyms
words, similarity = self.call("findSynonyms", word, num)
ValueError: too many values to unpack
I found the following, but I'm not sure how the issue was fixed: https://issues.apache.org/jira/browse/SPARK-12016
Please let me know if there are any workarounds!
Many thanks.
It looks like this is fixed in 1.6.1 but not in 1.5.2.
The error is not about findSynonyms but about Word2VecModel.load.
I checked that it works on 1.6.1: there is no error when loading the model and calling the findSynonyms method.
I guess it has not been fixed in v1.5.2 yet.
