Cannot import sparkdl into databricks notebook - apache-spark

As the title says, I cannot import the sparkdl module in Databricks.
I followed the official sparkdl tutorial step by step, but when I run some simple code like:
from sparkdl import readImages
I get an error that says I cannot import readImages from sparkdl. The same happens for other classes such as DeepImageFeaturizer.
I checked, and it seems there is some sort of conflict between the versions of keras, tensorflow, sparkdl, h5py and the Maven library spark-deep-learning, so I'm looking for a stable combination between them.
Note that on Databricks I can only install tensorflow 2.5.0 or newer.
With the following versions:
tensorflow 2.5.0, keras 2.2.4, spark-deep-learning 1.4.0, h5py 3.7.0
I got the following error while importing DeepImageFeaturizer:
from sparkdl import DeepImageFeaturizer
ImportError: cannot import name 'resnet50' from 'keras.applications' (/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/keras/applications/__init__.py)
Any help or suggestion would be really appreciated.
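When chasing a version conflict like this one, it can help to print what the cluster actually has installed before trying another combination. The following is a minimal sketch, assuming Python 3.8+ (for importlib.metadata). The helper name installed_versions is hypothetical and not part of sparkdl or Databricks, and a library attached through Maven will not necessarily show up here.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Package names taken from the question; adjust to what your cluster uses.
for name, ver in installed_versions(["tensorflow", "keras", "h5py"]).items():
    print(f"{name}: {ver or 'not installed'}")
```

Running this in a notebook cell after each library change shows whether the versions you asked for are the ones Python actually sees.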

Related

TensorFlow Federated can't be imported in a Google Colab notebook

I wrote the following code in a new Google Colab notebook:
!pip install --quiet --upgrade tensorflow-federated-nightly
import tensorflow as tf
import tensorflow_federated as tff
And I got these error messages while importing tensorflow_federated:
/usr/local/lib/python3.7/dist-packages/keras/api/_v1/keras/experimental/__init__.py in <module>()
8 from keras.feature_column.sequence_feature_column import SequenceFeatures
9 from keras.layers.rnn.lstm_v1 import PeepholeLSTMCell
---> 10 from keras.optimizers.learning_rate_schedule import CosineDecay
11 from keras.optimizers.learning_rate_schedule import CosineDecayRestarts
12 from keras.premade_models.linear import LinearModel
ModuleNotFoundError: No module named 'keras.optimizers.learning_rate_schedule'; 'keras.optimizers' is not a package
These errors seem to come from the modules installed on Colab itself rather than from my code.
Any idea on what can be done to fix this?
Colab defaults to Python 3.7, according to a similar question. However, although the suggested fix did upgrade my environment to Python 3.9, TFF still didn't work for me, even when I installed it locally. So you may need to find a different approach.

ModuleNotFoundError: No module named 'PIL' when I want to import sparkdl in databricks

I am trying to implement a deep learning pipeline, and I need to import the sparkdl package in Databricks (Community Edition).
My other installed libraries include:
spark-deep-learning:1.4.0-spark2.4-s_2.11,
h5py,
keras==2.2.4,
tensorflow==1.15.0,
wrapt.
When I run
from sparkdl import DeepImageFeaturizer
I keep getting the error ModuleNotFoundError: No module named 'PIL'.
Update: Installing Pillow solves the problem.
Make sure you have installed all the prerequisite libraries:
Create a spark-deep-learning library with the Source option Maven and the Coordinate 1.4.0-spark2.4-s_2.11.
Create libraries with the Source option PyPI and the Packages tensorflow==1.12.0, keras==2.2.4, h5py==2.7.0, wrapt.
Reference: https://docs.azuredatabricks.net/_static/notebooks/deep-learning/deep-learning-pipelines-1.4.0.html
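Before importing sparkdl, the prerequisites listed above can be checked in one pass. This is a rough sketch using only the standard library. The helper name missing_modules is hypothetical, and the list contains import names rather than PyPI package names (Pillow is imported as PIL).

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names, not PyPI names: the Pillow package provides the PIL module.
print(missing_modules(["PIL", "h5py", "keras", "tensorflow", "wrapt"]))
```

An empty list means every module was found. Any name that is printed still needs to be installed as a cluster library.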

How to import WordEmbeddingSimilarityIndex function from gensim module?

When I try to import WordEmbeddingSimilarityIndex, I get the following error:
>> from gensim.models import WordEmbeddingSimilarityIndex
ImportError: cannot import name 'WordEmbeddingSimilarityIndex
The same issue occurs for SparseTermSimilarityMatrix:
>> from gensim.similarities import SparseTermSimilarityMatrix
ImportError: cannot import name 'SparseTermSimilarityMatrix
Note: I have installed gensim and imported gensim, gensim.models and gensim.similarities, but I still get the ImportError when importing the names mentioned above.
Can you tell me what I am doing wrong, please?
The fix is to change "models" to "similarities":
from gensim.similarities import WordEmbeddingSimilarityIndex
This works in gensim 4.0.1.
Check which version of gensim you are using. Older versions of gensim usually cause this issue.
import gensim
print(gensim.__version__)
If the gensim version is 3.6.x or older, update it to 3.7.x or the latest version by running the command below. Updating gensim should get rid of this issue.
pip install --upgrade gensim
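If the code has to run against more than one gensim version, one option is to try each candidate location in turn. The sketch below is a generic fallback import and not a gensim API: the helper name import_first is hypothetical, and the module order in the usage comment simply follows the two answers above.

```python
import importlib


def import_first(attr, module_names):
    """Return `attr` from the first module in `module_names` that provides it."""
    for module_name in module_names:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # module is missing in this environment, try the next one
        if hasattr(module, attr):
            return getattr(module, attr)
    raise ImportError(f"{attr!r} not found in any of {module_names}")


# Intended use (requires gensim to be installed):
# WordEmbeddingSimilarityIndex = import_first(
#     "WordEmbeddingSimilarityIndex", ["gensim.similarities", "gensim.models"]
# )
```

The final ImportError lists every module that was tried, which makes it easier to see that the class is absent everywhere and an upgrade is needed.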

Can't import GMM function from scikits.learn

I'm getting the error ImportError: No module named gmm when I use from scikits.learn.gmm import GMM.
I installed scikits using the Windows installer, and it reported no errors.
How I can fix it?
That link is very old; the module has since been renamed to sklearn. As you have installed version 0.16.1, you should be using
from sklearn.mixture import GMM
as per the docs.

Not able to import PolynomialFeatures, make_pipeline in Scikit-learn

I'm not able to import the following modules in an ipython notebook:
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline
The following error pops up
ImportError: cannot import name PolynomialFeatures
The same error also appears for make_pipeline.
I'm a newbie in scikit-learn, please help out.
I'm using the miniconda installation of python and the version number for scikit-learn is 0.14.1.
PolynomialFeatures is included in the next version of scikit-learn and is not available in 0.14.1. Please update to 0.15-git if you want to use it. The same holds for make_pipeline.
To get the bleeding edge version:
git clone https://github.com/scikit-learn/scikit-learn.git
cd scikit-learn
python setup.py build_ext --inplace
Please read: http://scikit-learn.org/stable/developers/index.html#git-repo
First check your current version of scikit-learn:
import sklearn
print(sklearn.__version__)
If it is lower than 0.15.0, you have to upgrade it. In addition to Abhishek's excellent answer, you can follow the official installation process (which is described for various operating systems).
If you are using PyCharm, it is even simpler: go to File -> Settings -> Project Interpreter, select your package, and click Upgrade.
(I selected another package, because my scikit-learn is already the newest version.)
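The "lower than 0.15.0" check can also be done in code instead of by eye. The following is a small sketch with hypothetical helper names (version_tuple, needs_upgrade). It compares only the leading numeric parts of each component, so a suffix such as "-git" is ignored. This is a simplification, not a full version parser.

```python
import re


def version_tuple(version):
    """Turn '0.14.1' or '0.15-git' into a tuple of leading integers."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def needs_upgrade(installed, minimum):
    """Return True if `installed` is numerically older than `minimum`."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    # Pad the shorter tuple with zeros so (0, 15) compares equal to (0, 15, 0).
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a < b


print(needs_upgrade("0.14.1", "0.15.0"))  # True
```

In a notebook, the first argument would be sklearn.__version__ rather than a literal string.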
