I am trying to replicate a deep learning project from https://medium.com/linagora-engineering/making-image-classification-simple-with-spark-deep-learning-f654a8b876b8 . I am working on Spark version 1.6.3 and have installed Keras and TensorFlow, but every time I try to import from sparkdl it throws an error. I am working in PySpark. When I run this:
from sparkdl import readImages
I get this error:
File "C:\Users\HP\AppData\Local\Temp\spark-802a2258-3089-4ad7-b8cb-6815cbbb019a\userFiles-c9514201-07fa-45f9-9fd8-c8a3a0b4bf70\databricks_spark-deep-learning-0.1.0-spark2.1-s_2.11.jar\sparkdl\transformers\keras_image.py", line 20, in <module>
ImportError: cannot import name 'TypeConverters'
Can someone please help?
It's not a full fix, as I have not yet been able to import things from sparkdl in Jupyter notebooks either, but:
readImages is available through ImageSchema in the pyspark.ml.image package,
so to import it you need to:
from pyspark.ml.image import ImageSchema
to use it:
imagesDF = ImageSchema.readImages("/path/to/imageFolder")
This will give you a DataFrame of the images, with an "image" column.
You can add a label column like this:
labeledImageDF = imagesDF.withColumn("label", lit(0))
but remember to import lit from pyspark.sql.functions to use it:
from pyspark.sql.functions import lit
Hope this at least partially helps.
I tried the line of code below, but it gives me the following error:
y = rnn_cell_impl._linear(slot_inputs, attn_size, True)
AttributeError: module 'tensorflow.python.ops.rnn_cell_impl' has no attribute '_linear'
I am currently using TensorFlow version 2.10. I have tried all the solutions I could find, such as
#from tensorflow.contrib.rnn.python.ops import core_rnn_cell
or
#from tensorflow.keras.layers import RNN
but still no luck.
Can someone help me with this?
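For reference, `_linear` was a private TF1 helper (removed in TF2) that concatenated its input tensors along the last axis and applied a single affine map, y = concat(inputs) · W + b. Below is a minimal pure-Python sketch of that computation; all names, shapes, and values are illustrative assumptions, not taken from the original code:

```python
# Minimal pure-Python sketch of what TF1's private rnn_cell_impl._linear
# computed: concatenate the input vectors, then apply one affine map
# y = x @ W + b. Names, shapes, and values are illustrative only.

def linear(inputs, weights, bias):
    """inputs: list of 1-D lists; weights: 2-D list; bias: 1-D list."""
    x = [v for vec in inputs for v in vec]   # concat along the last axis
    assert len(x) == len(weights)            # total input size must match W
    out_size = len(bias)
    # y[j] = sum_i x[i] * W[i][j] + b[j]
    return [sum(x[i] * weights[i][j] for i in range(len(x))) + bias[j]
            for j in range(out_size)]

# Example: two inputs of sizes 2 and 1, output size 2.
W = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]
b = [0.5, -0.5]
y = linear([[2.0, 3.0], [4.0]], W, b)
print(y)  # [6.5, 6.5]
```

Given that, one possible TF2 replacement is a plain dense layer, e.g. `tf.keras.layers.Dense(attn_size, use_bias=True)` applied to `tf.concat(slot_inputs, axis=-1)`, though the TF1 variable scoping from the original code does not carry over.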
I am trying to follow this tutorial on Google Cloud Platform,
https://github.com/GoogleCloudPlatform/ai-platform-samples/blob/master/notebooks/samples/tables/census_income_prediction/getting_started_notebook.ipynb, however, I run into issues when I try to import the AutoML module, specifically the two lines below:
# AutoML library.
from google.cloud import automl_v1beta1 as automl
import google.cloud.automl_v1beta1.proto.data_types_pb2 as data_types
The first line works, but the second fails with the error: ModuleNotFoundError: No module named 'google.cloud.automl_v1beta1.proto'. For some reason there is no module called proto, and I cannot figure out how to resolve this.
There are a couple of posts about not being able to find the module google.cloud. In my case I am able to import automl_v1beta1 from google.cloud, but not proto.data_types_pb2 from google.cloud.automl_v1beta1.
I think you can:
from google.cloud import automl_v1beta1 as automl
import google.cloud.automl_v1beta1.types as data_types
Or:
import google.cloud.automl_v1beta1 as automl
import google.cloud.automl_v1beta1.types as data_types
But (!) given the import errors, the SDK's API may have changed in other ways that affect the code that follows.
I saw a video on YouTube about plotting a DataFrame using the iplot method imported from the plotly.offline module.
I ran this code in IntelliJ but got an error saying:
'latin-1' codec can't encode characters in position 0-9: ordinal not in range(256)
I looked for a solution but couldn't find anything. Then I ran the same code in a Jupyter notebook and it worked just fine.
Can anyone explain this?
Source code:
import pandas as pd
import numpy as np
import chart_studio.plotly as py
from plotly.offline import *
import cufflinks as cf
init_notebook_mode(connected=True)
cf.go_offline()
df=pd.DataFrame(np.random.randn(50,4),columns=['a','b','c','d'])
df.iplot()
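The error message itself is just Python failing to encode non-ASCII output: init_notebook_mode() emits inline HTML/JS meant for a notebook frontend, and the IntelliJ run console was apparently using a latin-1 stream that cannot represent some of those characters. A minimal sketch reproducing the same message (the block characters are an illustrative stand-in, not plotly's actual output):

```python
# The error is an ordinary Unicode failure: characters with code points
# >= 256 cannot be encoded as latin-1. Ten such characters at positions
# 0-9 reproduce the exact message from the question.
text = "\u2588" * 10  # ten box-drawing block characters, all >= U+0100
try:
    text.encode("latin-1")
    message = ""
except UnicodeEncodeError as exc:
    message = str(exc)
print(message)
```

Running the script with a UTF-8 console (for example setting PYTHONIOENCODING=utf-8, or changing the IntelliJ console encoding) avoids the error; Jupyter renders everything as UTF-8, which is why the notebook worked.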
I'm learning sklearn, but I can't use fetch_openml(). It says:
ImportError: cannot import name 'fetch_openml' from 'sklearn.datasets'
In newer versions of sklearn, it's even easier to fetch OpenML datasets. For example, you can import and fetch the MNIST dataset as:
from sklearn.datasets import fetch_openml
X, y = fetch_openml('mnist_784', version=1, return_X_y=True, as_frame=False)
print(X.shape, y.shape)
For more details, check the official scikit-learn example.
You can use this:
from sklearn.datasets import fetch_openml
Apparently, fetch_mldata has been deprecated in newer versions of sklearn; fetch_openml is its replacement. (load_digits also works, but it loads a small 8x8 digits dataset, not the full MNIST.)
To solve this problem in Jupyter (on older scikit-learn versions that still provide fetch_mldata), follow these steps:
Download the file mnist-original from "https://osf.io/jda6s/"
After the download, copy the file into C:\Users\YOURUSERNAME\scikit_learn_data\mldata
In the Jupyter notebook do:
from sklearn.datasets import fetch_mldata
mnist = fetch_mldata('mnist-original')
I get the following error
object GaussianMixture is not a member of package org.apache.spark.ml.clustering
when I try to do following import from spark-shell
import org.apache.spark.ml.clustering.GaussianMixture
As this is part of Spark, I don't think any dependencies need to be added. Please help me with this issue.
I believe GaussianMixture is only available under the mllib package in your Spark version (the ml.clustering version was added in Spark 2.0). Try to import:
import org.apache.spark.mllib.clustering.GaussianMixture