I am running Apache Toree for a PySpark notebook. I have Anaconda 3.5 and JupyterHub installed on Unix machines. When I invoke PySpark from a Jupyter notebook, it starts with Python 2.7 instead of the Anaconda 3.5 interpreter.
Requesting your help in changing the Python version.
Please note that I have already tried changing the Python version via os.environ, but it didn't work.
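For context, a likely reason the os.environ approach fails is ordering: PYSPARK_PYTHON is only read when the SparkContext is created, so assigning it inside an already-running PySpark notebook is too late. A minimal sketch (the Anaconda path is taken from this setup):

```python
import os

# PYSPARK_PYTHON is only read when the SparkContext is created, so
# assigning it via os.environ *after* Spark has started has no effect.
# It must be in place before the kernel launches Spark:
os.environ["PYSPARK_PYTHON"] = "/usr/lib/anaconda3/bin/python"
os.environ["PYSPARK_DRIVER_PYTHON"] = "/usr/lib/anaconda3/bin/python"
# ...only then create the SparkContext / start the PySpark kernel.
```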
Followed the steps below to configure Toree with Python 3:
Installed a new kernel with the Spark home and Python path:
jupyter toree install --spark_home="spark_path" --kernel_name=tanveer_kernel1 --interpreters=PySpark,SQL --python="python_path"
After doing the above there were mismatches between the driver Python version and the executor Python version. Corrected the Python version in spark-env.sh by adding
export PYSPARK_PYTHON="/usr/lib/anaconda3/bin/python"
export PYSPARK_DRIVER_PYTHON="/usr/lib/anaconda3/bin/python"
Restarted the Spark services.
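To confirm the change took effect, a quick check from a notebook cell; the expected path is the one configured in spark-env.sh above:

```python
# Run in a notebook cell after the restart: the driver interpreter
# should now be the Anaconda Python rather than the system Python 2.7.
import sys

print(sys.executable)
print(".".join(str(v) for v in sys.version_info[:3]))
```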
Related
I have recently shifted from Windows 10 to Linux (Ubuntu 22.04). I recently installed Anaconda + fastai using the terminal to learn ML. I want to upgrade my Python (3.9) to Python 3.11.
What are the steps to follow (considering I am a beginner in Linux)?
Also, it should not cause issues with my fastai library.
Thank you.
I tried this method:
python-3-11-released-how-install-ubuntu /
At last, my terminal shows this, after I selected python3.11 in auto mode.
But after this, when I run Jupyter Notebook and check the Python version, it shows Python 3.9.
What should I do? Please help.
If Jupyter Notebook was installed using conda, then you can run:
conda install ipython jupyter
and restart Jupyter Notebook for it to take effect.
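If the notebook still reports 3.9 after that, another common approach is to register the new interpreter as its own Jupyter kernel. A sketch, assuming a fresh conda env named py311 (the name is an arbitrary choice), which leaves the existing fastai setup untouched:

```shell
# Create a separate env rather than upgrading in place, so the
# existing fastai / Python 3.9 environment keeps working
conda create -n py311 python=3.11
conda activate py311
conda install ipykernel
# Register this env's interpreter as a selectable Jupyter kernel
python -m ipykernel install --user --name py311 --display-name "Python 3.11"
# Then pick "Python 3.11" from Kernel > Change kernel in the notebook
```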
On bringing up Octave 4.4.0 for the Anaconda 3 Jupyter Notebook on Windows 10,
the first octave-gui.exe system error is:
The code execution cannot proceed because liboctave-5.dll was not found. Reinstalling the program may fix this problem.
Octave 4.4.0 runs in its own GUI by itself, but not with Jupyter Notebook.
I am using Python 3.6.5, which runs in Jupyter Notebook.
I was able to run Octave 4.0.1 in Jupyter Notebook.
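For diagnosis: a missing-liboctave-DLL error usually means Octave's bin directory is not on PATH for the process that launches octave-gui.exe, and the octave_kernel package also reads an OCTAVE_EXECUTABLE variable. A Windows cmd sketch, where the install path is an assumption to be adjusted to the real one:

```shell
rem Windows cmd sketch: tell the kernel which Octave to run, and make
rem Octave's DLLs (including liboctave-5.dll) resolvable via PATH.
rem The C:\Octave\Octave-4.4.0 path is a placeholder for the real install.
set OCTAVE_EXECUTABLE=C:\Octave\Octave-4.4.0\bin\octave-cli.exe
set PATH=C:\Octave\Octave-4.4.0\bin;%PATH%
jupyter notebook
```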
I am running into some issues with packages I can access in the Anaconda prompt versus what is available in Jupyter Notebook.
I recently created an environment for Python 3.6. I installed a few packages, including pandas, using:
conda install pandas
And all is well:
When I check conda list, I also see pandas installed for py36.
conda list
Now, I run the following line to start my jupyter notebook from this same location:
jupyter notebook
I then run into an error when I try to import pandas, and attempting to reinstall pandas doesn't help.
My guess is that this stems from having downloaded Anaconda for Python 2.7, but I assume there is a way to get things working in both versions of Python for Jupyter Notebook.
To add, I am able to use pandas when I use Python 2.7 (referred to as "root"):
activate root
jupyter notebook
Any help would be greatly appreciated!
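A quick way to see which environment the notebook is actually running (an import error like this usually means the notebook server was started from the base/2.7 install rather than the py36 env):

```python
# Run this in the failing notebook: if the path printed is not inside
# the py36 environment, the kernel is using a different Python than
# the one pandas was installed into.
import sys

print(sys.executable)
print(sys.version.split()[0])
```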
Apache Toree looks for the Spark home directory (it defaults to "/usr/local/spark"), but when it can't find that directory because Spark was installed via Homebrew, it throws an exception:
jupyter toree install
FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/spark/python/lib'
Where is the Spark home when Spark is installed via Homebrew?
The directory Apache Toree is looking for when Spark is installed via Homebrew is under /usr/local/Cellar:
jupyter toree install --spark_home /usr/local/Cellar/apache-spark/2.1.0/libexec
/usr/local/Cellar/apache-spark/2.1.0/libexec/
It specifically wants the "libexec" directory, which contains the "python/lib" sub-directory it needs.
If that doesn't work, you might additionally need to pass in a --user flag.
jupyter toree install --user --spark_home /usr/local/Cellar/apache-spark/2.2.0/libexec
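To avoid hard-coding the Cellar version number, the path can also be derived from Homebrew itself; a sketch, assuming the apache-spark formula is installed:

```shell
# "brew --prefix apache-spark" resolves to a stable symlink (e.g.
# /usr/local/opt/apache-spark) whose libexec/ tracks the currently
# installed version, so the kernelspec survives Homebrew upgrades.
jupyter toree install --user --spark_home "$(brew --prefix apache-spark)/libexec"
```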
This is drawn from this GitHub issue, this Jupyter documentation page, and this other Stack Overflow question.
I've installed Jupyter Notebook over Python 3.5.2 on Ubuntu Server 16.04.
I also installed Apache Toree to run Spark jobs from Jupyter.
I ran:
pip3 install toree
jupyter toree install --spark_home=/home/arik/spark-2.0.1-bin-hadoop2.7/ # My Spark directory
The output indicated success:
[ToreeInstall] Installing Apache Toree version 0.1.0.dev8
[ToreeInstall] Apache Toree is an effort undergoing incubation at the
Apache Software Foundation (ASF), sponsored by the Apache Incubator
PMC.
Incubation is required of all newly accepted projects until a further
review indicates that the infrastructure, communications, and decision
making process have stabilized in a manner consistent with other
successful ASF projects.
While incubation status is not necessarily a reflection of the
completeness or stability of the code, it does indicate that the
project has yet to be fully endorsed by the ASF.
Additionally, this release is not fully compliant with Apache release
policy and includes a runtime dependency that is licensed as LGPL v3
(plus a static linking exception). This package is currently under an
effort to re-license (https://github.com/zeromq/jeromq/issues/327).
[ToreeInstall] Creating kernel Scala
[ToreeInstall] Removing existing kernelspec in /usr/local/share/jupyter/kernels/apache_toree_scala
[ToreeInstall] Installed kernelspec apache_toree_scala in /usr/local/share/jupyter/kernels/apache_toree_scala
and I thought that everything was successful, but every time I create an Apache Toree notebook I see the following:
It says "Kernel busy" and all of my commands are ignored.
I couldn't find anything about this issue online.
Alternatives to Toree would also be accepted.
Thank you
Toree unfortunately does not work with Scala 2.11. Either downgrade to a Scala 2.10 build of Spark, or use a more recent version of Toree (still in beta). The way I made it work with Spark 2.1 and Scala 2.11:
#!/bin/bash
pip install -i https://pypi.anaconda.org/hyoon/simple toree
jupyter toree install --spark_home=$SPARK_HOME --user # installs the Scala + Spark kernel
jupyter toree install --spark_home=$SPARK_HOME --interpreters=PySpark --user
jupyter kernelspec list
jupyter notebook #launch jupyter notebook
Look at this post and this post for more info.
It will eventually look like this: