Azure Databricks cluster init script - Install wheel from mounted storage

I have a Python wheel uploaded to an Azure storage account that is mounted in a Databricks service. I'm trying to install the wheel using a cluster init script, as described in the Databricks documentation.
My storage is definitely mounted and my file path looks correct to me. Running the command display(dbutils.fs.ls("/mnt/package-source")) in a notebook yields the result:
path: dbfs:/mnt/package-source/parser-3.0-py3-none-any.whl
name: parser-3.0-py3-none-any.whl
I have tried to install the wheel from a cluster init file using this command:
/databricks/python/bin/pip install "dbfs:/mnt/package-source/parser-3.0-py3-none-any.whl"
but the cluster fails to start. Its logs give me an error saying it can't find the file:
WARNING: Requirement 'dbfs:/mnt/package-source/parser-3.0-py3-none-any.whl' looks like a filename, but the file does not exist
ERROR: Could not install packages due to an EnvironmentError: [Errno 2] No such file or directory: '/dbfs:/mnt/package-source/parser-3.0-py3-none-any.whl'
I have also tried it this way:
/databricks/python/bin/pip install /mnt/package-source/parser-3.0-py3-none-any.whl
but I get a similar error:
WARNING: Requirement '/mnt/package-source/parser-3.0-py3-none-any.whl' looks like a filename, but the file does not exist
ERROR: Could not install packages due to an EnvironmentError: [Errno 2] No such file or directory: '/mnt/package-source/parser-3.0-py3-none-any.whl'
I've even tried using relative paths such as ../../mnt/package-source/... but to no avail. Can anyone tell me what I'm doing wrong please?
Related question: Azure Databricks cluster init script - install python wheel

I got it working using a relative path. It turns out ../../mnt/ wasn't the correct path. It worked using ../../../dbfs/mnt/. It just took a bit of exploring the file system using the bash ls command to find it.
For anyone else experiencing the same problem, I suggest starting with something like this in a notebook:
%sh
ls ../../../
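The working relative path suggests that DBFS is exposed on the node's local file system under /dbfs, which is why pip could not resolve either dbfs:/mnt/... or /mnt/... as a local file. A minimal sketch of that path translation, assuming the /dbfs mount point (the function name dbfs_to_local is hypothetical and not part of any Databricks API):

```python
from pathlib import PurePosixPath

DBFS_SCHEME = "dbfs:"
# Assumption: DBFS is visible to local tools (pip, ls) under /dbfs on the node.
FUSE_ROOT = PurePosixPath("/dbfs")


def dbfs_to_local(path: str) -> str:
    """Translate 'dbfs:/mnt/x' or '/mnt/x' into the local path '/dbfs/mnt/x'."""
    if path.startswith(DBFS_SCHEME):
        path = path[len(DBFS_SCHEME):]
    # Leave paths that already point below the local mount untouched.
    if path == "/dbfs" or path.startswith("/dbfs/"):
        return path
    # Strip leading slashes so the join keeps FUSE_ROOT as the prefix.
    return str(FUSE_ROOT / path.lstrip("/"))


if __name__ == "__main__":
    print(dbfs_to_local("dbfs:/mnt/package-source/parser-3.0-py3-none-any.whl"))
```

Under the same assumption, the init script would pass the translated absolute path to /databricks/python/bin/pip install instead of counting ../ segments.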

Related

I can't run the server in my project with Django

I created a project in Django as usual, but when I try to run the server (python manager.py runserver) I get the following error: C:\Users\Fredy\AppData\Local\Programs\Python\Python39\python.exe: can't open file 'C:\Downloads\django_one\manager.py': [Errno 2] No such file or directory
The strange thing is that the manager.py file is there. I went back to other projects that I had already created that were working and the same error occurred.
Does anyone know how I fix this?
Use "python manage.py runserver". If you look at the files in the project, the script is called "manage.py", not "manager.py".
Usually this happens because you are not in the root directory of the project. Judging by the screenshot it looks like you are, but worth a double check.
There are a few possible solutions:
chmod +x manage.py (manage.py may have lost execute permissions)
Did you forget to activate the virtual environment?
Your installation of Django may not support Python 3.9 (Django 3.2 supports Python versions 3.6, 3.7, 3.8, 3.9, 3.10)
Some sources to look into:
python: can't open file 'manage.py': [Errno 2] No such file or directory
https://www.pythonanywhere.com/forums/topic/756/
https://docs.djangoproject.com/en/3.2/ref/django-admin/
https://docs.djangoproject.com/en/3.2/faq/install/
I hope this gives some direction. Good Luck
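The filename mix-up pointed out in the first answer (manager.py instead of manage.py) is easy to catch by comparing the requested name against the files that are actually in the directory. A small sketch using the standard library; the helper name suggest_script is hypothetical:

```python
import difflib
import os
from typing import List


def suggest_script(requested: str, directory: str = ".") -> List[str]:
    """Return file names in `directory` that match or closely resemble `requested`."""
    candidates = [
        name for name in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, name))
    ]
    if requested in candidates:
        return [requested]
    # Fuzzy match: catches near misses such as 'manager.py' vs 'manage.py'.
    return difflib.get_close_matches(requested, candidates, n=3, cutoff=0.6)


if __name__ == "__main__":
    print(suggest_script("manager.py"))
```

Running it from the project root also doubles as the "am I in the right directory" check from the second answer: an empty result means nothing resembling the script is there.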

Could not install packages due to an OSError: [Errno 2] No such file or directory

When I try to install TensorFlow, this message appears. I use the latest versions of Python and pip.
ERROR: Could not install packages due to an OSError: [Errno 2] No such file or directory: 'C:\\Users\\julia\\AppData\\Local\\Packages\\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\\LocalCache\\local-packages\\Python39\\site-packages\\tensorflow\\include\\external\\com_github_grpc_grpc\\src\\core\\ext\\filters\\client_channel\\lb_policy\\grpclb\\client_load_reporting_filter.h'
It appears with all the older versions of TensorFlow. With the newest release, this appears instead:
ERROR: Could not find a version that satisfies the requirement tensorflow==1.2.0
ERROR: No matching distribution found for tensorflow==1.2.0
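The path in the error message probably exceeds the 260-character limit that Windows applies to file paths by default. Try following this tutorial: https://www.howtogeek.com/266621/how-to-make-windows-10-accept-file-paths-over-260-characters/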
In my case, the only thing that fixed the problem was manually deleting the "leftovers" of the package from the environment directory.
path : ... "\venv\Lib\site-packages\ {package name}"
Make sure you back up the files before deleting anything.
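To judge whether the long-path explanation applies, it helps to measure the failing path. A minimal sketch, assuming the classic Windows limit of 260 characters including the terminating NUL (the names MAX_PATH and exceeds_max_path are illustrative):

```python
MAX_PATH = 260  # classic Windows limit, counting the terminating NUL character


def exceeds_max_path(path: str, limit: int = MAX_PATH) -> bool:
    """Return True if `path` is too long for APIs bound by the classic limit."""
    # The limit includes the trailing NUL, so at most limit - 1 characters fit.
    return len(path) > limit - 1


if __name__ == "__main__":
    sample = "C:\\Users\\example\\" + "nested\\" * 40 + "header.h"
    print(len(sample), exceeds_max_path(sample))
```

Paste the full path from the pip error message into exceeds_max_path; if it returns True, enabling long paths or installing into a shorter directory is worth trying before deleting anything.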

Can't make action calls through Anaconda py35 env in Spark HDInsight

Following the documentation at https://learn.microsoft.com/en-us/azure/hdinsight/spark/apache-spark-python-package-installation
we installed several external Python modules in a new Anaconda env, 'py35_data_prof'. However, as soon as we invoke any RDD action such as rdd.count() or rdd.avg() in our Python code, spark2 throws:
Cannot run program "/usr/bin/anaconda/envs/py35_data_prof/bin/python": error=2, No such file or directory
FYI, the python in the error path, '/usr/bin/anaconda/envs/py35_data_prof/bin/python', is actually a symlink rather than a Python directory.
I have been looking up the HDInsight docs but can't seem to find the fix. Please let us know if there is a way around it.
The error message “Cannot run program "/usr/bin/anaconda/envs/py35_data_prof/bin/python": error=2, No such file or directory” means that the Python executable of the environment could not be found at that path. Make sure the environment is set up with all the steps listed below.
• Create Python virtual environment using conda.
• Install external Python packages in the created virtual environment if needed.
• Change Spark and Livy configs and point to the created virtual environment.
Please follow each step described in “Safely install external Python packages”.
Hope this helps.
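Since the question mentions that the interpreter path is a symlink, one thing worth checking on each node is whether the link resolves to an executable there. A hedged sketch using only the standard library; the helper name describe_interpreter is hypothetical, and whether the link is really broken on the worker nodes is an assumption to verify:

```python
import os


def describe_interpreter(path: str) -> str:
    """Classify `path` as 'ok', 'broken symlink', 'missing' or 'not executable'."""
    if os.path.islink(path) and not os.path.exists(path):
        # islink() inspects the link itself; exists() follows it to the target.
        return "broken symlink"
    if not os.path.exists(path):
        return "missing"
    if not os.access(path, os.X_OK):
        return "not executable"
    return "ok"


if __name__ == "__main__":
    print(describe_interpreter("/usr/bin/anaconda/envs/py35_data_prof/bin/python"))
```

Any result other than "ok" on a node would be consistent with the error=2 message, because Spark launches the Python worker process on that node.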

Unable to deploy a Ceph manager daemon with ceph-deploy: Error EACCES: access denied

I am trying to set up a Ceph storage cluster using the quick start guide found here: http://docs.ceph.com/docs/master/start/quick-ceph-deploy/
When I try to deploy a manager daemon using this command:
ceph-deploy mgr create enickel7
I get this error:
[ceph_deploy.mgr][ERROR ] OSError: [Errno 2] No such file or directory: '/var/lib/ceph/mgr/ceph-enickel7'
[ceph_deploy][ERROR ] GenericError: Failed to create 1 MGRs
(enickel7 is the name of the node I'm using - the Ceph documentation calls the nodes node1, node2, and node3.) I tried to manually create the directory /var/lib/ceph/mgr, then ran the command again. Then I got this error:
[enickel7][ERROR ] Error EACCES: access denied
[enickel7][ERROR ] exit code from command was: 13
[ceph_deploy.mgr][ERROR ] could not create mgr
[ceph_deploy][ERROR ] GenericError: Failed to create 1 MGRs
Does anyone know what this error means, or how to fix it? ceph-deploy definitely has sudo permissions, and the mgr directory has the same permissions as other directories in /var/lib/ceph.
Thank you for your time!
This happens because your Ceph version is older than Luminous (12.2.0 or later is required). You must use ceph-deploy to install Ceph, as the documentation says, but the default version that ceph-deploy installs is currently 10.2.10 Jewel.
If you want to create a manager daemon, you need to upgrade Ceph to Luminous 12.2.1. The docs are here: http://docs.ceph.com/docs/master/release-notes/#v12-2-1-luminous
I just ran into this same issue on Ubuntu 16.04 while trying to deploy Kraken with ceph-deploy version 1.5.39.
ceph-deploy automatically created the directories for me, but they were not owned correctly. It looks like the keyring it created in /var/lib/ceph/bootstrap-mgr, along with that directory, is owned by root. I chowned it to ceph, and that got me past the error.
In your case I would guess that the directory is owned by your user instead of "ceph". I hope this helps.
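The ownership problem described in this answer can be checked before changing anything. A small sketch, assuming a POSIX system and that the daemon user is named "ceph" as in the answer; the helper names owner_name and not_owned_by are hypothetical:

```python
import os
import pwd
from typing import Iterable, List


def owner_name(path: str) -> str:
    """Return the owning user's name, or the numeric uid if it has no passwd entry."""
    uid = os.stat(path).st_uid
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return str(uid)


def not_owned_by(paths: Iterable[str], expected: str = "ceph") -> List[str]:
    """Return the existing paths whose owner differs from `expected`."""
    return [p for p in paths if os.path.exists(p) and owner_name(p) != expected]


if __name__ == "__main__":
    suspects = ["/var/lib/ceph/mgr", "/var/lib/ceph/bootstrap-mgr"]
    print(not_owned_by(suspects))
```

Any path it prints is a candidate for the chown fix mentioned above.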
Please try the command below:
chown ceph:ceph /var/lib/ceph
Also, which Ceph version are you using? Please use the latest version (Mimic 13.2) and ceph-deploy 2.
I faced the same issue. As Michael Meepo said, it was a version problem.
On the admin node I registered the Ceph repo for Luminous and installed ceph-deploy.
But when I tried to use it, ceph-deploy installed the default version (Jewel) on the remote node.
To install a specific version, you have to request it explicitly:
ceph-deploy install master --release luminous
To use the ceph-deploy version that matches your distribution's Ceph release, use the Ceph repositories, as described on the https://github.com/ceph/ceph-deploy page. For instance, since Debian stretch provides Jewel (Ceph v. 10), use the following repository: http://ceph.com/debian-jewel by creating a /etc/apt/sources.list.d/ceph-deploy.list file containing:
deb http://download.ceph.com/debian-jewel/ stretch main
Install the keys:
wget -q -O- 'https://download.ceph.com/keys/release.asc' | sudo apt-key add -
Then proceed with
apt-get install ceph-deploy
From there it should work as expected.

Unable to open cqlsh Apache cassandra - ImportError: No module named cqlshlib

I am new to Cassandra! I downloaded the Apache Cassandra 2.1.2 package and was initially able to connect with cqlsh, but after installing CCM I am unable to connect and get the following error:
Traceback (most recent call last):
  File "bin/cqlsh", line 124, in <module>
    from cqlshlib import cql3handling, cqlhandling, pylexotron,sslhandling, copy
ImportError: No module named cqlshlib
Thanks in advance !
I spent a couple of days scouring the net and moving, renaming, and copying packages.
The easiest workaround for this error that worked for me:
pip install cqlsh
You could export PYTHONPATH so that it includes the site-packages folder where cqlshlib lives.
First, find the path where cqlshlib exists:
find /usr/lib/ -name cqlshlib
Then export that path through the PYTHONPATH variable:
export PYTHONPATH=/usr/lib/python2.7/site-packages/
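The find step above can also be sketched in Python when several roots need to be searched at once. This only walks the file system and returns the parent directories to put on PYTHONPATH; the helper name dirs_containing and the root list are illustrative assumptions:

```python
import os
from typing import Iterable, List


def dirs_containing(package: str, roots: Iterable[str]) -> List[str]:
    """Return every directory under `roots` that has a subdirectory named `package`."""
    found = []
    for root in roots:
        # os.walk silently skips roots that do not exist or cannot be read.
        for current, subdirs, _files in os.walk(root):
            if package in subdirs:
                found.append(current)
    return found


if __name__ == "__main__":
    for directory in dirs_containing("cqlshlib", ["/usr/lib", "/usr/local/lib"]):
        # Each hit is a candidate value for PYTHONPATH.
        print("export PYTHONPATH=" + directory)
```

If it prints more than one directory, that is a hint that two site-packages trees exist and cqlsh may be looking in the wrong one.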
I tried the approaches above, but they failed for me. I think cqlsh simply cannot find the exact path to cqlshlib.so.
I solved it as follows.
Environment: CentOS 6.7, DataStax 3.9
My cqlshlib path: /usr/local/lib/python2.7/site-packages/
vim /usr/bin/cqlsh.py
Add the cqlshlib path after import sys, so that the file looks like:
...
import sys
...
from uuid import UUID
sys.path.append("/usr/local/lib/python2.7/site-packages")  # add this line
Then I ran cqlsh, and it worked.
If you're in the cassandra directory, run:
bin/cqlsh
If you check which cqlsh you're running with which cqlsh, I suspect you'll find you're hitting the ccm one and are missing something in your path.
Just start cqlsh as root:
sudo cqlsh <ipaddress>
The other answers correctly diagnose the problem.
You need to find the correct cqlshlib.
I had installed Cassandra with apt-get on Ubuntu, so the correct path for me was
/usr/local/apache-cassandra-3.11.3/pylib
I had also messed things up by previously running pip install cqlsh. This is NOT supported by the Apache team!
Like another answer here, I hacked the cqlsh.py file in /usr/bin.
My successful hack was to replace the commented-out line with the line below it:
#cqlshlibdir = os.path.join(CASSANDRA_PATH, 'pylib')
cqlshlibdir = "/usr/local/apache-cassandra-3.11.3/pylib"
I spent nearly a day solving this problem. The cause is a mismatch between /usr/lib/python2.7/site-packages/ and /usr/local/lib/python2.7/site-packages/ (in my specific folder tree).
The commands I used are the following:
mv /usr/lib/python2.7/site-packages/* /usr/local/lib/python2.7/site-packages/
rmdir /usr/lib/python2.7/site-packages
ln -s /usr/local/lib/python2.7/site-packages /usr/lib/python2.7/site-packages
I suspect you will also find two site-packages directories on your system.
Just for reference for others.
Workaround:
I assume that you have already installed Cassandra and cqlshlib has been installed in /usr/lib/python2.7/site-packages/
ln -s /usr/lib/python2.7/site-packages/cqlshlib /usr/local/lib/python2.7/site-packages/cqlshlib
(replace /usr/lib/python2.7/site-packages with your python directory).
More Detail:
One possibility is that your default python is not in /usr/bin/. Say it has been installed in /usr/local/bin/. However, Cassandra seems to install cqlshlib in /usr/lib/python2.7/site-packages for some reason. As a result, the default python cannot find cqlshlib package when you run cqlsh command.
Cassandra's cqlsh supports Python 2. If you are on Python 3 and don't want to mix the two, the following trick worked for me:
Install Python 2, but don't add it to the PATH environment variable.
Navigate to the bin folder and start the Cassandra server: .\cassandra -f
Open another terminal and create a virtual env in the Cassandra home directory: virtualenv -p C:\Python27\python.exe .\venv
Activate the virtual env: .\venv\Scripts\activate.ps1
Start cqlsh inside the virtual env: .\cqlsh.bat
