The main idea is to install the Databricks CLI, so I use the following script:
- task: CmdLine@2
  displayName: "Install databricks cli"
  inputs:
    script: |
      pip install databricks-cli --user
      pip install databricks
    workingDirectory: $(projectRoot)/${{ parameters.IrropsMLService }}
- script: |
    databricks --version
But I faced the following problem:
'databricks' is not recognized as an internal or external command,
operable program or batch file.
Could you help me to resolve the problem?
First remove databricks-connect and also uninstall any PySpark installation.
And then follow the installation guide.
Uninstall PySpark. This is required because the databricks-connect package conflicts with PySpark.
pip uninstall pyspark
Install the Databricks Connect client.
pip install -U "databricks-connect==7.3.*" # or X.Y.* to match your cluster version.
Note - Always specify databricks-connect==X.Y.* instead of databricks-connect=X.Y, to make sure that the newest package is installed.
For more information, follow this link.
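Putting it together, a minimal sketch of the sequence (assuming your cluster runs Databricks Runtime 7.3, as in the command above; databricks-connect configure and databricks-connect test are the usual follow-up steps from the installation guide):
pip uninstall pyspark
pip install -U "databricks-connect==7.3.*"
databricks-connect configure   # prompts for workspace host, token and cluster ID
databricks-connect test        # verifies the connection to the cluster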
Related
Trying to use the New Relic CLI to integrate my AWS account with New Relic, but have run into this hiccup. I'm following these steps: https://docs.newrelic.com/docs/serverless-function-monitoring/aws-lambda-monitoring/enable-lambda-monitoring/account-linking
After installing the package with pip3 install newrelic-lambda-cli, I then try to run the command
newrelic-lambda integrations install --nr-account-id YOUR_NR_ACCOUNT_ID \
    --nr-api-key YOUR_NEW_RELIC_USER_KEY
and get the following error:
zsh: command not found: newrelic-lambda
I then check Python to see whether I have installed the package, and I see that it is installed.
Anyone know why I'm unable to find the newrelic-lambda command?
Just uninstalled and reinstalled and it is working now. I believe it was because my boto3 was an older version:
pip uninstall newrelic-lambda-cli
pip install newrelic-lambda-cli
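If the command is still not found after reinstalling, it is worth checking that pip's script directory is on your PATH. A quick sanity check (paths vary by setup):
pip3 show newrelic-lambda-cli                 # confirm the package is installed
which newrelic-lambda                         # check whether the entry point is on PATH
echo "$(python3 -m site --user-base)/bin"     # user-level installs put console scripts here; add this directory to PATH if needed
newrelic-lambda --help                        # should print the CLI usage once the command resolves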
SageMaker Studio terminal shows the default Python version as "Python 3.7.10".
I am trying to create a conda environment with Python 3.9, as below.
conda create --name custom_python_39 python=3.9
But I get the following error:
RemoveError: 'setuptools' is a dependency of conda and cannot be removed from
conda's operating environment.
But pip list doesn't show setuptools.
I have already tried the following:
conda update conda -n base
conda update --force conda
But I get the following errors:
bash-4.2$ conda update conda -n base
Collecting package metadata: done
Solving environment: /
The environment is inconsistent, please check the package plan carefully
The following packages are causing the inconsistency:
- conda-forge/noarch::pyopenssl==22.0.0=pyhd8ed1ab_0
- conda-forge/noarch::requests==2.27.1=pyhd8ed1ab_0
- conda-forge/noarch::urllib3==1.26.9=pyhd8ed1ab_0
- conda-forge/noarch::argon2-cffi==21.3.0=pyhd8ed1ab_0
- defaults/linux-64::argon2-cffi-bindings==21.2.0=py37h7f8727e_0
- conda-forge/noarch::bleach==3.1.4=pyh9f0ad1d_0
- conda-forge/linux-64::brotlipy==0.7.0=py37h5e8e339_1001
- conda-forge/linux-64::conda==4.6.14=py37_0
- conda-forge/linux-64::ipykernel==5.2.0=py37h43977f1_1
- conda-forge/linux-64::ipython==7.13.0=py37hc8dfbb8_2
- conda-forge/linux-64::jedi==0.16.0=py37hc8dfbb8_1
- conda-forge/noarch::jinja2==2.11.3=pyhd8ed1ab_1
- conda-forge/noarch::jsonschema==3.2.0=pyhd8ed1ab_3
- conda-forge/noarch::nbconvert==5.6.1=pyhd8ed1ab_2
- conda-forge/noarch::notebook==6.4.1=pyha770c72_0
- conda-forge/noarch::prompt-toolkit==3.0.5=py_1
- conda-forge/noarch::pygments==2.6.1=py_0
/ Killed
I am getting an error message (please see screenshot) while running the command below in the Terminal:
import snowflake.connector as sf
Can someone help with this?
Thank you.
The Snowflake connector for Python not working on Apple Silicon (M1) is a known problem.
There are some workarounds available (see here).
Please find the one that worked for me below:
Python 3.9
clean conda (I am using miniforge) environment (i.e., conda create -n py9 python=3.9)
adding dependencies via pip in the following order
pip install snowflake-sqlalchemy
pip install sqlalchemy
pip install snowflake-sqlalchemy
Python 3.10
clean conda environment (i.e., conda create -n py10 python=3.10)
just pip install snowflake-sqlalchemy
UPDATE: below is the workaround endorsed by the official Snowflake docs (3.8 version):
CONDA_SUBDIR=osx-64 conda create -n snowpark python=3.8 numpy pandas -c https://repo.anaconda.com/pkgs/snowflake
conda activate snowpark
conda config --env --set subdir osx-64
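Whichever variant you use, a quick check that the connector imports and can open a session looks roughly like this (account, user and password are placeholders, not real values):
import snowflake.connector as sf

# placeholder credentials - replace with your own account identifier, user and password
conn = sf.connect(account="my_account", user="my_user", password="my_password")
print(conn.cursor().execute("select current_version()").fetchone())
conn.close()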
I have followed a number of steps but could not get airflow up and running in my conda virtual environment.
Below are the steps I have followed.
I issued the below command in my conda environment:
pip install apache-airflow==1.10.10 --constraint https://raw.githubusercontent.com/apache/airflow/1.10.10/requirements/requirements-python3.7.txt
There were issues regarding VC++ not being present. This was resolved after installing VC++ and adding it to my environment variables. I have also added AIRFLOW_HOME to my environment variables.
Unfortunately, the below command is not working:
airflow initdb
I have not been able to start the webserver either.
airflow webserver -p 8080
My system is not aware of airflow. When I issue the 'airflow initdb' command, I get the below error:
(biometric) C:\Users\royan\Anaconda3\envs\biometric\lib\site-packages\airflow\bin>airflow initdb
'airflow' is not recognized as an internal or external command,
operable program or batch file.
Please suggest the steps I need to perform to resolve this error.
I believe the easiest way to install Airflow in an Anaconda environment is to use the conda-forge repository:
conda create -n airflow
conda activate airflow
conda config --env --add channels conda-forge
conda config --env --set channel_priority strict
conda install airflow
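Note that conda-forge will normally resolve to Airflow 2.x, where airflow initdb has been replaced by airflow db init. A rough sketch of the follow-up commands (the admin user details below are placeholders):
airflow db init
airflow users create --username admin --password admin --firstname Admin --lastname User --role Admin --email admin@example.com
airflow webserver -p 8080
airflow scheduler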
I would like to install a python library into the EMR Notebook virtualenv as in sc.install_pypi_package("arrow==0.14.0", "https://pypi.org/simple").
The python library is not released as a pypi package, but rather sits on a custom branch on a private github repository. How can I refer to the git repo and provide the relevant git credentials for AWS EMR for this to work?
Would this library also be available to the Spark EMR cluster (UDF functions), or would it be available just for the Jupyter notebook?
You can install it when initializing the EMR cluster using Bootstrap Actions. This way the library will be available within the Spark cluster and the Jupyter Notebook.
In the bootstrap script, you could use pip to get the lib from GitHub:
pip install -e git+https://github.com/some_repo.git#egg=some_repo
See the pip install documentation on Git support for how to clone from GitHub using pip.
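As a rough sketch, such a bootstrap script could look like the following (the repository URL, branch name and token handling are placeholders, not values from the question):
#!/bin/bash
# install_private_lib.sh - hypothetical bootstrap script, runs on every node at cluster startup
set -euo pipefail
# Embedding a personal access token in the URL is the simplest way to authenticate against a
# private repository; fetching it at runtime from AWS Secrets Manager or SSM Parameter Store is safer.
sudo python3 -m pip install "git+https://<TOKEN>@github.com/<org>/<private-repo>.git@<custom-branch>"
Upload the script to S3 and reference it when creating the cluster (for example, --bootstrap-actions Path=s3://<bucket>/install_private_lib.sh on aws emr create-cluster). Because every node runs it, the package is importable on the executors, and therefore usable from UDFs, not just from the notebook kernel.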