How to execute python commands from a conda .yaml specification file? - python-3.x

I am trying to list conda dependencies using a .yaml file for an AzureML environment. I do not want to use a custom docker image just for a few variations. I wonder if there is a way to instruct the build to run python commands using the .yaml file. Here are excerpts of what I have tried as of now:
name: classifer_environment
dependencies:
  - python=3.6.2
  - pip:
    - azureml-defaults>=1.0.45
    - nltk==3.4.5
    - spacy
  - command:
    - bash -c "python -m nltk.downloader stopwords"
    - bash -c "python -m spacy download en_core_web_sm"
I also tried this:
name: classifer_environment
dependencies:
  - python=3.6.2
  - pip:
    - azureml-defaults>=1.0.45
    - nltk==3.4.5
    - spacy
  - python:
    - nltk.downloader stopwords
    - spacy download en_core_web_sm
I am not very familiar with the YAML specification. The two specifications fail with the following messages, respectively, in the build logs:
"Unable to install package for command."
"Unable to install package for python."

This might be a neat feature to have, but for now it's not a thing - at least not directly in the YAML like this.
Instead, the unit of computation in Conda is the package. That is, if you need to run additional scripts or commands at environment creation, it can be achieved by building a custom package and including this package in the YAML as a dependency. The package itself could be pretty much empty, but whatever code one needs to run would be included via some installation scripts.
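By way of illustration, here is a sketch of such a "hook" package recipe. All names here are hypothetical; conda-build picks up a post-link.sh placed next to meta.yaml, and conda runs that script right after the package is linked into an environment:

```shell
# Hypothetical recipe for a near-empty package whose only job is to
# download NLTK/spaCy data at install time.
mkdir -p nltk-spacy-data-hook

cat > nltk-spacy-data-hook/meta.yaml <<'EOF'
package:
  name: nltk-spacy-data-hook
  version: "0.1"
requirements:
  run:
    - python
EOF

cat > nltk-spacy-data-hook/post-link.sh <<'EOF'
#!/bin/bash
# conda executes this script after linking the package; by convention,
# output should be appended to $PREFIX/.messages.txt, not stdout.
python -m nltk.downloader stopwords >> "$PREFIX/.messages.txt" 2>&1
python -m spacy download en_core_web_sm >> "$PREFIX/.messages.txt" 2>&1
EOF

# Then: conda build nltk-spacy-data-hook, upload the result to a channel
# (or a local channel), and list the package as a dependency in the
# AzureML environment YAML.
```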


How to install conda packages in the aarch64 from x86-64 architecture

May I know how I can get the installed packages from a particular environment on an x86-64 Linux machine, and
how I can create a new conda environment on aarch64 using the same packages?
First, on the x86-64 Linux machine (called L2), I export the packages:
conda list --export > envconda.txt
When I open envconda.txt, it contains:
# This file may be used to create an environment using:
# $ conda create --name <env> --file <this file>
# platform: linux-64
_libgcc_mutex=0.1=main
_r-mutex=1.0.0=anacondar_1
I changed platform: linux-64 to linux-aarch64 because I am going to install the packages on the aarch64 architecture.
# This file may be used to create an environment using:
# $ conda create --name <env> --file <this file>
# platform: linux-aarch64
_libgcc_mutex=0.1=main
_r-mutex=1.0.0=anacondar_1
On the aarch64 Linux machine (called L1), I create a conda environment:
conda create -n envtest --file envconda.txt
Collecting package metadata (current_repodata.json): done
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: failed
PackagesNotFoundError: The following packages are not available from current channels:
- setuptools==36.4.0=py36_1
- kiwisolver==1.1.0=pypi_0
- pyyaml==3.13=pypi_0
- jedi==0.10.2=py36_2
- libgcc==5.2.0=0
- jsonschema==2.6.0=py36_0
- ptyprocess==0.5.2=py36_0
- prompt_toolkit==1.0.15=py36_0
- libstdcxx-ng==9.1.0=hdf63c60_0
- tqdm==4.36.1=pypi_0
- tomli==1.2.3=pypi_0
- astor==0.7.1=pypi_0
- argparse==1.4.0=pypi_0
- pycparser==2.19=pypi_0
- testpath==0.3.1=py36_0
- cudnn==7.6.5=cuda10.2_0
- asn1crypto==0.22.0=py36_0
- dataclasses==0.8=pypi_0
- platformdirs==2.4.0=pypi_0
- krbcontext==0.10=pypi_07
- decorator==4.1.2=py36_0
- lazy-object-proxy==1.7.1=pypi_0
- gsl==2.2.1=0
- pexpect==4.2.1=py36_0
- icu==54.1=0
- freetype==2.5.5=2
- bleach==1.5.0=py36_0
- matplotlib==3.1.1=pypi_0
- wheel==0.29.0=py36_0
- cudatoolkit==10.2.89=hfd86e86_1
- glib==2.50.2=1
- kneed==0.7.0=pypi_0
- sqlite==3.13.0=0
- importlib-metadata==1.7.0=pypi_0
- python==3.6.2=0
- jpeg==9b=0
- pango==1.40.3=1
- fontconfig==2.12.1=3
- resampy==0.2.2=pypi_0
- nbformat==4.4.0=py36_0
- pixman==0.34.0=0
- scikit-learn==0.21.3=pypi_0
- termcolor==1.1.0=pypi_0
- typed-ast==1.5.4=pypi_0
- keras-applications==1.0.8=pypi_0
- harfbuzz==0.9.39=2
- libffi==3.2.1=1
- jupyter_client==5.1.0=py36_0
- gssapi==1.6.9=pypi_0
- curl==7.54.1=0
- keras==2.2.4=pypi_0
- isort==5.10.1=pypi_0
- simplegeneric==0.8.1=py36_1
- joblib==0.14.0=pypi_0
- pypandoc==1.6.3=pypi_0
- python-dateutil==2.8.2=pypi_0
- ipython_genutils==0.2.0=py36_0
- pyparsing==2.4.2=pypi_0
- ca-certificates==2022.6.15=ha878542_0
- krb5==1.13.2=0
- path.py==10.3.1=py36_0
- markdown==3.0.1=pypi_0
- requests-kerberos==0.12.0=pypi_0
- hdfs==2.5.8=pypi_0
- traitlets==4.3.2=py36_0
- tornado==4.5.2=py36_0
- librosa==0.7.0=pypi_0
- pyasn1==0.4.8=pypi_0
- blas==1.0=mkl
- zlib==1.2.11=0
- libogg==1.3.2=h14c3975_1001
- mkl==2017.0.3=0
- terminado==0.6=py36_0
- libflac==1.3.1=hf484d3e_1002
- python-levenshtein==0.12.2=pypi_0
- werkzeug==0.14.1=pypi_0
- pyspark==2.3.2=pypi_0
- urllib3==1.26.9=pypi_0
- bzip2==1.0.6=3
- html5lib==0.9999999=py36_0
- pywavelets==1.1.1=pypi_0
- zeromq==4.1.5=0
- pykerberos==1.2.1=pypi_0
Current channels:
- https://repo.anaconda.com/pkgs/main/linux-aarch64
- https://repo.anaconda.com/pkgs/main/noarch
- https://repo.anaconda.com/pkgs/r/linux-aarch64
- https://repo.anaconda.com/pkgs/r/noarch
- https://conda.anaconda.org/conda-forge/linux-aarch64
- https://conda.anaconda.org/conda-forge/noarch
To search for alternate channels that may provide the conda package you're
looking for, navigate to
https://anaconda.org
and use the search bar at the top of the page.
May I know how I can install the packages successfully on the aarch64 architecture?
Last but not least, when I install a package using pip install numpy, I get the error Illegal instruction (core dumped).
May I know how I can solve this as well on aarch64 Linux?
Very unlikely this will work for multiple reasons:
Package support outside the major platforms (osx-64, linux-64, win-64) is sparse, especially further back in time. A concrete example is cudatoolkit, which only has linux-aarch64 builds starting with version 11.
Overgeneralized environment. The more packages included in an environment, the more difficult it becomes to solve it, and solving across platforms aggravates this problem. I would, for example, remove any Jupyter-related packages completely. In the future, try to plan ahead to have dedicated environments to specific projects, and only install the packages that are absolutely required.
Some packages are completely incompatible. For example, mkl is architecture-specific.
Nevertheless, if you want to attempt recreating an approximation of the environment, there are some options. First, one cannot achieve this with conda list --export; that simply does not handle environments that have packages installed from PyPI.
PyPI-centric Approach
Because so much of the environment is from PyPI, my first inclination is to recommend abandoning the Conda components and going the pip route. That is, use
pip list --format=freeze > requirements.txt
to capture the Python packages, then create a new environment with something like:
environment.yaml
name: foo
channels:
  - conda-forge
  - nodefaults
dependencies:
  - python=3.6
  - pip
  - pip:
    - -r requirements.txt
With both requirements.txt and environment.yaml in the same folder, the environment is created with
## "foo" is arbitrary - pick something descriptive
conda env create -n foo -f environment.yaml
Retaining some Conda packages
You could also try keeping some parts from Conda by mixing together a conda env export and the previous pip list. Specifically, export a minimal environment definition, with
conda env export --from-history > environment.yaml
Edit this file to include a specific version of Python, remove any packages that are not available for linux-aarch64 (like mkl), and add the pip: section, as above:
environment.yaml
#...
dependencies:
  - python=3.6
  # ...
  - pip
  - pip:
    - -r requirements.txt
This is then used with:
conda env create -n foo -f environment.yaml
Expect to iterate several times to discover what cannot be found for the platform. I would strongly recommend using mamba instead of Conda in order to minimize this solving time.

Webjobs Running Error (3587fd: ERR ) from zipfile

I have the following small script in a file named fgh.py which I have been attempting to schedule as a webjob
import pandas as pd
df=pd.DataFrame({'a':[1,2,2],'b':[5,6,9]})
df['x']=df.a.sub(df.b)
print(df)
Using @Peter Pan's post, I have created a virtual environment and done a pip install pandas. From the virtual environment, the script runs and executes as required. However, it does not execute when loaded in Azure Webjobs. I suspect the issue arises from the interface between the run.bat file and the Azure Python console, but I have limited understanding of Azure to resolve it.
In Kudu, I have used this post to install Python.
Running where python in cmd at https://myapp.scm.azurewebsites.net/DebugConsole I get:
Additionally, from https://arcgistrial.scm.azurewebsites.net/DebugConsole I get the following when I run python -V:
In my run.bat file, I have tried to use either of the directories above, without success.
Whether I make my run.bat file D:\home\python364x64\python.exe fgh.py or D:\python364x64\python.exe fgh.py, I get the following error:
I have gone ahead and installed pandas, and checked whether it was successful by trying to install numpy.
All this has not helped. I have been on this for a couple of days and it has to work somehow. Any help?
(Things are not quite straightforward in old Webjobs when running a Python task with dependencies. It has been quite some time; the world has moved on to Azure Functions :))
However, since you still need to stick to Webjobs, below are the steps I followed which worked. I am using a batch file (.cmd) to run the Python script because of the prerequisites.
By default a webjob supports Python 2.7 at this moment. So, add Python 3 from 'Extensions' in your web app; in this case it was 3.6.4 x64 for me. It will be added at the path D:\home\python364x64\. How did I know? The Kudu console :)
Create a requirements.txt file which contains pandas and numpy (note I had to explicitly pin numpy to version 1.19.3 due to an issue with the latest 1.19.4 on Windows hosts at the time of this writing). Basically I used your fgh.py, which depends on pandas, which in turn depends on numpy.
pandas==1.1.4
numpy==1.19.3
Create a run.cmd file having the following content. Note the first line is not needed; I was just checking the Python version.
D:\home\python364x64\python --version
D:\home\python364x64\python -m pip install --user --upgrade pip
D:\home\python364x64\python -m pip install --user certifi
D:\home\python364x64\python -m pip install --user virtualenv
D:\home\python364x64\python -m virtualenv .venv
.venv\Scripts\pip install -r requirements.txt
.venv\Scripts\python fgh.py
Zip fgh.py, run.cmd and the requirements.txt file into a single zip. Below is the content of my zip.
Upload the zip for the webjob.
Run the job :)
Ignore the error "ModuleNotFoundError: No module named 'certifi'", not needed.
The key to solving the problem is that you need to create your venv environment on azure.
Step 1. Run successfully in local.
Step 2. Compress your webjob file.
Step 3. Upload webjob zip file.
Because the test environment already has python1 from before, I will create a webjob named python2 next.
Step 4. Log in kudu.
① cd ..
② find Python34, click it.
③ python -m venv D:\home\site\wwwroot\App_Data\jobs\continuous\python2\myenv
④ Find the myenv folder.
⑤ Activate myenv: input .\activate.bat.
D:\home\site\wwwroot\App_Data\jobs\continuous\python2\myenv\Scripts>.\activate.bat
⑥ Go back to the python2 folder, and input pip install pandas.
⑦ Then input python aa.py.

tox does not run sphinx

I want to run the following command from tox.
python setup.py build_sphinx -b html
I have configured my setup.py to build the docs when I run the above command from the console (I checked that, from the console, this command builds the docs).
Then I have edited my tox.ini as follows:
.....
[testenv:sphinx]
command = python setup.py build_sphinx -b html
setup.cfg as follows:
[build_sphinx]
project = project_name
source-dir = module_name/doc
build-dir = module_name/doc/build
But when I run tox -e sphinx, tox exits with a successful message, yet no docs are generated.
Can somebody help me with this?
I'd not recommend using setuptools to build documentation. Consider instead invoking sphinx directly, as tox itself does at https://github.com/tox-dev/tox/blob/master/tox.ini#L48-L53. Alternatively, please post the exact output of the run with -vvv, or make the project publicly available for us to try it too.
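As a side note, tox's ini key is commands (plural); a command = line is not a recognized setting, so the sphinx env had nothing to run, which would explain the "successful" run that produced no docs. A minimal sketch of an env that calls sphinx directly (the doc paths here are taken from the setup.cfg above and may need adjusting):

```ini
[testenv:sphinx]
deps =
    sphinx
commands =
    sphinx-build -b html module_name/doc module_name/doc/build/html
```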

Export / import conda environment and package including local files

I want to make my analysis reproducible and want to use conda to make sure, specific software of a specific version is used. To do so, I set up an environment including some programs built from local source and scripts exporting some environment variables.
I exported the environment and built a package from the local files (basically following a procedure described here, post #2: >link<):
conda env export > myenv.yml
conda package --pkg-name myenv --pkg-version 0.1 --pkg-build 1
On a different machine, I imported the environment without problems using
conda env create -f myenv.yml
source activate myenv
However, I got some trouble when trying to install the package:
conda install myenv-0.1-1.tar.bz2
ERROR conda.core.link:_execute_actions(337): An error occurred while installing package '<unknown>::myenv-0.1-1'.
FileNotFoundError(2, 'No such file or directory')
Attempting to roll back.
FileNotFoundError(2, 'No such file or directory')
So I read a bit about channels and tried setting up a local channel with the package:
mkdir -p own_pkg/linux-64
mkdir -p own_pkg/noarch
mv myenv-0.1-1.tar.bz2 own_pkg/linux-64/
conda index own_pkg/linux-64 own_pkg/noarch
updating: myenv-0.1-1.tar.bz2
I added the following to ~/.condarc
channels:
  - defaults
  - file://my/path/to/own_pkg
And then tried again to install but still:
conda install myenv
Fetching package metadata .............
PackageNotFoundError: Packages missing in current channels:
- myenv
We have searched for the packages in the following channels:
- https://repo.continuum.io/pkgs/main/linux-64
- https://repo.continuum.io/pkgs/main/noarch
- https://repo.continuum.io/pkgs/free/linux-64
- https://repo.continuum.io/pkgs/free/noarch
- https://repo.continuum.io/pkgs/r/linux-64
- https://repo.continuum.io/pkgs/r/noarch
- https://repo.continuum.io/pkgs/pro/linux-64
- https://repo.continuum.io/pkgs/pro/noarch
- file://my/path/to/own_pkg/linux-64
- file://my/path/to/own_pkg/noarch
This is even though the files .index.json, repodata.json, etc. exist in /my/path/to/own_pkg/linux-64, the package is named in them, and the tar.bz2 file is referenced there.
Can someone explain to me what I am doing wrong and/or what the appropriate workflow is to achieve my goal?
Thanks!
More information:
Source machine:
Linux Ubuntu 16.04
conda version 4.4.7
conda-build version 3.2.1
Target machine:
Scientific Linux 7.4
conda version 4.3.29
conda-build version 3.0.27

Bash and Conda: Installing non-conda packages in conda environment with executable bash script

I am writing a bash script with the objective of hosting it on a computing cluster. I want the script to create a conda environment for whichever user executes it, so that everyone on our team can quickly set up the same working environment.
I realize this is a bit overkill for the number of commands necessary but I wanted to practice some bash scripting. Here is my script so far:
#!/bin/bash
# Load anaconda
module load Anaconda/4.2.0
# Create environment
conda create -n ADNI
# Load environment
source activate ADNI
# Install image processing software
pip install med2image
echo 'A working environment named ADNI has been created.'
echo 'Please run `source activate ADNI` to work in it.'
This script creates the environment successfully. However, once I load the environment after running the script, I run conda list to see which packages are loaded within it and get the following output:
(ADNI) MLG-BH0039:ADNI_DeepLearning johnca$ conda list
# packages in environment at /Users/johnca/miniconda3/envs/ADNI:
#
(ADNI) MLG-BH0039:ADNI_DeepLearning johnca$
This gives me the impression that the environment has no packages loaded in it. Is this correct? If so, how can I alter the script so that the desired packages successfully install into the specified environment?
Thanks!
I managed to find a better way to automate this process by creating an environment.yml file with all the desired packages. This can include pip packages as well. My file looks like this:
name: ADNI
channels:
- soumith
- defaults
dependencies:
- ca-certificates=2017.08.26=h1d4fec5_0
- certifi=2017.11.5=py36hf29ccca_0
- cffi=1.11.2=py36h2825082_0
- freetype=2.8=hab7d2ae_1
- intel-openmp=2018.0.0=hc7b2577_8
- jpeg=9b=h024ee3a_2
- libffi=3.2.1=hd88cf55_4
- libgcc=7.2.0=h69d50b8_2
- libgcc-ng=7.2.0=h7cc24e2_2
- libgfortran-ng=7.2.0=h9f7466a_2
- libpng=1.6.32=hbd3595f_4
- libstdcxx-ng=7.2.0=h7a57d05_2
- libtiff=4.0.9=h28f6b97_0
- mkl=2018.0.1=h19d6760_4
- numpy=1.13.3=py36ha12f23b_0
- olefile=0.44=py36h79f9f78_0
- openssl=1.0.2n=hb7f436b_0
- pillow=4.2.1=py36h9119f52_0
- pip=9.0.1=py36h6c6f9ce_4
- pycparser=2.18=py36hf9f622e_1
- python=3.6.0=0
- readline=6.2=2
- scipy=1.0.0=py36hbf646e7_0
- setuptools=36.5.0=py36he42e2e1_0
- six=1.11.0=py36h372c433_1
- sqlite=3.13.0=0
- tk=8.5.18=0
- wheel=0.30.0=py36hfd4bba0_1
- xz=5.2.3=h55aa19d_2
- zlib=1.2.11=ha838bed_2
- pytorch=0.2.0=py36hf0d2509_4cu75
- torchvision=0.1.9=py36h7584368_1
- pip:
  - cycler==0.10.0
I can then automate creating the environment by referencing this file, as in:
#!/bin/bash
# Load anaconda
module load Anaconda/4.2.0
# Create environment
conda env create -f adni_env.yml
echo ' '
echo 'A working environment named ADNI has been created or updated.'
echo 'If working on the cadillac server please `module load Anaconda/4.2.0`.'
echo 'Then run `source activate ADNI` to work within the environment.'
echo ' '
I hope this can help anyone in the future who may have similar issues.
The command
conda create -n ADNI
creates an environment with no packages installed, not even Python or pip. Therefore, despite activating the environment, you are still using some other pip that appears on your PATH. You need to install pip or Python into the environment, either when the environment is created or afterwards with the conda install command:
conda create -n ADNI python=3.6
will install Python (which brings pip along) when the environment is created, or
conda create -n ADNI
conda install -n ADNI python=3.6
will install Python afterwards.
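A quick, generic way to check which interpreter and pip your shell actually resolves (this is an illustrative sketch, not specific to the cluster above):

```python
import shutil
import sys

# If the paths below do not live inside .../envs/ADNI, then "pip install"
# is installing into some other environment, not the one you activated.
print("resolved python:  ", shutil.which("python"))
print("resolved pip:     ", shutil.which("pip"))
print("this interpreter: ", sys.executable)
```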
In the best case, you would use conda to install that package. It isn't all that difficult to create a conda package from a pip package and upload it to a channel on Anaconda.org so your team can access it.
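As a sketch of that workflow (the package name is only an example): conda skeleton pypi generates a recipe from the PyPI metadata, conda build builds it, and anaconda upload pushes it to your Anaconda.org channel. The commands are guarded here because they require conda-build and anaconda-client to be installed:

```shell
# Guard: only attempt this if conda is actually on the PATH.
if command -v conda >/dev/null 2>&1; then
    conda skeleton pypi med2image   # generate a recipe directory ./med2image
    conda build med2image           # build the conda package
    # anaconda upload <path-to-conda-bld>/linux-64/med2image-*.tar.bz2
else
    echo "conda not found; skipping"
fi
```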
