Running Ludwig on AML Compute: docker image failing to build on gmpy - azure-machine-learning-service

I'm currently trying to create a TensorFlow estimator to run Ludwig's training model on Azure ML Compute with various pip and conda packages like so:
estimator = TensorFlow(source_directory=project_folder,
                       compute_target=compute_target,
                       script_params=script_params,
                       entry_script='./train.py',
                       pip_packages=dependencies,
                       conda_packages=["tensorflow"],
                       use_gpu=True)
One of the pip packages is gmpy, but it will not install and throws an error: fatal error: gmp.h: No such file or directory compilation terminated. error: command 'gcc' failed with exit status 1.
This prevents Ludwig from installing and causes the image to fail to build.
When I run Ludwig locally in a python virtual environment on Ubuntu, I'm able to work around this issue by running “sudo apt-get install libgmp3-dev” instead of pip install gmpy. When I try adding Gmpy2 as a library to the estimator, it throws the same error, and it seems that libgmp3-dev doesn't have a pip or conda equivalent. I tried adding the gmpy and gmpy2 .whl files directly to the environment but the wheel files were not recognized as compatible.
Is there some way to add RUN sudo apt-get install libgmp3-dev to the dockerfile so that the docker container made by the estimator has this already installed without needing to create a custom dockerfile? I noticed that the TensorFlow estimator class has an "environment_definition" flag that can take a DockerSection but I can't find any examples of how they work.

It looks like gmpy2 is available on the conda-forge channel:
https://anaconda.org/conda-forge/gmpy2
You should also be able to reference the dependency from git.
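For the Dockerfile route raised in the question, one possible sketch is to generate the Dockerfile text in Python and pass it to the environment definition. The helper name dockerfile_with_apt_packages and the base image tag are hypothetical placeholders. The commented-out usage assumes the Azure ML v1 SDK, where an Environment's docker section accepts a base_dockerfile string; verify that against the SDK version you have installed. Inside a Docker build, commands typically run as root, so sudo is usually not needed.

```python
def dockerfile_with_apt_packages(base_image, apt_packages):
    """Return Dockerfile text that installs system packages on top of base_image."""
    packages = " ".join(apt_packages)
    return (
        f"FROM {base_image}\n"
        "RUN apt-get update && \\\n"
        f"    apt-get install -y --no-install-recommends {packages} && \\\n"
        "    rm -rf /var/lib/apt/lists/*\n"
    )


# "example.invalid/base:latest" is a placeholder, not a real image.
dockerfile = dockerfile_with_apt_packages("example.invalid/base:latest", ["libgmp3-dev"])
print(dockerfile)

# Assumed usage with the Azure ML v1 SDK (not executed here):
# from azureml.core import Environment
# env = Environment(name="ludwig-env")      # hypothetical environment name
# env.docker.base_image = None
# env.docker.base_dockerfile = dockerfile
```

With libgmp3-dev present in the image, the gmp.h header that the gmpy build was missing should be available to gcc.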

Related

Unable to install tokenizers in Mac M1

I installed transformers on a MacBook Pro M1 Max.
Following this, I installed tokenizers with
pip install tokenizers
It showed
Collecting tokenizers
Using cached tokenizers-0.12.1-cp39-cp39-macosx_12_0_arm64.whl
Successfully installed tokenizers-0.12.1
It seems to use the correct architecture for the whl file.
When I import it, I get:
'/Users/myname/miniforge3/envs/tf/lib/python3.9/site-packages/tokenizers/tokenizers.cpython-39-darwin.so' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64e'))
I see that this problem used to happen to others before. Any thoughts on how to fix this?
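The error says the compiled extension and the running interpreter were built for different architectures. As a first diagnostic, a minimal sketch like the following reports which architecture the current Python interpreter was started as. The function name describe_arch is only illustrative.

```python
import platform


def describe_arch(machine):
    """Classify a platform.machine() value on an Apple Silicon Mac."""
    if machine == "arm64":
        return "native arm64 interpreter"
    if machine == "x86_64":
        return "x86_64 interpreter (possibly running under Rosetta)"
    return f"unrecognized architecture: {machine}"


# Report the architecture of the interpreter running this script.
print(describe_arch(platform.machine()))
```

If the interpreter reports arm64 while the .so file is x86_64, the package was most likely built or cached from an x86_64 context and needs to be reinstalled for arm64.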
James Briggs' method works but produces the following error:
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for tokenizers
Failed to build tokenizers
ERROR: Could not build wheels for tokenizers, which is required to install pyproject.toml-based projects
The Issue
After installing Rust and Cargo, we must source the environment file. This is the missing step in the previous answer.
The Solution
The workaround is to run the following in the terminal, right after installing Rust:
source "$HOME/.cargo/env"
Then, you can install transformers with the following code snippet:
pip install transformers
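Sourcing the environment file only affects the current shell, so it can help to confirm that the Rust tools are actually visible on PATH before running pip. A small sketch using the standard library (the helper name missing_tools is made up for this example):

```python
import shutil


def missing_tools(tools):
    """Return the tools from the list that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools(["rustc", "cargo"])
if missing:
    print("Not on PATH:", ", ".join(missing))
else:
    print("rustc and cargo are both on PATH")
```

Run it from the same terminal session in which you intend to run pip, since a new terminal may not have the Cargo directory on PATH yet.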
If using Anaconda, switch to a terminal window and create a new ARM environment like so:
CONDA_SUBDIR=osx-arm64 conda create -n ml python=3.9 -c conda-forge
Now activate the ml environment:
conda activate ml
Inside the environment, run:
conda env config vars set CONDA_SUBDIR=osx-arm64
The environment must be restarted for this to take effect, so deactivate it:
conda deactivate
Then activate it again:
conda activate ml
PyTorch Installation
To get started we need to install PyTorch v1.12. For now, this is only available as a nightly release.
pip3 install -U --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu
Side note: The transformers library uses tokenizers built in Rust (it makes them faster). Because we are using this new ARM64 environment we may get ERROR: Failed building wheel for tokenizers. If so, we install Rust (in the same environment) with:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
Restart your environment:
conda deactivate
conda activate ml
Then you can install transformers, which comes with tokenizers, or install only tokenizers:
pip install tokenizers or pip install transformers
Thanks to James Briggs.
You can try
conda install -c huggingface transformers
I got this error too. Solved it after a lot of trial & error.
The problem: my brew was still running under Rosetta. I fixed that by uninstalling, cleaning up, and reinstalling it. Everything seemed to run fine, except that this problem still kept cropping up,
until I discovered that pip is quite aggressive in caching: it caches the build even if the architecture has changed. Solution: pip cache purge. Or remove the whole cache directory, which you can find with pip cache info.
After testing most of the solutions provided I finally got it working by doing
brew install ffmpeg
sudo pip install tokenizers
🛠️🚀

Docker Ubuntu 20.04 - Cannot uninstall 'terminado' and problems with pip

I'm trying to reproduce the Docker installation from the book "Mining the Social Web" (Russell/Klassen) on Ubuntu 20.04. I set up Docker and tried to create the Docker container directly from the repository (repo2docker https://github.com/mikhailklassen/Mining-the-Social-Web-3rd-Edition) to open the Jupyter Notebook, but I got errors. Beforehand, I installed Python3 and pip3 (I couldn't install just Python and pip).
I got this warning multiple times in the output:
WARNING: pip is being invoked by an old script wrapper. This will fail in a future version of pip.
Please see https://github.com/pypa/pip/issues/5599 for advice on fixing the underlying issue.
To avoid this problem you can invoke Python with '-m pip' instead of running pip directly.
(I could not fix the problem using the advice at that link.)
And this error appeared at the end of the output:
ERROR: Cannot uninstall 'terminado'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall.
Removing intermediate container 71cfe8e913dd
The command '/bin/sh -c ${KERNEL_PYTHON_PREFIX}/bin/pip install --no-cache-dir -r "binder/requirements.txt"' returned a non-zero code:1
Maybe somebody can help me? thx a lot!
The solution to your issue: don't use Docker, because pip won't be able to uninstall the terminado package, which is a distutils-installed project. Use the approach below instead.
I work in virtual environments and would recommend you do the same.
Clone the repo.
Navigate to /binder.
Execute pip install --ignore-installed -r requirements.txt
Navigate to /notebooks.
Execute jupyter notebook
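The install step above can also be run through python -m pip, which is what the pip warning in the question recommends instead of the old script wrapper. A minimal sketch that builds that command for the current interpreter; the function name and the relative path to requirements.txt are assumptions for illustration:

```python
import sys


def ignore_installed_command(requirements_path):
    """Build a 'python -m pip install --ignore-installed -r <file>' command list."""
    return [
        sys.executable, "-m", "pip", "install",
        "--ignore-installed", "-r", requirements_path,
    ]


cmd = ignore_installed_command("binder/requirements.txt")
print(" ".join(cmd))

# To actually run it inside an activated virtual environment:
# import subprocess
# subprocess.run(cmd, check=True)
```

Using sys.executable ties pip to the interpreter of the active virtual environment, so the packages land in that environment rather than in the system Python.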
From the answer: https://stackoverflow.com/a/67134670/1290868
For me, it was this line in requirements.txt:
...
jupyterlab>=1.0
....
I removed the version part (">=1.0") and it worked:
...
jupyterlab
....
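If you would rather script that edit than do it by hand, a rough sketch is shown below. The helper name unpin is hypothetical, and the other requirement lines in the example are made up; only the jupyterlab>=1.0 line comes from the answer above.

```python
import re


def unpin(requirements_text, package):
    """Drop the version specifier from one package, leaving other lines untouched."""
    pattern = re.compile(
        rf"^({re.escape(package)})\s*[<>=!~].*$", re.IGNORECASE
    )
    lines = requirements_text.splitlines()
    return "\n".join(pattern.sub(r"\1", line) for line in lines)


# Example content; only the jupyterlab line reflects the real file.
example = "numpy\njupyterlab>=1.0\ntwython==3.7.0"
print(unpin(example, "jupyterlab"))
```

Only a line that begins with the exact package name followed by a version operator is rewritten, so differently named packages such as jupyterlab-server would be left alone.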
I was trying to update Jupyter Notebook / JupyterLab and had issues with terminado; I'm not sure what it is. Even after uninstalling it, it was still available (not sure why).
Hence, I uninstalled JupyterLab and installed it again.

installation of assimulo and sundials - error

I want to use Assimulo and Sundials for the solution of differential algebraic equations in Python and therefore I am trying to install it on Ubuntu.
For the installation of Sundials, I followed the installation instructions and as I understand it worked well.
% cmake -DCMAKE_INSTALL_PREFIX=/usr/local/lib/sundials-3.1.1/ ~/opt/sundials/sundials-3.1.1
% make
% make install
Then I tried to install Assimulo with the command pip3 install Assimulo, but I get an error message. I also tried to follow the instructions on Installation - Assimulo 3.0 documentation by downloading the installation files and install it with the following command. It results in the same error message.
sudo python3 setup.py install --sundials-home=/usr/local/lib/sundials-3.1.1
This is the error message I get:
target build/src.linux-x86_64-3.6/assimulo/thirdparty/hairer/dopri5module.c does not exist:
Assuming dopri5module.c was generated with "build_src --inplace" command.
error: 'assimulo/thirdparty/hairer/dopri5module.c' missing
What is wrong and how can I fix it? Any help would be appreciated!
I got the same error when installing on macOS via pip install assimulo, after pip-installing numpy and cython.
For me, using a conda env did the trick:
Creating the conda env: conda create -n your_name_goes_here
conda activate your_name_goes_here
conda install python=3.6 (I noticed you can also use 3.7)
conda install -c conda-forge assimulo
I also had the same error message. As suggested in the other answer, you can get a compiled package from Conda. But if you want to compile from source yourself, it looks to me as though the PyPI source tarball doesn't contain all the needed files. At least some *.pyf files are missing. So, I used the SVN repo instead:
svn checkout https://svn.jmodelica.org/assimulo/tags/Assimulo-3.0/ assimulo
By compiling this source tree, I managed to get past the original error you had, but I'm now hitting another build error that I don't yet know how to solve:
assimulo/solvers/sundials.c: In function '__pyx_f_8assimulo_7solvers_8sundials_5CVode_initialize_cvode':
assimulo/solvers/sundials.c:33274:31: error: too many arguments to function 'CVodeCreate'
__pyx_v_self->cvode_mem = CVodeCreate(__pyx_t_3, __pyx_t_4);

Unable to install spacy on AWS Sagemaker

I'm trying to load spacy into SageMaker. I run the following in Jupyter notebook instance
!pip install spacy
I end up getting this error
gcc: error trying to exec 'cc1plus': execvp: No such file or directory
error: command 'gcc' failed with exit status 1
and this as well
gcc: error: murmurhash/mrmr.cpp: No such file or directory
error: command 'gcc' failed with exit status 1
How can I resolve this issue within SageMaker?
I was experiencing similar problems when I started using SageMaker, so I developed this open source project: https://github.com/Kenza-AI/sagify (sagify). It's a CLI tool that can help you train and deploy your own Machine Learning/Deep Learning models on SageMaker in a very easy way. I managed to train and deploy all of my ML models whatever library I was using (Keras, Tensorflow, scikit-learn, LightFM, spacy, etc). Essentially, you can specify all your dependencies in the classic pythonic way, i.e. in a requirements.txt, and sagify will read them and install them on a Docker image. Then, this Docker image can be executed on SageMaker for training and deployment.
From https://stackoverflow.com/a/38733918/3276830
Fix gcc
sudo apt-get update
sudo apt-get install --reinstall build-essential
I am not sure about the second error; maybe murmurhash/mrmr.cpp does not exist?
You can try the following commands to install spacy from a Jupyter cell with the Python3 kernel selected:
!conda update --all -y
!conda install -n python3 -y -c conda-forge spacy
and then restart the kernel.
After restarting the kernel you should be able to import spacy. Alternatively, you can issue the same commands from the Jupyter terminal; just remove the ! mark.
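To confirm the result after the kernel restart without triggering a long traceback, a small check like this can be run in a notebook cell. It is a sketch using only the standard library; the helper name is_importable is illustrative.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module can be located by the current kernel."""
    return importlib.util.find_spec(module_name) is not None


print("spacy importable:", is_importable("spacy"))
```

If this prints False after the conda install, the package was probably installed into a different environment than the one the notebook kernel is using.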

Python 3.5.2 Windows x86-64 web-based, but installer not installing pip

I am trying to install TensorFlow. The installation instructions for Windows (https://www.tensorflow.org/install/install_windows) list installing Python 3.5.2 as the first step. I'm following the 'TensorFlow with CPU support only' path.
Python was successfully installed in my computer as I can run it via the Start menu.
However, when I try to do the 2nd step of the installation instructions in order to install TensorFlow, this step says to:
To install TensorFlow, start a terminal. Then issue the appropriate pip3 install command in that terminal. To install the CPU-only version of TensorFlow, enter the following command:
C:\> pip3 install --upgrade tensorflow
But I get an error when I run the above command:
'pip' is not recognized as an internal or external command, operable program or batch file.
I looked at several postings in StackOverflow and tried the commands provided in one of the postings, but I would get the same type of error.
So, how is 'pip3' installed? From what I read, it is supposed to be installed together with Python, but obviously that did not happen.
How do I install it? I need to install TensorFlow and it seems that it needs to be done via the pip3 installation tool.
Thank you in advance for your help!
Either set the system environment path variable to include the python 3.5.x path in it, or just cd into the correct python folder to run pip3 from there.
The folder in windows 10 should be something like this:
C:\Users\YOUR_USERNAME\AppData\Local\Programs\Python\Python35\Scripts
Open the terminal, cd to that path (change YOUR_USERNAME to the correct user) and then just run the following command:
pip3 install --upgrade tensorflow
and if you want the gpu version:
pip3 install --upgrade tensorflow-gpu
Pip3 is already installed when you install Python, so there is no need to do anything else.
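Rather than guessing the folder, you can ask Python itself where its Scripts directory is and whether that directory is on PATH. The following is a minimal sketch using the standard library; the helper name dir_on_path is made up for this example.

```python
import os
import sysconfig


def dir_on_path(directory, path_value):
    """Return True if directory appears in a PATH-style string."""
    wanted = os.path.normcase(os.path.normpath(directory))
    entries = [
        os.path.normcase(os.path.normpath(entry))
        for entry in path_value.split(os.pathsep)
        if entry
    ]
    return wanted in entries


# The directory where this interpreter installs pip, pip3, and other scripts.
scripts_dir = sysconfig.get_path("scripts")
print("Scripts directory:", scripts_dir)
print("On PATH:", dir_on_path(scripts_dir, os.environ.get("PATH", "")))
```

If the directory is not on PATH, running python -m pip install --upgrade tensorflow avoids the problem as long as python itself can be started from the terminal.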
