Dockerfiles build performance issues - python-3.x

The following Dockerfiles takes more than 30 minutes to build.
FROM python:3.7-alpine
COPY . /app
WORKDIR /app
RUN apk add --no-cache python3-dev libstdc++ && \
apk add --no-cache g++ && \
ln -s /usr/include/locale.h /usr/include/xlocale.h && \
pip install --upgrade pip && \
pip3 install --upgrade pip && \
pip3 install allure-behave && \
pip3 install -r requirements.txt
entrypoint ["sh", "testsuite.sh"]
Requirement file:
behave==1.2.6
boto3==1.8.2
botocore==1.11.9
pandas==0.25.0
Is that normal?

Related

How to install gifsicle in aws lambda using docker image

this is my dockerfile:
FROM public.ecr.aws/lambda/python:3.8-arm64
COPY requirements.txt ./
RUN yum update -y && \
yum install -y gifsicle && \
pip install -r requirements.txt
COPY . .
CMD ["app.handler"]
I'm getting the following error:
#8 200.6 No package gifsicle available.
#8 200.7 Error: Nothing to do
I finally managed to do it by building the package from source. My Dockerfile:
FROM public.ecr.aws/lambda/python:3.8-arm64
RUN yum -y install install make gcc wget gzip
RUN wget https://www.lcdf.org/gifsicle/gifsicle-1.93.tar.gz
RUN tar -xzf gifsicle-1.93.tar.gz
RUN cd gifsicle-1.93 && \
./configure && \
make && \
make install
COPY requirements.txt ./
RUN yum update -y && \
pip install -r requirements.txt
COPY . .
CMD ["app.handler"]

install PyTorch CPU-only in Dockerfile

I am fairly new to Docker and containerisation. I am wanting to decrease the size of my_proj docker container in production.
I prefer installing packages and managing dependencies via Poetry.
How can I specify using CPU-only PyTorch in a Dockerfile?
To do this via. bash terminal, it would be:
poetry add pytorch-cpu torchvision-cpu -c pytorch
(or conda install...)
My existing Dockerfile:
FROM python:3.7-slim as base
RUN apt-get update -y \
&& apt-get -y --no-install-recommends install curl wget\
&& rm -rf /var/lib/apt/lists/*
ENV ROOT /home/worker/python/my_proj
WORKDIR $ROOT
ARG ATLASSIAN_TOKEN
ARG POETRY_HTTP_BASIC_AZURE_PASSWORD
ARG ACCESS_KEY
ENV AWS_ACCESS_KEY_ID=$ACCESS_KEY
ARG SECRET_KEY
ENV AWS_SECRET_ACCESS_KEY=$SECRET_KEY
ARG REPO
ENV REPO_URL=$REPO
ENV PYPIRC_PATH=$ROOT/.pypirc
ENV \
PYTHONFAULTHANDLER=1 \
POETRY_VERSION=1.1.4 \
POETRY_HOME=/etc/poetry \
XDG_CACHE_HOME=/home/worker/.cache \
POETRY_VIRTUALENVS_IN_PROJECT=true \
MPLCONFIGDIR=/home/worker/matplotlib \
PATH=/home/worker/python/my_proj/.venv/bin:/usr/local/bin:/etc/poetry/bin:$PATH
ADD https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py ./
RUN python get-poetry.py && chmod +x /etc/poetry/bin/poetry
RUN --mount=type=cache,target=/root/.cache pip install twine keyring artifacts-keyring
RUN --mount=type=cache,target=/root/.cache apt update && apt install gcc -y
FROM base as ws
ARG WS_APIKEY
ARG WS_PROJECTVERSION=
ARG WS_PROJECTNAME=workers-python-my_proj
ARG WS_PRODUCTNAME=HALO
COPY --chown=worker:worker . .
RUN --mount=type=cache,uid=1000,target=/home/worker/.cache poetry install --no-dev
COPY --from=openjdk:15-slim-buster /usr/local/openjdk-15 /usr/local/openjdk-15
ENV JAVA_HOME /usr/local/openjdk-15
ENV PATH $JAVA_HOME/bin:$PATH
RUN --mount=type=cache,uid=1000,target=/home/worker/.cache ./wss_agent.sh
FROM base as test
COPY . .
RUN poetry config experimental.new-installer false
RUN poetry install
RUN cd my_proj && poetry run invoke deployconfluence_server_pass=$ATLASSIAN_TOKEN
FROM base as package
COPY . .
RUN poetry build
RUN python -m pip install --upgrade pip && \
pip install twine keyring artifacts-keyring && \
twine upload -r $REPO_URL --config-file $PYPIRC_PATH dist/* --skip-existing
FROM base as build
COPY . .
RUN poetry config experimental.new-installer false
RUN poetry install --no-dev
RUN pip3 --no-cache-dir install --upgrade awscli
RUN aws s3 cp s3://....tar.gz $ROOT/my_proj # censored url
RUN mkdir $ROOT/my_proj/bert-base-cased && cd $ROOT/my_proj/bert-base-cased && \
wget https://huggingface.co/bert-base-cased/resolve/main/config.json && \
wget https://huggingface.co/bert-base-cased/resolve/main/tokenizer.json && \
wget https://huggingface.co/bert-base-cased/resolve/main/tokenizer_config.json
FROM python:3.7-slim as production
ENV ROOT=/home/worker/python/my_proj \
VIRTUAL_ENV=/home/worker/python/my_proj/.venv\
PATH=/home/worker/python/my_proj/.venv/bin:/home/worker/python/my_proj:$PATH
COPY --from=build /home/worker/python/my_proj/pyproject.toml /home/worker/python/
COPY --from=build /home/worker/python/my_proj/.venv /home/worker/python/my_proj/.venv
COPY --from=build /home/worker/python/my_proj/my_proj /home/worker/python/my_proj
WORKDIR $ROOT
ENV PYTHONPATH=$ROOT:/home/worker/python/
ENTRYPOINT [ "primary_worker", "--mongo" ]
Installing it via pip should work:
RUN pip3 install torch==1.9.0+cpu torchvision==0.10.0+cpu torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html

How can I install my python code in a docker container?

I have a python 3.6 app that I would like to install on a docker container. I installed the app in a virtual env (miniconda3) on my pc (windows 10) and need to do the same in a docker container (python3.6 base image (linux)). I installed the app locally using a setup.py file using python setup.py install, and an egg file as well as a separate directory with the application's code were created in site-packages. When I tried to do the same in the docker container, the installation creates only an egg file in site packages, and the app can't be imported. The other installed packages are fine.
Dockerfile:
FROM python:3.6
WORKDIR /opt
# create a virtual environment and add it to PATH so that it is
applied for all future RUN and CMD calls
ENV VIRTUAL_ENV=/opt/venv
RUN python3 -m venv $VIRTUAL_ENV
ENV PATH="$VIRTUAL_ENV/bin:$PATH"
# Install Mono for pythonnet.
RUN apt-get update \
&& apt-get install --yes \
apt-transport-https \
git \
dirmngr \
clang \
gnupg \
ca-certificates \
# Dependency for pyodbc.
unixodbc-dev \
&& apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-
keys 3FA7E0328081BFF6A14DA29AA6A19B38D3D831EF \
&& echo "deb http://download.mono-project.com/repo/debian
stretch/snapshots/5.20 main" | tee /etc/apt/sources.list.d/mono-
official-stable.list \
&& apt-get update \
&& apt-get install --yes \
&& rm -rf /var/lib/apt/lists/*
COPY requirements.txt .
COPY src ./src
COPY setup.py ./setup.py
COPY config.json ./config.json
COPY BCUtility.dll ./BCUtility.dll
COPY settings.ini ./settings.ini
COPY redis_config.json ./redis_config.json
COPY sql_config.json ./sql_config.json
RUN python3 -m venv $VIRTUAL_ENV \
# From here on, use virtual env's python.
&& venv/bin/pip install --upgrade pip \
&& venv/bin/pip install --no-cache-dir --upgrade pip setuptools wheel \
&& venv/bin/pip install --no-cache-dir -r requirements.txt \
# Dependency for pythonnet.
&& venv/bin/pip install --no-cache-dir pycparser \
&& venv/bin/pip install -U --no-cache-dir "pythonnet==2.5.1" \
# install the app
&& venv/bin/python setup.py install
cmd /opt/venv/bin/python src/app.py
Any idea where I'm going wrong?

Setting specific python version in docker file with specfic non-python base image

I want to create a docker image with specifically python 3.5 on a specific base image which is the nvidia/cuda (9.0-base image) the latter has no python environment.
The reason I need specific versions is to support running cuda10.0 python3.5 and a gcc version<7 to compile the driver all together on the same box
When I try and build the docker environments (see below) I always end up with the system update files which load python3.6
The first version I run (below) runs a system update dependencies which installs python 3.6 I have tried many variants to avoid this but always end up 3.6 in the final image.
Any suggestions for getting this running with python3.5 are welcome
Thanks
FROM nvidia/cuda
RUN apt-get update && apt-get install -y libsm6 libxext6 libxrender-dev python3.5 python3-pip
COPY . /app
WORKDIR /app
RUN pip3 install -r requirements.txt
ENTRYPOINT [ "python3" ]
CMD [ "app.py" ]
Another variant (below) I have tried is with virtualenv and here again I can't seem to force a python 3.5 environment
FROM nvidia/cuda
RUN apt-get update && apt-get install -y --no-install-recommends libsm6 libxext6 libxrender-dev python3.5 python3-pip python3-virtualenv
ENV VIRTUAL_ENV=/opt/venv
RUN python3 -m virtualenv --python=/usr/bin/python3 $VIRTUAL_ENV
ENV PATH="$VIRTUAL_ENV/bin:$PATH"
COPY . /app
WORKDIR /app
RUN pip3 install -r requirements.txt
ENTRYPOINT [ "python3" ]
CMD [ "app.py" ]
You can try using conda. I used several stages to minimize final container and to speedup/cache local builds.
# first stage
FROM nvidia/cuda:11.1-base-ubuntu18.04 as builder
RUN apt-get update && apt-get install -y curl wget gcc build-essential
# install conda
RUN wget --quiet https://repo.anaconda.com/miniconda/Miniconda3-4.5.12-Linux-x86_64.sh -O ~/miniconda.sh && \
/bin/bash ~/miniconda.sh -b -p /opt/conda
# create env with python 3.5
RUN /opt/conda/bin/conda create -y -n myenv python=3.5
# install requirements
WORKDIR /app
COPY requirements.txt /app
ENV PATH=/opt/conda/envs/myenv/bin:$PATH
RUN pip install -r requirements.txt
RUN pip uninstall -y pip
####################
# second stage (note: FROM container must be the same as builder)
FROM nvidia/cuda:11.1-base-ubuntu18.04 as runner
# copy environment data including python
COPY --from=builder /opt/conda/envs/myenv/bin /opt/conda/envs/myenv/bin
COPY --from=builder /opt/conda/envs/myenv/lib /opt/conda/envs/myenv/lib
# do some env settings
ENV PATH=/opt/conda/envs/myenv/bin:$PATH
ENV LC_ALL=C.UTF-8
ENV LANG=C.UTF-8
####################
# final image
from runner
WORKDIR /app
COPY ./run.py /app
CMD [ "python", "run.py"]
You can install from PPA and use it as usual:
FROM nvidia/cuda
RUN apt-get update && apt-get install -y --no-install-recommends software-properties-common \
libsm6 libxext6 libxrender-dev curl \
&& rm -rf /var/lib/apt/lists/*
RUN echo "**** Installing Python ****" && \
add-apt-repository ppa:deadsnakes/ppa && \
apt-get install -y build-essential python3.5 python3.5-dev python3-pip && \
curl -O https://bootstrap.pypa.io/get-pip.py && \
python3.5 get-pip.py && \
rm -rf /var/lib/apt/lists/*
COPY requirements.txt requirements.txt
RUN pip3.5 install -r requirements.txt
CMD ["python3.5", "app.py"]

Is sklearn compatible with Linux-alpine?

I get an error when I try to build an alpine based docker image that includes the sklearn package.
I've tried a few variations of pip installation, different package combinations, and outdated versions of sklearn to see if they are compatible. I've also run the container in -it mode and tried to install the package manually from there. When I remove the sklearn line, the Dockerfile builds and the container runs just fine. Sklearn works in an Ubuntu:latest Dockerfile I've built, but I'm trying to reduce my footprint, so I was hoping to get it to work on alpine...
Here's my Dockerfile code:
FROM alpine:latest
RUN apk upgrade --no-cache \
&& apk update \
&& apk add --no-cache \
musl \
build-base \
python3 \
python3-dev \
postgresql-dev \
bash \
git \
&& pip3 install --no-cache-dir --upgrade pip \
&& pip3 install sklearn \
&& rm -rf /var/cache/* \
&& rm -rf /root/.cache/*
And here's the error I'm getting:
ERROR: Command "/usr/bin/python3.6 /usr/lib/python3.6/site-packages/pip/_vendor/pep517/_in_process.py prepare_metadata_for_build_wheel /tmp/tmpqjsz0004" failed with error code 1 in /tmp/pip-install-xlvbli9u/scipy
Alpine Linux doesn't support PEP 513. I found that something like this works:
RUN apk add --no-cache gcc g++ gfortran lapack-dev libffi-dev libressl-dev musl-dev && \
mkdir scipy && cd scipy && \
wget https://github.com/scipy/scipy/releases/download/v1.3.2/scipy-1.3.2.tar.gz && \
tar -xvf scipy-1.3.2.tar.gz && \
cd scipy-1.3.2 && \
python3 -m pip --no-cache-dir install .

Resources