How to install google-cloud-bigquery on python-alpine based docker? - python-3.x

I'm trying to build a docker with python 3 and google-cloud-bigquery with the following docker file:
FROM python:3.10-alpine
RUN pip3 install google-cloud-bigquery
WORKDIR /home
COPY *.py /home/
ENTRYPOINT ["python3", "-u", "myscript.py"]
But getting errors on the pip3 install google-cloud-bigquery (too long for here)..
What's missing for installing this on python-alpine?

Looks like an incompatibility issue with the latest version of google-cloud-bigquery (>3) and numpy:
ERROR: Could not build wheels for numpy, which is required to install pyproject.toml-based projects
Try specifying a previous version, this works for me:
RUN pip3 install google-cloud-bigquery==2.34.4

Actually it seems like not a problem with numpy, which builds smoothly with all the dependency libs install, but rather with pyarrow, which does not support alpine+pip build. I've found a workaround by using alpine pre-built version of pyarrow. It is much easier than building pyarrow from source. This build works for me just fine:
FROM python:3.10.6-alpine3.16
RUN apk add --no-cache build-base linux-headers \
py3-apache-arrow=8.0.0-r0
# Copying pyarrow to site-package of actual python path. Alpine python path
# and python's docker hub path are different.
RUN mv /usr/lib/python3.10/site-packages/* \
/usr/local/lib/python3.10/site-packages/
RUN rm -rf /usr/lib/python3.10
RUN --mount=type=cache,target=/root/.cache/pip \
pip install google-cloud-bigquery==3.3.2
Update python version, alpine version and py3-apache-arrow version to install later versions. This is the latest one on the time of writing.
And make sure to remove build dependencies (build-base, linux-headers) for your release docker. I prefer multistage dockers for this.

Related

Why am I getting ModuleNotFoundError after installing Negbio?

Building a docker image, I've installed Negbio in my Dockerfile using:
RUN git clone https://github.com/ncbi-nlp/NegBio.git xyz && \
python xyz/setup.py install
When I try to run my Django application at localhost:1227 I get:
No module named 'negbio' ModuleNotFoundError exception
When I run pip list I can see negbio. What am I missing?
As per your comment, It wouldn't install with pip and hence not installing via pip.
Firstly, to make sure the https://github.com/ncbi-nlp/NegBio is properly installed via python setup.py install, you need to install it's dependencies via pip install -r requirements first. So either ways, you are doomed to have pip inside Docker.
For example, this is the sample Dockerfile that would install the negbio package properly:
FROM python:3.6-slim
RUN mkdir -p /apps
WORKDIR /apps
# Steps for installing the package via Docker:
RUN apt-get update && apt-get -y upgrade && apt-get install -y git gcc build-essential
RUN git clone https://github.com/ncbi-nlp/NegBio.git
WORKDIR /apps/NegBio
RUN pip install -r requirements.txt
RUN python setup.py install
ENV PATH=~/.local/bin:$PATH
EXPOSE 8000
CMD ["python", "manage.py", "runserver", "0.0.0.0:8000"]
So wouldn't harm if you actually install it via requirements.txt
I would do it like this:
requirements.txt --> have all your requirements added here
negbio==0.9.4
And make sure it's installed on the fly inside the docker using RUN pip install -r requirements.txt
I ultimately resolved my issue by going to an all Anaconda environment. Thank you for everyone's input.

How to Have Pycrypto at Docker Properly Working?

I am using Pychromeless repo with success at AWS lambda.
But now I need to use pycrypto dependency, but I am getting
configure: error: no acceptable C compiler found in $PATH
 
when running make docker-build
(after placing pycrypto==2.6.1 at requirements.txt file).
There's this thread and someone said about the same problem:
 
"The gcc compiler is not in your $PATH. It means either you dont have gcc installed or it's not in your $PATH variable".
So tried placing apt-get install build-essential at Dockerfile, but I got
/bin/sh: apt-get: command not found
Then, I tried with yum install gcc
only to get
The command '/bin/sh -c yum install gcc' returned a non-zero code: 1
Docker-lambda [info page] (https://hub.docker.com/r/lambci/lambda/) says:
This project consists of a set of Docker images for each of the supported Lambda runtimes.
There are also a set of build images that include packages like gcc-c++, git, zip and the aws-cli for compiling and deploying.
So I guess I shouldn't be needing to install gcc. Maybe the gcc compiler is not in $PATH, but I don't know what to do to fix that.
Here is the dockerfile
FROM lambci/lambda:python3.6
MAINTAINER tech#21buttons.com
USER root
ENV APP_DIR /var/task
WORKDIR $APP_DIR
COPY requirements.txt .
COPY bin ./bin
COPY lib ./lib
RUN mkdir -p $APP_DIR/lib
RUN pip3 install -r requirements.txt -t /var/task/lib
Any help on solving this?
Well, well, well...today was a lucky day for me.
So simple: all I had to do was replace
pycrypto==2.6.1
by
pycryptodome
on my requirements.txt file.
This thread says: "Highly recommend NOT to use pycrypto. It is old and not maintained and contains many vulnerabilities. Use pycryptodome instead - it is compatible and up to date".
And that's it! Docker builds just fine with pycryptodome.

Setting up mysql-connector-python in Docker file

I am trying to set up a mysql connection that will work with SqlAlchemy in Python 3.6.5 . I have the following in my Dockerfile:
RUN pip3 install -r /event_git/requirements.txt
I also have, in requirements.txt:
mysql-connector-python==8.0.15
However, I am not able to connect to the DB. Is there anything else that I need to do to set this up?
Update:
I got 8.0.5 working but not 8.0.15 . Apparently, a protobuff dependency was added; does anyone know how to handle that?
docker file is:
RUN apt-get -y update && apt-get install -y python3 python3-pip fontconfig wget nodejs nodejs-legacy npm
RUN pip3 install --upgrade pip
# Copy contents of this directory (i.e. full source) to image
COPY . /my_project
# Install Python dependencies
RUN pip3 install -r /event_git/requirements.txt
# Set event_git folder as working directory
WORKDIR /my_project
ENV LANG C.UTF-8
I am running it via
docker build -t event_git .;docker run -t -i event_git /bin/bash
and then executing a script; the db is on my local machine. This is working on mysql-connector-python==8.0.5 but not 8.0.15, so the setup is ok; I think I just need to fulfill the protobuff dependency that was added (see https://github.com/pypa/warehouse/issues/5537 for mention of protobuff dependency).
The mysql-connector-python has the Python Protobuf as an installation requirement, this means that protobuf will be installed along mysql-connector-python.
If this doesn't work, try to add protobuf==3.6.1 in your requirements.txt.
Figured out the issue. The key is that import mysql.connector needs to be at the top of the file where the create_engine is. Still not sure of the exact reason, but at the very least that seems to define _CONNECTION_POOLS = {}. If anyone knows why, please do give your thoughts.

Python 3.6 in tensorflow gpu docker images

How can I have python3.6 in tensorflow docker images.
All the images I tried (latest, nighty) are using python3.5 and I don't want to modify all my scripts.
The Tensorflow images are based on Ubuntu 16.04, as you can see from the Dockerfile. This release ships with Python 3.5 as standard.
So you'll have to re-build the image, and the Dockerfile will need editing, even though you need to do the actual build with the parameterized_docker_build.sh script.
This answer on ask Ubuntu covers how to get Python 3.6 on Ubuntu 16.04
The simplest way would probably be just to change the From line in the Dockerfile to FROM ubuntu:16.10, and python to python3.6 in the initial apt-get install line
Of course, this may break some other Ubuntu version-specific thing, so an alternative would be to keep Ubuntu 16.04 and install one of the alternative ppa's also listed in the linked answer:
RUN add-apt-repository ppa:deadsnakes/ppa &&
apt-get update &&
apt-get install -y python3.6
Note that you'll need this after the initial apt-get install, because that installs software-properties-common, which you need to add the ppa.
Note also, as in the comments to the linked answer, that you will need to symlink to Python 3.6.
Finally, note that I haven't tried any of this. The may be gotchas, and you may need to make another change to ensure that the correct version of Python is used by the running container.
You can use stable images which are supplied by third parties, like ufoym/deepo.
One that fits TensorFlow, python3.6 and cuda10 can be found here or you can pull it directly using the command docker pull ufoym/deepo:py36-cu100
I use their images all the time, never had problems
With this anwer, I just wanted to specify how I solved this problem (the previous answer of SiHa helped me a lot but I had to add a few steps so that it worked completly).
Context:
I'm using a package (segmentation model for unet++) that requires tensorflow==1.4.0 and keras==2.2.2.
I tried to use the docker image for tensorflow 1.4.0, however, the default version of python of this image is 3.5 which is not compatible with my package.
I managed to install python3.6 on the docker images thanks to the following files:
My Dockerfile contains the following lines:
Dockerfile:
FROM tensorflow/tensorflow:1.4.0-gpu-py3
RUN mkdir /AI_PLATFORM
WORKDIR /AI_PLATFORM
COPY ./install.sh ./install.sh
COPY ./requirements.txt ./requirements.txt
COPY ./computer_vision ./computer_vision
COPY ./config.ini ./config.ini
RUN bash install.sh
Install.sh:
#!/urs/bin/env bash
pip install --upgrade pip
apt-get update
apt-get install -y python3-pip
add-apt-repository ppa:deadsnakes/ppa &&
apt-get update &&
apt-get install python3.6 --assume-yes
apt-get install libpython3.6
python3.6 -m pip install --upgrade pip
python3.6 -m pip install -r requirements.txt
Three things are important:
use python3.6 -m pip instead of pip, else the packages are installed on python 3.5 default version of Ubuntu 16.04
use docker run python3.6 <command> to run your containers with python==3.6
in the requirements.txt file, I had to specify the following things:
h5py==2.10.0
tensorflow-gpu==1.4.1
keras==2.2.2
keras-applications==1.0.4
keras-preprocessing==1.0.2
I hope that this answer will be useful
Maybe the image I created will help you. It is based on the cuda-10.0-devel image and has tensorflow 2.0a-gpu installed.
You can use it as base image for your own implementation. The image itself doesn't do anything. I put the image on dockerhub https://cloud.docker.com/repository/docker/patientzero/tensorflow2.0a-gpu-py3.6
The github repo is located here: https://github.com/patientzero/tensorflow2.0-python3.6-Docker
Pulling it won't do much, but for completeness:
$ docker pull patientzero/tensorflow2.0-gpu-py3.6
edit: changed to general tensorflow 2.0x image.
Also as mentioned here, the official image for the beta 2.0 release now comes with python 3.6 support

How can I install python modules in a docker image?

I have an image called: Image and a running container called: container.
I want to install pytorch and anaconda. What's the easiest way to do this?
Do I have to change the Dockerfile and build a new image?
Thanks a lot.
Yes, the best thing is to build your image in such a way it has the python modules are in there.
Here is an example. I build an image with the build dependencies:
$ docker build -t oz123/alpine-test-mycoolapp:0.5 - < Image
Sending build context to Docker daemon 2.56 kB
Step 1 : FROM alpine:3.5
---> 88e169ea8f46
Step 2 : ENV MB_VERSION 3.1.4
---> Running in 4587d36fa4ae
---> b7c55df49803
Removing intermediate container 4587d36fa4ae
Step 3 : ENV CFLAGS -O2
---> Running in 19fe06dcc314
---> 31f6a4f27d4b
Removing intermediate container 19fe06dcc314
Step 4 : RUN apk add --no-cache python3 py3-pip gcc python3-dev py3-cffi file git curl autoconf automake py3-cryptography linux-headers musl-dev libffi-dev openssl-dev build-base
---> Running in f01b60b1b5b9
fetch http://dl-cdn.alpinelinux.org/alpine/v3.5/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.5/community/x86_64/APKINDEX.tar.gz
(1/57) Upgrading musl (1.1.15-r5 -> 1.1.15-r6)
(2/57) Upgrading zlib (1.2.8-r2 -> 1.2.11-r0)
(3/57) Installing m4 (1.4.17-r1)
(4/57) Installing perl (5.24.0-r0)
(5/57) Installing autoconf (2.69-r0)
(6/57) Installing automake (1.15-r0)
(7/57) Installing binutils-libs (2.27-r1)
...
Note, I am installing Python's pip inside the image, so later I can download packages from pypi. Packages like numpy might require a C compiler and tool chain, so I am installing these too.
After building the packages which require the build tools chain I remove the tool chain packages:
RUN apk del file pkgconf autoconf m4 automake perl g++ libstdc++
After you have your base image, you can run your application code in
an image building on top of it:
$ cat Dockerfile
FROM oz123/alpine-test-mycoolapp
ADD . /code
WORKDIR /code
RUN pip3 install -r requirements.txt -r requirements_dev.txt
RUN pip3 install -e .
RUN make clean
CMD ["pytest", "-vv", "-s"]
I simply run this with docker.

Resources