PyTorch Jupyter Notebook image unable to find torch

I have built a PyTorch Jupyter notebook image using the Dockerfile below. The only thing I changed from the TensorFlow Jupyter Dockerfile is the base image (from TensorFlow to PyTorch).
However, when I launch the notebook in Kubeflow, I'm unable to import torch, even though !pip list shows that the torch package is installed. Any solutions?
ARG BASE_IMAGE=pytorch/pytorch:1.5.1-cuda10.1-cudnn7-runtime
FROM $BASE_IMAGE
ARG TF_SERVING_VERSION=0.0.0
ARG NB_USER=jovyan
# TODO: User should be refactored instead of hard coded jovyan
USER root
ENV DEBIAN_FRONTEND noninteractive
ENV NB_USER $NB_USER
ENV NB_UID 1000
ENV HOME /home/$NB_USER
ENV NB_PREFIX /
ENV PATH $HOME/.local/bin:$PATH
# Use bash instead of sh
SHELL ["/bin/bash", "-c"]
RUN apt-get update && apt-get install -yq --no-install-recommends \
apt-transport-https \
build-essential \
bzip2 \
ca-certificates \
curl \
g++ \
git \
gnupg \
graphviz \
locales \
lsb-release \
openssh-client \
sudo \
unzip \
vim \
wget \
zip \
emacs \
python3-pip \
python3-dev \
python3-setuptools \
&& apt-get clean && \
rm -rf /var/lib/apt/lists/*
# Install Nodejs for jupyterlab-manager
RUN curl -sL https://deb.nodesource.com/setup_12.x | sudo -E bash -
RUN apt-get update && apt-get install -yq --no-install-recommends \
nodejs \
&& apt-get clean && \
rm -rf /var/lib/apt/lists/*
ENV DOCKER_CREDENTIAL_GCR_VERSION=1.4.3
RUN curl -LO https://github.com/GoogleCloudPlatform/docker-credential-gcr/releases/download/v${DOCKER_CREDENTIAL_GCR_VERSION}/docker-credential-gcr_linux_amd64-${DOCKER_CREDENTIAL_GCR_VERSION}.tar.gz && \
tar -zxvf docker-credential-gcr_linux_amd64-${DOCKER_CREDENTIAL_GCR_VERSION}.tar.gz && \
mv docker-credential-gcr /usr/local/bin/docker-credential-gcr && \
rm docker-credential-gcr_linux_amd64-${DOCKER_CREDENTIAL_GCR_VERSION}.tar.gz && \
chmod +x /usr/local/bin/docker-credential-gcr
# Install AWS CLI
RUN curl "https://s3.amazonaws.com/aws-cli/awscli-bundle.zip" -o "/tmp/awscli-bundle.zip" && \
unzip /tmp/awscli-bundle.zip && ./awscli-bundle/install -i /usr/local/aws -b /usr/local/bin/aws && \
rm -rf ./awscli-bundle
# Install Azure CLI
RUN curl -sL https://packages.microsoft.com/keys/microsoft.asc | gpg --dearmor | tee /etc/apt/trusted.gpg.d/microsoft.asc.gpg > /dev/null && \
AZ_REPO=$(lsb_release -cs) && \
echo "deb [arch=amd64] https://packages.microsoft.com/repos/azure-cli/ $AZ_REPO main" | tee /etc/apt/sources.list.d/azure-cli.list && \
apt-get update && \
apt-get install azure-cli
RUN echo "en_US.UTF-8 UTF-8" > /etc/locale.gen && \
locale-gen
ENV LC_ALL en_US.UTF-8
ENV LANG en_US.UTF-8
ENV LANGUAGE en_US.UTF-8
# Create NB_USER user with UID=1000 and in the 'users' group
# but allow for non-initial launches of the notebook to have
# $HOME provided by the contents of a PV
RUN useradd -M -s /bin/bash -N -u $NB_UID $NB_USER && \
chown -R ${NB_USER}:users /usr/local/bin && \
mkdir -p $HOME && \
chown -R ${NB_USER}:users ${HOME}
RUN export CLOUD_SDK_REPO="cloud-sdk-$(lsb_release -c -s)" && \
echo "deb https://packages.cloud.google.com/apt $CLOUD_SDK_REPO main" > /etc/apt/sources.list.d/google-cloud-sdk.list && \
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add - && \
apt-get update && \
apt-get install -y google-cloud-sdk kubectl
# Install Tini - used as entrypoint for container
RUN cd /tmp && \
wget --quiet https://github.com/krallin/tini/releases/download/v0.18.0/tini && \
echo "12d20136605531b09a2c2dac02ccee85e1b874eb322ef6baf7561cd93f93c855 *tini" | sha256sum -c - && \
mv tini /usr/local/bin/tini && \
chmod +x /usr/local/bin/tini
# Install base python3 packages
RUN pip3 --no-cache-dir install \
jupyter-console==6.0.0 \
jupyterlab \
kubeflow-fairing==1.0.1
RUN docker-credential-gcr configure-docker && chown ${NB_USER}:users $HOME/.docker/config.json
# Configure container startup
EXPOSE 8888
USER jovyan
ENTRYPOINT ["tini", "--"]
CMD ["sh","-c", "jupyter notebook --notebook-dir=/home/${NB_USER} --ip=0.0.0.0 --no-browser --allow-root --port=8888 --NotebookApp.token='' --NotebookApp.password='' --NotebookApp.allow_origin='*' --NotebookApp.base_url=${NB_PREFIX}"]

Related

Docker image generating error for /usr/bin/java

I am trying to run this Docker image, but I am not sure why I am getting this error:
/usr/bin/time: cannot run /usr/bin/java: No such file or directory
Command exited with non-zero status 127
Can someone please help me debug this error?
My Dockerfile:
FROM openjdk:8-jre
LABEL maintainer="APN <xxx#xxx.edu>"
LABEL org.label-schema.schema-version="1.0"
# LABEL org.label-schema.build-date=$BUILD_DATE
LABEL org.label-schema.name="apn/addreadgroups"
LABEL org.label-schema.description="Image for adding read groups in .bam"
ENV PICARD_VERSION 2.20.8
WORKDIR /tmp
RUN apt-get update -y \
&& apt-get install --no-install-recommends -y \
make \
gcc \
g++ \
libz-dev \
libbz2-dev \
liblzma-dev \
ncurses-dev \
bc \
libnss-sss \
time \
&& cd /tmp \
&& wget -q -O /usr/bin/picard.jar https://github.com/broadinstitute/picard/releases/download/${PICARD_VERSION}/picard.jar \
&& ln -sf /usr/share/zoneinfo/America/Chicago /etc/localtime \
&& echo "America/Chicago" > /etc/timezone \
&& dpkg-reconfigure --frontend noninteractive tzdata \
&& apt-get clean all \
&& rm -rfv /var/lib/apt/lists/* /tmp/* /var/tmp/*
# This makes the image crazy large -- will find a workaround
# COPY human_g1k_v37_decoy* /usr/local/
COPY ./entrypoint.sh /usr/local/bin/
ENV PICARD /usr/bin/picard.jar
ENTRYPOINT ["/usr/local/bin/entrypoint.sh"]
# CMD ["/bin/bash"]
and the entrypoint.sh:
JAVAOPTS="-Xms2g -Xmx${MEM}g -XX:+UseSerialGC -Dpicard.useLegacyParser=false"
CUR_STEP="AddOrReplaceReadGroups"
/usr/bin/java ${JAVAOPTS} -jar "${PICARD}" \
"${CUR_STEP}" \
I="${INBAM}" \
O=${BAMFILE} \
RGID=${FLOWCELL} \
RGLB=${LIBRARY} \
RGPL=${PLATFORM} \
RGPU=${FLOWCELL} \
RGSM=${SM}
Exit status 127 means the command was not found.
This happens because java in the openjdk:8-jre image is not located at /usr/bin/java, as shown below:
$ docker run -it openjdk:8-jre which java
/usr/local/openjdk-8/bin/java
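A minimal fix, assuming the rest of entrypoint.sh stays as posted, is to rely on PATH instead of the hard-coded /usr/bin/java:
#!/bin/bash
# Fail fast if a required variable (MEM, PICARD, INBAM, BAMFILE, ...) is unset
set -euo pipefail

JAVAOPTS="-Xms2g -Xmx${MEM}g -XX:+UseSerialGC -Dpicard.useLegacyParser=false"
CUR_STEP="AddOrReplaceReadGroups"

# java resolves via PATH (to /usr/local/openjdk-8/bin/java in this image)
java ${JAVAOPTS} -jar "${PICARD}" \
    "${CUR_STEP}" \
    I="${INBAM}" \
    O="${BAMFILE}" \
    RGID="${FLOWCELL}" \
    RGLB="${LIBRARY}" \
    RGPL="${PLATFORM}" \
    RGPU="${FLOWCELL}" \
    RGSM="${SM}"
Alternatively, adding ln -s "$(command -v java)" /usr/bin/java to the Dockerfile keeps the original script unchanged.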

Docker not exposing ports as expected

I have the Dockerfile below that I am trying to build. At the end of the file I am attempting to expose ports 8888 and 6006.
FROM nvidia/cuda:9.2-devel-ubuntu16.04
LABEL maintainer="nweir <nweir#iqt.org>"
ARG solaris_branch='master'
# prep apt-get and cudnn
RUN apt-get update && apt-get install -y --no-install-recommends \
apt-utils && \
rm -rf /var/lib/apt/lists/*
# install requirements
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
bc \
bzip2 \
ca-certificates \
curl \
git \
libgdal-dev \
libssl-dev \
libffi-dev \
libncurses-dev \
libgl1 \
jq \
nfs-common \
parallel \
python-dev \
python-pip \
python-wheel \
python-setuptools \
unzip \
vim \
wget \
build-essential \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
SHELL ["/bin/bash", "-c"]
ENV PATH /opt/conda/bin:$PATH
# install anaconda
RUN wget --quiet https://repo.anaconda.com/miniconda/Miniconda3-4.5.4-Linux-x86_64.sh -O ~/miniconda.sh && \
/bin/bash ~/miniconda.sh -b -p /opt/conda && \
rm ~/miniconda.sh && \
/opt/conda/bin/conda clean -tipsy && \
ln -s /opt/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh && \
echo ". /opt/conda/etc/profile.d/conda.sh" >> ~/.bashrc && \
echo "conda activate base" >> ~/.bashrc
# prepend pytorch and conda-forge before default channel
RUN conda update conda && \
conda config --prepend channels conda-forge && \
conda config --prepend channels pytorch
# get dev version of solaris and create conda environment based on its env file
WORKDIR /tmp/
RUN git clone https://github.com/cosmiq/solaris.git && \
cd solaris && \
git checkout ${solaris_branch} && \
conda env create -f environment.yml
ENV PATH /opt/conda/envs/solaris/bin:$PATH
RUN cd solaris && pip install .
# install various conda dependencies into the space_base environment
RUN conda install -n solaris \
jupyter \
jupyterlab \
ipykernel
# add a jupyter kernel for the conda environment in case it's wanted
RUN source activate solaris && python -m ipykernel.kernelspec \
--name solaris --display-name solaris
# open ports for jupyterlab and tensorboard
EXPOSE 8888
EXPOSE 6006
RUN ["/bin/bash"]
After building the dockerfile into an image I attempt to expose the ports by running the following command:
docker run -p localhost:8888:8888 -p localhost:6006:6006 1ff
When I run docker ps -a, the output shows that the ports are not exposed.
I am currently using Ubuntu 20.04.
I can't for the life of me figure out what is wrong; any help would be greatly appreciated!
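A couple of things worth checking here. EXPOSE is only metadata; docker run -p is what actually publishes ports, and docker ps only shows mappings for a container that is still running. RUN ["/bin/bash"] executes at build time and does not keep the container alive, so if nothing long-running starts, the container exits immediately and no port mappings appear. A sketch of the end of the Dockerfile, assuming JupyterLab is the intended service on 8888:
# open ports for jupyterlab and tensorboard (documentation only; -p does the publishing)
EXPOSE 8888
EXPOSE 6006
# keep the container alive with a long-running process instead of RUN ["/bin/bash"]
CMD ["jupyter", "lab", "--ip=0.0.0.0", "--port=8888", "--no-browser", "--allow-root"]
Then publish with docker run -p 8888:8888 -p 6006:6006 <image> (or -p 127.0.0.1:8888:8888 to bind only to the loopback interface).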

Forgerock - Forgeops - util - building with RHEL?

I am trying to take this Dockerfile: https://github.com/ForgeRock/forgeops/blob/release/6.5.0/docker/util/Dockerfile
and change it from the old version, which is Alpine Linux (seen below):
FROM alpine:3.7
...
RUN apk add --update ca-certificates \
&& apk add --update -t deps curl\
&& curl -L https://storage.googleapis.com/kubernetes-release/release/${KUBE_LATEST_VERSION}/bin/linux/amd64/kubectl -o /usr/local/bin/kubectl \
&& chmod +x /usr/local/bin/kubectl \
&& apk del --purge deps \
&& apk add --update jq su-exec unzip curl bash openldap-clients \
&& rm /var/cache/apk/* \
&& mkdir -p $FORGEROCK_HOME \
&& addgroup -g 11111 forgerock \
&& adduser -s /bin/bash -h "$FORGEROCK_HOME" -u 11111 -D -G forgerock forgerock
to run on RHEL 7 instead (my changes below):
FROM ubi7-stigd:7.6
...
# Install epel, so we can install jq later
RUN rpm --import http://download.fedoraproject.org/pub/epel/RPM-GPG-KEY-EPEL-7 \
&& yum install -y --disableplugin=subscription-manager https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
# Install other stuff
RUN yum -y --disableplugin=subscription-manager update \
&& yum install -y --disableplugin=subscription-manager jq su-exec unzip curl bash openldap-clients ca-certificates deps \
&& curl -L https://storage.googleapis.com/kubernetes-release/release/${KUBE_LATEST_VERSION}/bin/linux/amd64/kubectl -o /usr/local/bin/kubectl \
&& chmod +x /usr/local/bin/kubectl \
&& mkdir -p $FORGEROCK_HOME \
&& groupadd -g 11111 forgerock \
&& useradd -m -s /bin/bash -d "$FORGEROCK_HOME" -u 11111 -g forgerock -G root forgerock
The container builds just fine (although it complains about not being able to find "su-exec" and "deps"). But when I upload this image to my OpenShift cluster and run it via an OpenAM pod, the container fails to start, timing out after 10 minutes. The events show that the container started, and the logs contain only two lines saying it timed out after 10 minutes.
Does anyone know what the issue might be?
It turned out I needed to install the "nc" package, as one of the .sh scripts uses nc.
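For reference, a sketch of the extra install step, assuming the UBI 7 repositories provide nc via the nmap-ncat package (as standard RHEL 7 does):
# nc is not in the ubi7 base image; on RHEL 7 it is shipped in nmap-ncat
RUN yum install -y --disableplugin=subscription-manager nmap-ncat \
    && yum clean all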

Docker File Non-Zero Code 100 Error When Building

Here is my Docker File:
FROM ubuntu:16.04
MAINTAINER Alexandre Savio <alexsavio#gmail.com>
RUN ln -snf /bin/bash /bin/sh
ARG DEBIAN_FRONTEND=noninteractive
ENV PETPVC_VERSION v1.2.1
ENV PETPVC_GIT https://github.com/UCL/PETPVC.git
ENV ITK_VERSION v4.12.2
ENV ITK_GIT http://itk.org/ITK.git
ENV VTK_VERSION v6.3.0
ENV VTK_GIT https://gitlab.kitware.com/vtk/vtk.git
ENV SIMPLEITK_VERSION v1.0.1
ENV SIMPLEITK_GIT http://itk.org/SimpleITK.git
ENV ANTS_VERSION v2.2.0
ENV ANTS_GIT https://github.com/stnava/ANTs.gi
ENV NEURODEBIAN_URL http://neuro.debian.net/lists/xenial.de-m.full
ENV NEURODEBIAN_PGP hkp://pool.sks-keyservers.net:80 0xA5D32F012649A5A9
ENV LIBXP_URL http://mirrors.kernel.org/ubuntu/pool/main/libx/libxp/libxp6_1.0.2-2_amd64.deb
ENV AFNI_URL https://afni.nimh.nih.gov/pub/dist/bin/linux_fedora_21_64/#update.afni.binaries
ENV CAMINO_GIT git://git.code.sf.net/p/camino/code
ENV SPM12_URL http://www.fil.ion.ucl.ac.uk/spm/download/restricted/utopia/dev/spm12_latest_Linux_R2017b.zip
ENV MLAB_URL http://ssd.mathworks.com/supportfiles/downloads/R2017b/deployment_files/R2017b/installers/glnxa64/MCR_R2017b_glnxa64_installer.zip
ENV MCR_VERSION_DIR v93
ENV PYENV_NAME pyenv
ENV N_CPUS 2
RUN apt-get update && \
apt-get -y install apt-utils locales && \
echo "en_US.UTF-8 UTF-8" >> /etc/locale.gen && \
locale-gen en_US.utf8 && \
/usr/sbin/update-locale LANG=en_US.UTF-8
ENV LC_ALL en_US.UTF-8
ENV LANG en_US.UTF-8
ENV LANGUAGE en_US.UTF-8
ENV TERM xterm
ENV HOME /work
ENV SOFT $HOME/soft
ENV BASHRC $HOME/.bashrc
ENV BASICUSER basicuser
ENV BASICUSER_UID 1000
RUN useradd -m -d $HOME -s /bin/bash -N -u $BASICUSER_UID $BASICUSER && \
mkdir $SOFT && \
mkdir $HOME/.scripts && \
mkdir $HOME/.nipype
USER $BASICUSER
WORKDIR $HOME
COPY root/.* $HOME/
COPY root/* $HOME/
COPY root/.scripts/* $HOME/.scripts/
COPY root/.nipype/* $HOME/.nipype/
USER root
RUN \
chown -R $BASICUSER $HOME && \
echo "export SOFT=\$HOME/soft" >> $BASHRC && \
echo "source /etc/fsl/5.0/fsl.sh" >> $BASHRC && \
echo "export FSLPARALLEL=condor" >> $BASHRC && \
apt-get update && \
apt-get install -y wget bzip2 unzip htop curl git && \
wget -O- $NEURODEBIAN_URL | tee /etc/apt/sources.list.d/neurodebian.sources.list && \
apt-key adv --recv-keys --keyserver hkp://pool.sks-keyservers.net:80 0xA5D32F012649A5A9 && \
sed -i "s/# \(.*multiverse$\)/\1/g" /etc/apt/sources.list && \
echo 'deb-src http://archive.ubuntu.com/ubuntu xenial main restricted' | tee /etc/apt/sources.list && \
apt-get update && \
apt-get -y upgrade && \
apt-get install -y \
apt-utils \
locales \
cmake \
gcc-4.9 \
g++-4.9 \
gfortran-4.9 \
gcc-5 \
g++-5 \
gfortran-5 \
tcsh \
libjpeg62 \
libxml2-dev \
libxslt1-dev \
dicomnifti \
dcm2niix \
xdot \
fsl-5.0-eddy-nonfree \
fsl-5.0-core && \
rm -rf /var/lib/apt/lists/* && \
update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-5 90 --slave /usr/bin/g++ g++ /usr/bin/g++-5 && \
update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.9 80 --slave /usr/bin/g++ g++ /usr/bin/g++-4.9 && \
apt-get build-dep vtk6 \
&& ln -s /usr/lib/x86_64-linux-gnu/libgsl.so /usr/lib/libgsl.so.0 && \
I'm specifically getting the error
Error: The command returned a non-zero code: 100
at this RUN step:
RUN \
chown -R $BASICUSER $HOME && \
echo "export SOFT=\$HOME/soft" >> $BASHRC && \
echo "source /etc/fsl/5.0/fsl.sh" >> $BASHRC && \
echo "export FSLPARALLEL=condor" >> $BASHRC && \
apt-get update && \
apt-get install -y wget bzip2 unzip htop curl git && \
wget -O- $NEURODEBIAN_URL | tee /etc/apt/sources.list.d/neurodebian.sources.list && \
apt-key adv --recv-keys --keyserver hkp://pool.sks-keyservers.net:80 0xA5D32F012649A5A9 && \
sed -i "s/# (.multiverse$)/\1/g" /etc/apt/sources.list && \
echo 'deb-src http://archive.ubuntu.com/ubuntu xenial main restricted' | tee /etc/apt/sources.list && \
apt-get update && \
apt-get -y upgrade && \
apt-get install -y \
This is a prebuilt Dockerfile from a repository for neuroimaging, so I figured it would work. Is something wrong on my end? I'm not entirely sure how to go about debugging and solving this problem.
Add -y to the apt-get command that is missing it. What is happening is that your build cannot complete because it is waiting for user confirmation to allow the install, something which will never happen because the Docker build process is non-interactive.
That should fix you up!
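In the Dockerfile above, apt-get build-dep vtk6 is the command that runs without -y. A sketch of the non-interactive form (package list shortened, and assuming the deb-src entries added earlier in the Dockerfile are in place, since build-dep needs them):
# Run apt non-interactively so the build never waits for a confirmation prompt
ARG DEBIAN_FRONTEND=noninteractive
RUN apt-get update && \
    apt-get install -y --no-install-recommends cmake tcsh libxml2-dev && \
    apt-get build-dep -y vtk6 && \
    rm -rf /var/lib/apt/lists/*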

Installing the most recent stable version of nodejs Dockerfile

The following portion of a Dockerfile installs Node.js, but it defaults to v4.2.6. How do I install the most recent stable version, 7.4.0?
RUN apt-get clean && apt-get update \
&& apt-get -yqq install \
apache2 \
nodejs \ ## nodejs installed here
php \
php-mcrypt \
php-curl \
php-mbstring \
php-xml \
php-zip \
libapache2-mod-php \
php-mysql \
git \
supervisor \
&& apt-get -y autoremove \
&& apt-get clean \
&& php -r "readfile('http://getcomposer.org/installer');" | php -- --install-dir=/usr/bin/ --filename=composer \
&& rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* \
&& ln -sf /dev/stdout /var/log/apache2/access.log \
&& ln -sf /dev/stderr /var/log/apache2/error.log
According to the documentation from nodejs.org, you can install it by doing this:
curl -sL https://deb.nodesource.com/setup_7.x | sudo -E bash -
sudo apt-get install -y nodejs
So your Dockerfile could look like this:
RUN curl -sL https://deb.nodesource.com/setup_7.x | sudo -E bash \
&& apt-get clean && apt-get update \
&& apt-get -yqq install \
apache2 \
nodejs \ ## this should now be the correct version
RUN apt-get clean && apt-get update \
&& apt-get -yqq install \
apache2 \
php \
php-mcrypt \
php-curl \
php-mbstring \
php-xml \
php-zip \
libapache2-mod-php \
php-mysql \
git \
supervisor \
&& apt-get -y autoremove \
&& apt-get clean \
&& php -r "readfile('http://getcomposer.org/installer');" | php -- --install-dir=/usr/bin/ --filename=composer \
&& rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* \
&& ln -sf /dev/stdout /var/log/apache2/access.log \
&& ln -sf /dev/stderr /var/log/apache2/error.log \
&& curl -o- https://raw.githubusercontent.com/creationix/nvm/v0.33.0/install.sh | bash && nvm install 7.4.0 \
&& nvm use 7.4.0
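One caveat with the nvm variant: nvm is a shell function, so inside a single RUN step it has to be sourced before nvm install is available. A sketch, assuming the default install location $HOME/.nvm:
# nvm's install script requires bash; source nvm.sh in the same RUN step before using it
SHELL ["/bin/bash", "-c"]
RUN curl -o- https://raw.githubusercontent.com/creationix/nvm/v0.33.0/install.sh | bash && \
    source "$HOME/.nvm/nvm.sh" && \
    nvm install 7.4.0 && \
    nvm alias default 7.4.0
The NodeSource setup_7.x route shown in the first snippet is usually simpler in Docker, since it puts node and npm on the system PATH.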
