How to create a Docker container for Snakemake

This is my first attempt at creating a Docker container.
I need to make a container for a legacy version of VarScan2, so I based much of it on a Dockerfile I found for a newer version of VarScan2.
The first container I built failed to run, with an error hinting that snakemake.utils was not available. I therefore think I need to install Snakemake in my container, which requires Python >= 3.5.
I'm having trouble getting Python 3.8 in place, and Snakemake fails to install during docker build:
Error:
Downloading snakemake-5.26.1.tar.gz (237 kB)
ERROR: Command errored out with exit status 1:
command: /usr/bin/python -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-_50ix5/snakemake/setup.py'"'"'; __file__='"'"'/tmp/pip-install-_50ix5/snakemake/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-pip-egg-info-AMGZ1t
cwd: /tmp/pip-install-_50ix5/snakemake/
Complete output (2 lines):
At least Python 3.5 is required.
----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
Dockerfile
FROM ubuntu:14.04
MAINTAINER Matthew Jordan Oldach, moldach686#gmail.com
# create a working directory and work from there
RUN mkdir /tmp/install
WORKDIR /tmp/install
RUN apt-get update && apt-get install -y \
build-essential \
libncurses5-dev \
libgdbm-dev \
libnss3-dev \
libssl-dev \
libreadline-dev \
libffi-dev \
gcc \
make \
zlib1g-dev \
git \
wget \
python3-pip \
default-jre \
r-base \
bc
# Download Python 3.7.5
RUN wget https://www.python.org/ftp/python/3.7.5/Python-3.7.5.tgz
RUN tar -xf Python-3.7.5.tgz
RUN cd python-3.7.5; ./configure --enable-optimizations; cd ..
# Download Snakemake
RUN wget https://bootstrap.pypa.io/get-pip.py
RUN python get-pip.py
RUN pip install snakemake
# Samtools 0.1.18 - note: 0.1.19 and 1.1 do NOT work, VarScan copynumber dies on the mpileup
RUN wget http://downloads.sourceforge.net/project/samtools/samtools/0.1.18/samtools-0.1.18.tar.bz2
RUN tar -xvf samtools-0.1.18.tar.bz2
# the make command generates a lot of warnings, none of them relevant to the final samtools code, hence 2>/dev/null
#RUN (cd samtools-0.1.18/ && make DFLAGS='-D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_USE_KNETFILE -D_CURSES_LIB=0' LIBCURSES='' 2>/dev/null && mv samtools /usr/local/bin)
# get varscan
RUN wget -O /usr/local/bin/VarScan.jar https://netactuate.dl.sourceforge.net/project/varscan/VarScan.v2.3.9.jar
# Set WORKDIR to /data -- predefined mount location.
RUN mkdir /data
WORKDIR /data
# And clean up
RUN apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/install
ENTRYPOINT ["bash", "java -jar /usr/local/bin/VarScan.jar"]
I'd appreciate if anyone could help me figure out what's going wrong here, thanks.
Update
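The solution by OneCricketeer solved this problem: basing the image on python:3.7.5 instead of ubuntu:14.04 provides a Python version that Snakemake accepts.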
Dockerfile Solution
FROM python:3.7.5
MAINTAINER Matthew Jordan Oldach, moldach686#gmail.com
# create a working directory and work from there
RUN mkdir /tmp/install
WORKDIR /tmp/install
RUN apt-get update && apt-get install -y \
gcc \
make \
zlib1g-dev \
git \
wget \
python3-pip \
default-jre \
r-base \
bc
# Download Snakemake
RUN wget https://bootstrap.pypa.io/get-pip.py
RUN python get-pip.py
RUN pip install snakemake
# Samtools 0.1.18 - note: 0.1.19 and 1.1 do NOT work, VarScan copynumber dies on the mpileup
RUN wget http://downloads.sourceforge.net/project/samtools/samtools/0.1.18/samtools-0.1.18.tar.bz2
RUN tar -xvf samtools-0.1.18.tar.bz2
# the make command generates a lot of warnings, none of them relevant to the final samtools code, hence 2>/dev/null
#RUN (cd samtools-0.1.18/ && make DFLAGS='-D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_USE_KNETFILE -D_CURSES_LIB=0' LIBCURSES='' 2>/dev/null && mv samtools /usr/local/bin)
# get varscan
RUN wget -O /usr/local/bin/VarScan.jar https://netactuate.dl.sourceforge.net/project/varscan/VarScan.v2.3.9.jar
# Set WORKDIR to /data -- predefined mount location.
RUN mkdir /data
WORKDIR /data
# And clean up
RUN apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/install
# Exec form needs one array element per argument
ENTRYPOINT ["java", "-jar", "/usr/local/bin/VarScan.jar"]
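The pip log in the question shows setup.py being run by /usr/bin/python, and the only requirement it states is "At least Python 3.5 is required." A minimal sketch of a check that could be run inside either image to see whether the interpreter behind a given command is new enough. The function name meets_minimum and the script layout are illustrative assumptions, not part of Snakemake:

```python
import sys

# Minimum taken from the pip error message ("At least Python 3.5 is required").
MIN_VERSION = (3, 5)


def meets_minimum(version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) part of version_info is >= minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    # Written without f-strings so it also runs on the old interpreter being checked.
    found = "%d.%d.%d" % tuple(sys.version_info[:3])
    status = "OK" if meets_minimum(sys.version_info) else "too old for Snakemake"
    print("%s is Python %s: %s" % (sys.executable, found, status))
```

Running it once with python and once with python3 inside the container shows which of the two commands, if any, points at a suitable interpreter.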

Related

Unzip not handling utf-8 in Node Alpine Docker image: how to set correct locale?

With this zip file, this Node script successfully outputs the files:
const child_process = require('child_process')
const util = require('util')
const exec = util.promisify(child_process.exec)
exec(`unzip -Z1 metamorpR.zip`).then(zip_contents => {
  if (zip_contents.stderr) {
    throw new Error(`unzip error: ${zip_contents.stderr}`)
  }
  console.log(zip_contents.stdout)
})
Output:
metamorpR.z5
Варианты Прохождения.txt
Интерактивная Литература.pdf
But when I run the script from within Docker, it doesn't.
Using this Dockerfile:
FROM node:16-alpine
RUN apk add --no-cache unzip
COPY . .
ENTRYPOINT ["node", "unzip.js"]
Build and run (substitute in your container image name):
docker build .
docker run --rm 1dc072
Output:
metamorpR.z5
??????? ????????.txt
???????????? ??????????.pdf
I think this means the locales aren't set correctly within the Docker image? Any ideas how to fix this?
TL;DR
unzip on Alpine doesn't appear to support locales. unzip on Debian doesn't appear to support locales either. unzip on Ubuntu does support locales (however, there is no official Node image based on Ubuntu).
On ubuntu:
FROM ubuntu:18.04
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && \
apt-get install -y --no-install-recommends \
locales \
unzip && \
apt-get clean
RUN sed -i -e 's/# ru_RU.UTF-8 UTF-8/ru_RU.UTF-8 UTF-8/' /etc/locale.gen && \
locale-gen && \
update-locale LANG=ru_RU.UTF-8 LC_ALL=ru_RU.UTF-8 && \
ldconfig
ENV LANG=ru_RU.UTF-8
COPY metamorpR.zip /metamorpR.zip
CMD ["unzip", "-l", "metamorpR.zip"]
... there are no issues in the file names that unzip prints.
... however, the same build FROM node:16-bullseye does not produce the same results.
On Alpine you could install the glibc packages with locale support during the build and then generate the locales; however, unzip doesn't appear to use them:
FROM node:16-alpine
RUN apk add --no-cache unzip wget
RUN wget -q -O /etc/apk/keys/sgerrand.rsa.pub https://alpine-pkgs.sgerrand.com/sgerrand.rsa.pub && \
wget https://github.com/sgerrand/alpine-pkg-glibc/releases/download/2.34-r0/glibc-2.34-r0.apk && \
wget https://github.com/sgerrand/alpine-pkg-glibc/releases/download/2.34-r0/glibc-bin-2.34-r0.apk && \
wget https://github.com/sgerrand/alpine-pkg-glibc/releases/download/2.34-r0/glibc-i18n-2.34-r0.apk && \
apk add glibc-2.34-r0.apk glibc-bin-2.34-r0.apk glibc-i18n-2.34-r0.apk && \
rm /glibc-2.34-r0.apk /glibc-bin-2.34-r0.apk /glibc-i18n-2.34-r0.apk && \
/usr/glibc-compat/bin/localedef -i ru_RU -f UTF-8 ru_RU.UTF-8
ENV LANG=ru_RU.UTF-8
COPY metamorpR.zip /metamorpR.zip
CMD ["unzip", "-l", "metamorpR.zip"]
Thanks to @masseyb's answer, I was able to get it working with this Dockerfile, which basically just installs Node manually into an Ubuntu image. The main downside is that the image is twice the size, but it's comparatively simple, so that's an acceptable downside to me.
FROM ubuntu:20.04
RUN apt-get update && \
apt install -y curl locales unzip && \
curl -fsSL https://deb.nodesource.com/setup_16.x | bash - && \
apt install -y nodejs && \
rm -rf /var/lib/apt/lists/* && \
localedef -i en_US -c -f UTF-8 -A /usr/share/locale/locale.alias en_US.UTF-8
ENV LANG en_US.UTF-8
COPY . .
ENTRYPOINT ["node", "unzip.js"]
Apparently some versions of unzip that are available from the Ubuntu repositories can decode file names automatically if you specify the -a switch.
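To separate "the names stored in the archive are damaged" from "unzip cannot print them under the current locale", the entry names can be read without going through unzip at all. A minimal sketch using Python's standard zipfile module, which decodes names itself (as UTF-8 when the entry's UTF-8 flag is set, otherwise as CP437) instead of relying on the process locale. The in-memory archive is a stand-in for metamorpR.zip, the helper names are illustrative, and using Python here is an assumption made for the example; the original script is Node:

```python
import io
import sys
import zipfile


def list_names(zip_source):
    """Return entry names from a zip file path or file-like object."""
    with zipfile.ZipFile(zip_source) as archive:
        return archive.namelist()


def build_sample_archive(names):
    """Build an in-memory zip with empty entries (stand-in for a real file)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in names:
            archive.writestr(name, b"")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # Force UTF-8 output where possible so printing does not depend on the locale.
    if hasattr(sys.stdout, "reconfigure"):
        sys.stdout.reconfigure(encoding="utf-8")
    sample = build_sample_archive(["metamorpR.z5", "Варианты Прохождения.txt"])
    for name in list_names(sample):
        print(name)
```

If list_names("metamorpR.zip") returns the Cyrillic names intact, the archive itself is fine and the question marks come from how unzip renders them.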

python3: command not found

I have a Dockerfile in which I have specified the entrypoint as a shell script named run-services.sh.
The contents of the shell script are as follows:
apache2ctl start
echo "Started apache2ctl..."
python3 mock_ta.py
Now when I deploy this service on my local machine, I get an error saying
python3: command not found
I removed the entrypoint, went inside the container, and ran which python3; it shows that python3 is installed at /usr/bin/python3.
Ideally it should run the Python script if Python is installed, right? Any idea why this happens?
============================================================
Edit: Added Dockerfile
FROM php:7.1-apache
# Utilities
RUN apt-get update && \
apt-get -y install apt-transport-https git curl vim --no-install-recommends && \
rm -r /var/lib/apt/lists/*
# SimpleSAMLphp
ARG SIMPLESAMLPHP_VERSION=1.15.2
RUN curl -s -L -o /tmp/simplesamlphp.tar.gz https://github.com/simplesamlphp/simplesamlphp/releases/download/v$SIMPLESAMLPHP_VERSION/simplesamlphp-$SIMPLESAMLPHP_VERSION.tar.gz && \
tar xzf /tmp/simplesamlphp.tar.gz -C /tmp && \
rm -f /tmp/simplesamlphp.tar.gz && \
mv /tmp/simplesamlphp-* /var/www/simplesamlphp && \
touch /var/www/simplesamlphp/modules/exampleauth/enable
COPY config/simplesamlphp/config.php /var/www/simplesamlphp/config
COPY config/simplesamlphp/authsources.php /var/www/simplesamlphp/config
COPY config/simplesamlphp/saml20-sp-remote.php /var/www/simplesamlphp/metadata
COPY config/simplesamlphp/server.crt /var/www/simplesamlphp/cert/
COPY config/simplesamlphp/server.pem /var/www/simplesamlphp/cert/
# Apache
COPY config/apache/ports.conf /etc/apache2
COPY config/apache/simplesamlphp.conf /etc/apache2/sites-available
COPY config/apache/cert.crt /etc/ssl/cert/cert.crt
COPY config/apache/private.key /etc/ssl/private/private.key
RUN echo "ServerName localhost" >> /etc/apache2/apache2.conf && \
a2enmod ssl && \
a2dissite 000-default.conf default-ssl.conf && \
a2ensite simplesamlphp.conf
COPY config/run-services.sh /var/www/simplesamlphp/config/run-services.sh
ENTRYPOINT ["/var/www/simplesamlphp/config/run-services.sh"]
# Set work dir
WORKDIR /var/www/simplesamlphp
# General setup
EXPOSE 8080 8443
Thanks @David.
With your help I was able to figure out that the python3 present inside the container was indeed not accessible.
So I had to install python3 and pip with the following command:
RUN apt update -y && apt upgrade -y && apt install -y python3 && apt install -y python3-pip

Running selenium with nodejs in docker env : xvfb failed to start

I am trying to run Selenium in Docker. First I found a Docker image from blueimp.
FROM blueimp/geckodriver
USER root
RUN apt-get update
RUN apt-get install -y --fix-missing x11-utils wget xclip firefox-esr xvfb xsel unzip libncurses5 libxslt-dev libxml2-dev libz-dev npm nodejs
RUN wget -q "https://github.com/mozilla/geckodriver/releases/download/v0.19.1/geckodriver-v0.19.1-linux64.tar.gz" -O /tmp/geckodriver.tgz \
&& tar zxf /tmp/geckodriver.tgz -C /usr/bin/ \
&& rm /tmp/geckodriver.tgz
RUN ln -s /usr/bin/geckodriver \
&& chmod 777 /usr/bin/geckodriver \
RUN /usr/bin/Xvfb :99 -ac -screen 0 1024x768x8 & export DISPLAY=":99"
RUN curl -L https://github.com/mozilla/geckodriver/releases/download/v0.24.0/geckodriver-v0.24.0-linux64.tar.gz > geckodriver-v0.24.0-linux64.tar.gz && tar -xzf geckodriver-v0.24.0-linux64.tar.gz && rm geckodriver-v0.24.0-linux64.tar.gz && mv geckodriver /usr/local/bin && chmod -R 777 /usr/local/bin
COPY package.json /src/package.json
RUN cd /src; npm install
COPY . /src
CMD ["node", "/src/app.js"]
This Dockerfile works fine and builds successfully. Without Xvfb, Selenium complains with this error: invalid argument: can't kill an exited process.
Then, according to this answer: https://stackoverflow.com/a/53198328/5677187, you can run Selenium against a virtual display. But when I try to run my Docker container, it fails with this error:
xvfb-run : xvfb failed to start
When I open a shell in the running container and execute Xvfb, the output is:
fatal server error:
(EE) Server is already active for display 0 . if this server is no
longer running, remove /tmp/.X0-lock
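The message says an X server already holds display 0, which it detects through the lock file /tmp/.X0-lock. One way around that kind of collision is to pick a display number whose lock file does not exist before starting Xvfb. A minimal sketch; the function name free_display and the starting number 99 (borrowed from the Dockerfile above) are illustrative assumptions, and the check is racy if several servers start at the same time:

```python
import os


def free_display(lock_dir="/tmp", start=99, stop=200):
    """Return the first display number in [start, stop) without an X lock file."""
    for number in range(start, stop):
        lock_file = os.path.join(lock_dir, ".X%d-lock" % number)
        if not os.path.exists(lock_file):
            return number
    raise RuntimeError("no free display between %d and %d" % (start, stop - 1))


if __name__ == "__main__":
    # Print in the form expected by the DISPLAY environment variable, e.g. ":99".
    print(":%d" % free_display())
```

The printed value could be handed to Xvfb and exported as DISPLAY by whatever script starts the Node app. Note that a process started in a RUN instruction does not survive into the running container, so Xvfb has to be started when the container starts, not during the build.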

using a docker app to make a new directory in an external hard drive

I am using a Docker container to execute a Python script located on my host machine. The script should make a new directory at a target location.
When the target location is under $HOME or $HOME/*, everything works. However, when I want to create a directory at /media/my_name/external_drive, the script fails with PermissionError: [Errno 13] Permission denied: '/media/my_name'
Here is the command I run:
sudo docker-compose run --rm --user="$(id -u):$(id -g)" main process_all.py
Here is docker-compose.yml:
version: '2.3'
services:
  main:
    build: .
    volumes:
      - .:/app
      - /etc/localtime:/etc/localtime:ro
    environment:
      - PYTHONIOENCODING=utf_8
    init: true
    network_mode: host
Here is the Dockerfile:
FROM ubuntu:16.04
# Install some basic utilities
RUN apt-get update && apt-get install -y \
curl \
ca-certificates \
sudo \
git \
bzip2 \
axel \
&& rm -rf /var/lib/apt/lists/*
# Create a working directory
RUN mkdir /app
WORKDIR /app
# Create a non-root user and switch to it
RUN adduser --disabled-password --gecos '' --shell /bin/bash user \
&& chown -R user:user /app
RUN echo "user ALL=(ALL) NOPASSWD:ALL" > /etc/sudoers.d/90-user
USER user
# All users can use /home/user as their home directory
ENV HOME=/home/user
RUN chmod 777 /home/user
# Install Miniconda
RUN curl -so ~/miniconda.sh https://repo.continuum.io/miniconda/Miniconda3-4.4.10-Linux-x86_64.sh \
&& chmod +x ~/miniconda.sh \
&& ~/miniconda.sh -b -p ~/miniconda \
&& rm ~/miniconda.sh
ENV PATH=/home/user/miniconda/bin:$PATH
# Create a Python 3.6 environment
RUN /home/user/miniconda/bin/conda install conda-build \
&& /home/user/miniconda/bin/conda create -y --name py36 python=3.6.4 \
&& /home/user/miniconda/bin/conda clean -ya
ENV CONDA_DEFAULT_ENV=py36
ENV CONDA_PREFIX=/home/user/miniconda/envs/$CONDA_DEFAULT_ENV
ENV PATH=$CONDA_PREFIX/bin:$PATH
# Ensure conda version is at least 4.4.11
# (because of this issue: https://github.com/conda/conda/issues/6811)
ENV CONDA_AUTO_UPDATE_CONDA=false
RUN conda install -y "conda>=4.4.11" && conda clean -ya
# Install FFmpeg
RUN conda install --no-update-deps -y -c conda-forge ffmpeg=3.2.4 \
&& conda clean -ya
# Install NumPy
RUN conda install --no-update-deps -y numpy=1.13.3 \
&& conda clean -ya
# Install build tools
RUN sudo apt-get update \
&& sudo apt-get install -y build-essential gfortran libncurses5-dev \
&& sudo rm -rf /var/lib/apt/lists/*
# Build and install CDF
RUN cd /tmp \
&& curl -O https://spdf.sci.gsfc.nasa.gov/pub/software/cdf/dist/cdf36_4/linux/cdf36_4-dist-all.tar.gz \
&& tar xzf cdf36_4-dist-all.tar.gz \
&& cd cdf36_4-dist \
&& make OS=linux ENV=gnu CURSES=yes FORTRAN=no UCOPTIONS=-O2 SHARED=yes all \
&& sudo make INSTALLDIR=/usr/local/cdf install
# Install other dependencies from pip
COPY requirements.txt .
RUN pip install -r requirements.txt
# Create empty SpacePy config (suppresses an annoying warning message)
RUN mkdir /home/user/.spacepy && echo "[spacepy]" > /home/user/.spacepy/spacepy.rc
# Copy scripts into the image
COPY --chown=user:user . /app
# Set the default command to python3
CMD ["python3"]
This is untested and from memory, but I would debug the issue with an interactive version of your container.
Something like:
sudo docker run -t -i --rm --user="$(id -u):$(id -g)" main /bin/bash
You'll get a bash shell. Then you can debug it with:
cd /media
ls -l
What I think you'll find is that the drive is probably not mounted inside the container, or that the user doesn't have permission to access it.
With regard to mounts, either pass the path through from the host or create a volume mount. I'm a little unsure about what you can do there, because many changes around mounting and volume drivers have been introduced since I last used Docker. The documentation on the Docker website is pretty good, though, so experiment.
This is the command-line reference for docker run: https://docs.docker.com/engine/reference/run/
The key is to use the -t -i parameters to make it interactive.
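The two checks suggested above (does the path exist inside the container, and can this user write to it) can also be scripted so they run with the same --user setting as the real job. A minimal sketch, assuming it is saved under a hypothetical name such as check_path.py next to process_all.py; the function name probe and the message strings are illustrative:

```python
import os
import sys
from pathlib import Path


def probe(target):
    """Walk from / down to target and report the first problem found."""
    path = Path(target)
    for candidate in [*reversed(path.parents), path]:
        try:
            os.stat(candidate)
        except FileNotFoundError:
            return "missing: %s" % candidate
        except PermissionError:
            return "permission denied: %s" % candidate
        except OSError as exc:
            return "error: %s (%s)" % (candidate, exc)
    # Creating entries in a directory needs both write and search permission.
    if os.access(path, os.W_OK | os.X_OK):
        return "writable: %s" % path
    return "not writable: %s" % path


if __name__ == "__main__":
    # Usage: python check_path.py /media/my_name/external_drive
    for target in sys.argv[1:]:
        print(probe(target))
```

With the docker-compose.yml from the question, only . and /etc/localtime are mounted, so a host path such as /media/my_name/external_drive would most likely be reported as missing until it is added under volumes.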

Azure Hybrid Worker Docker

I am currently trying to dockerize an Azure Hybrid Worker using the instructions provided at:
https://learn.microsoft.com/en-us/azure/automation/automation-linux-hrw-install
I am 90% successful; however, when I try to run the final step using onboarding.py, the script is not found in the location specified by the documentation. In fact, the file is not found anywhere in the container. Any help would be great.
FROM ubuntu:14.04
RUN apt-get update && \
apt-get -y install sudo
ENV user docker
RUN useradd -m -d /home/${user} ${user} && \
chown -R ${user} /home/${user} && \
adduser ${user} sudo && \
echo '%sudo ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers
USER ${user}
#WORKDIR /home/${user}
RUN sudo apt-get -y install apt-utils && \
sudo apt-get -y install openssl && \
sudo apt-get -y install curl && \
sudo apt-get -y install wget && \
sudo apt-get -y install cron && \
sudo apt-get -y install net-tools && \
sudo apt-get -y install auditd && \
sudo apt-get -y install python-ctypeslib
RUN sudo wget https://raw.githubusercontent.com/Microsoft/OMS-Agent-for-Linux/master/installer/scripts/onboard_agent.sh && \
sudo sh onboard_agent.sh -w <my-workplace-id> -s <my-workspace-key>
RUN sudo python /opt/microsoft/omsconfig/modules/nxOMSAutomationWorker/DSCResources/MSFT_nxOMSAutomationWorkerResource/automationworker/scripts/onboarding.py --register <arguments-removed-for-stackoverflow-post>
EXPOSE 443
Although I don't know the exact reason why it doesn't work yet, I have made some progress that I would like to share.
I've been experimenting with this problem by comparing a CentOS VM with a CentOS Docker container. Although I haven't been able to pinpoint exactly what is missing, I was able to get the onboarding.py file to show up in a CentOS Docker container.
The first thing I did was create a file listing the packages that are installed on a minimal CentOS VM. In my Dockerfile I read this file and install each package. I plan to cut down the file to see what's necessary for this to work.
The second thing is that you must have systemd, which is not installed by default. Here is what my Dockerfile looks like while I'm testing:
FROM centos:7
RUN yum -y update && yum install -y sudo
RUN (cd /lib/systemd/system/sysinit.target.wants/; for i in *; do [ $i == \
systemd-tmpfiles-setup.service ] || rm -f $i; done); \
rm -f /lib/systemd/system/multi-user.target.wants/*;\
rm -f /etc/systemd/system/*.wants/*;\
rm -f /lib/systemd/system/local-fs.target.wants/*; \
rm -f /lib/systemd/system/sockets.target.wants/*udev*; \
rm -f /lib/systemd/system/sockets.target.wants/*initctl*; \
rm -f /lib/systemd/system/basic.target.wants/*;\
rm -f /lib/systemd/system/anaconda.target.wants/*;
ENV user docker
RUN useradd -m -d /home/${user} ${user}
RUN chown -R ${user} /home/${user}
RUN echo "docker ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers
USER ${user}
WORKDIR /home/${user}
COPY ./install_packages .
RUN sudo yum install -y $(cat ./install_packages)
RUN sudo wget https://raw.githubusercontent.com/Microsoft/OMS-Agent-for-Linux/master/installer/scripts/onboard_agent.sh
CMD ["/usr/sbin/init"]
After that I use docker run to run my container locally and start systemd:
docker run -v /run -v /sys/fs/cgroup:/sys/fs/cgroup:ro -d container_id
I then exec into my container and run the onboard script:
sudo sh onboard_agent.sh -w 'xxx' -s 'xxx'
After it's done, you sometimes need to wait about 5 minutes for the missing folders to appear. To trigger this to happen sooner, you need to run this command:
/opt/microsoft/omsagent/bin/service_control restart {OMS_WORKSTATION_ID}
My understanding is this command will restart the OMS agent and it requires systemctl.
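Because the folders can take a few minutes to appear, a script that calls onboarding.py immediately after onboard_agent.sh may simply be too early. A minimal sketch of polling for the file before using it; the ten-minute timeout, the five-second interval and the function name wait_for_path are assumptions for illustration, and the path is the one quoted in the question:

```python
import os
import time

# Path of onboarding.py as given in the question.
ONBOARDING = (
    "/opt/microsoft/omsconfig/modules/nxOMSAutomationWorker/DSCResources/"
    "MSFT_nxOMSAutomationWorkerResource/automationworker/scripts/onboarding.py"
)


def wait_for_path(path, timeout=600.0, interval=5.0):
    """Poll until path exists; return True if it appeared, False on timeout."""
    deadline = time.monotonic() + timeout
    while not os.path.exists(path):
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
    return True


# Example call (not executed here): wait_for_path(ONBOARDING)
```

A wrapper script could call wait_for_path(ONBOARDING) and only run the --register step once it returns True.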
I understand this doesn't answer your question on how to get it working from building and running the container without having to remote into it. I'm still working on that and I'll let you know if I find an answer.
Good luck.
