Docker doesn't download recommended packages - linux

I am using docker for a Python application.
FROM python:3.5-slim
WORKDIR /abc
ADD . /abc
RUN apt-get update && \
apt-get install -y --no-install-recommends \
curl \
gcc \
python3-dev \
musl-dev \
&& \
pip install -r requirements.txt &&\
apt-get clean && \
rm -rf /var/lib/apt/lists/* &&\
apt-get purge -y --auto-remove gcc
Whenever I run the docker build command, it first runs the apt-get update command.
Along with the update, it also downloads many recommended packages, which makes the build take a long time.
How can I stop Ubuntu from installing recommended packages and make the Docker build faster?
Note: In the Dockerfile, apt-get --no-install-recommends update is not working; it's still downloading packages.

apt-get update should not install anything. The only thing apt-get update should do is update the local description of what packages are available. That does not download those packages though -- it just downloads the updated descriptions. That can take a while.
apt-get install will of course install packages. In order to install those packages, it needs to download them. Using --no-install-recommends tells apt-get to not install "recommended packages". For example, if you install vim, there are many plugins that are also recommended and provided as separate packages. With that switch, those vim plugins will not be installed. Of course, installing the packages you selected can also take a while.
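By chaining everything with && \ you put all of that into a single RUN instruction, which Docker builds as a single layer. Whenever that layer has to be rebuilt, every step in it runs again, from apt-get update through pip install. And apt-get update always has something new to fetch, because the list of packages changes every day, sometimes even multiple times per day.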
Try moving pip install -r requirements.txt to its own RUN command after you've run apt-get stuff. If that then does what you want, then I suggest reading and learning more about how Docker works under the hood. In particular, it's important to understand how each single command adds a new layer and how any dynamic information in a single layer can cause long build times because the layer will frequently change with large amounts of changes.
Additionally, you might want to move ADD . /abc to after the RUN commands. Any changes you've made to the files being added (source code, I assume) will invalidate the layer which represents the apt-get command that has been executed. Since it's been invalidated, it will need to be rebuilt. If you're actively working on and developing those projects, that can easily cause apt-get to be executed every time you build your image.
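The two suggestions above could be combined roughly as follows. This is only a sketch under some assumptions: it uses COPY instead of ADD, it copies requirements.txt on its own before running pip (an extra step the suggestions above do not mention), and it leaves out the gcc purge from the question.

```dockerfile
FROM python:3.5-slim
WORKDIR /abc

# System packages first. This layer is reused from cache until this RUN line changes.
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        curl \
        gcc \
        python3-dev \
        musl-dev && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# Assumption: requirements.txt sits next to the Dockerfile. Copying it alone means
# that edits to the source code do not invalidate the pip layer.
COPY requirements.txt /abc/
RUN pip install -r requirements.txt

# Application source last, so frequent code changes only rebuild this layer.
COPY . /abc
```

The purge is left out because removing gcc in a later RUN would not shrink the image: the files would still exist in the earlier layer. If image size matters more than build time, the install, pip, and purge steps have to stay in one RUN, as in the question.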
There are plenty of resources you can search for which discuss how to optimize your time when using Docker. I won't recommend any specific one and will leave it to you for learning.

Related

Azure: How to create an environment where the VM has a special package installed?

I am about to deploy a model on Azure but the model needs a special package installed on Ubuntu. My model is written in python and I have a python-wrapper installed (and other necessary pip packages) already in the environment.
The challenge is that the wrapper needs the special package to be installed on Ubuntu. How, and at what point, do I specify which packages should be installed on Ubuntu when creating the environment? The package is not a default one.
The following code snippet helped me to solve this. Just substitute the package you want to install for <package-1>.
FROM <prebuilt docker image from MCR>
# Switch to root to install apt packages
USER root:root
RUN apt-get update && \
apt-get install -y \
<package-1> \
...
<package-n> && \
apt-get clean -y && \
rm -rf /var/lib/apt/lists/*
# Switch back to non-root user
USER dockeruser
The complete tutorial can be found here: https://learn.microsoft.com/en-us/azure/machine-learning/how-to-extend-prebuilt-docker-image-inference

Fail building a Docker container early with bad list of packages for yum install

My Dockerfile wants some package not-here-yet that is not in the registered repositories.
RUN yum install -d 1 -y not-here-yet && yum clean all
This fails as expected. But because the container does not have dnf, the below returns exit code 0 even though it has the same problem.
RUN yum install -d 1 -y inotify-tools not-here-yet && yum clean all
yum's poor validation in this area leaves me with an incomplete container unless the Dockerfile's maintainer knows what commands make not-here-yet available.
Assuming for policy reasons that I cannot install dnf on this container, how do I make yum fail if any one package is not found in a list without copying the same RUN line over and over again to install one package at a time?
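One possible approach, offered as a sketch rather than a confirmed fix: follow the install with rpm -q on the same list of names. rpm -q exits with a non-zero status if any of the named packages is not installed, so the RUN step fails and the build stops. The base image below is a placeholder, and the check assumes every entry is a plain package name (not a group, a file path, or a virtual provide).

```dockerfile
# Placeholder: substitute the yum-based image actually in use.
FROM <your-base-image>

# yum may exit 0 even when not-here-yet is skipped, so verify afterwards:
# rpm -q returns non-zero when any listed package is missing.
RUN yum install -d 1 -y inotify-tools not-here-yet && \
    rpm -q inotify-tools not-here-yet && \
    yum clean all
```

The package list is written twice here. Keeping it in a single shell variable inside the RUN line would avoid the two copies drifting apart.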

Installing netstat on docker linux container

I want to install netstat on my Docker container.
I looked here https://askubuntu.com/questions/813579/netstat-or-alternative-in-docker-ubuntu-server-16-04-container so I'm trying to install it like this:
apt-get install net-tools
However, I'm getting:
Reading package lists... Done
Building dependency tree
Reading state information... Done
E: Unable to locate package net-tools
So how can I install netstat?
You need to run apt-get update first to download the current state of the package repositories. Docker images do not include these package lists, both to save space and because they would likely be outdated by the time you use the image. If you are doing this in a Dockerfile, keep the update and the install in a single RUN command so that layer caching cannot pair a stale cached update with a new package install request:
RUN apt-get update \
&& DEBIAN_FRONTEND=noninteractive apt-get install -y \
net-tools \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
netstat is provided by the net-tools package. net-tools is probably not installed by default in the Docker image for Ubuntu 16.04 in order to keep the image as small as possible.
Execute the following commands inside the Docker container:
apt update
apt install net-tools

In Docker, why is it recommended to run `apt-get` update in the Dockerfile?

Sorry, very new to server stuff, but very curious. Why run apt-get update when building a container?
My guess would be that it's for security purposes. If that's the case, then that'll answer the question.
apt-get update refreshes the local package index, so apt knows the latest available version of every package and its dependencies. It does not upgrade packages that are already installed. It's recommended that you always run apt-get update before apt-get install, so that the install picks up the latest version of the package.
RUN apt-get update -q -y && apt-get install -q -y <your-program>
(The -q flag makes apt run quietly, and -y answers yes to its confirmation prompts. Without -y, the prompt would cause the Docker build to fail.)
First, let's make a distinction between apt-get update and apt-get upgrade. The update gets the latest package index. This is so that you don't run into errors for outdated or removed packages when doing an apt-get install.
The upgrade actually goes through and upgrades the installed packages. It usually also requires a preceding update so that it has the current package index. This might be done if there are package or security concerns with already installed packages.
You usually see an update a lot in builds because the base image may have a fairly out-of-date package index, in which case just doing an apt-get install can fail.
The upgrade is less common, but can still be done if you want to ensure the latest packages are installed.
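To make the distinction concrete, here is a minimal sketch. The base image and the curl package are arbitrary choices for illustration; nothing in the question specifies them.

```dockerfile
# Assumption: any Debian- or Ubuntu-based image behaves the same way here.
FROM debian:bookworm-slim

# update: refresh the package index only; nothing is installed or upgraded.
# install: resolve curl (an arbitrary example package) against that fresh index.
RUN apt-get update -q && \
    apt-get install -q -y --no-install-recommends curl && \
    rm -rf /var/lib/apt/lists/*

# Optional and less common: upgrade packages already present in the base image.
# It needs its own update because the package lists were removed above.
RUN apt-get update -q && \
    apt-get upgrade -q -y && \
    rm -rf /var/lib/apt/lists/*
```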

Python3.X.X Proper Setup with Virtualenv and multiple installs to /OPT/

I have spent about a week trying to get Python 3.x.x set up “properly” on my system. It has been quite a battle, and I'm just about there, with one final obstacle I can't seem to resolve. Many forums discuss setting up Python 3.x.x on various distros, and each has different methods, goals, outcomes, and errors/issues, with no clear answer. By now I have put in over 100 hours and have broken and reinstalled my system from Clonezilla images dozens of times. But after all that, I have captured in this post all the steps necessary for the ultimate Python setup, minus the answer to the final obstacle, which I'm hoping someone can help me with:
The end goal I'm aiming for is the “ultimate python3.x.x setup” that I define as having the following characteristics:
It has clean installations/configurations of python3.x.x that are built from source, and includes multiple side-by-side python3.x.x versions (e.g. python3.0.1, python3.2.5, python3.3.0), with root permissions preserved on each folder under the default /opt/ directory
Each installation does not interfere with the system's default interpreter, has pip, easy_install, distribute tools, and virtualenv all properly configured and working, and can run in isolation with different modules via virtualenvs
Each python3.x.x is compiled, installed, and named clearly in the /opt/ directory (e.g. python3.3.0, python3.3.1, python3.3.2), and is configured such that calling or using that version from a terminal window does not break the system's default interpreter or its dependencies/packages (plenty of forums on this one)
Each python3.x.x works in PyCharm's stupidly simple and awesome virtualenv manager (my last hurdle)
The following steps are my setup so far, compiled from multiple forums, and they accomplish all of the above minus the last hurdle. Two important points: 1) I'm running Linux Mint 13 LTS, and 2) I have NOT run “sudo apt-get install python3” or any similar apt-get of python3xxxxx (this is deliberate, for reasons below).
These are the steps I have taken on a fresh install of Linux Mint 13. I now have it backed up as a Clonezilla image as well as VirtualBox images, which I'm using to solve this last hurdle.
Step 1:
This mega-command will download and set up PyCharm, including the program's Oracle (Sun) dependencies, and install everything to the /opt/ directory (i.e. the proper location). I simply accept Oracle's prompts and complete PyCharm's final installation prompts (e.g. accept license, trial period, etc.).
Pycharm
sudo add-apt-repository ppa:webupd8team/java -y && sudo apt-get update && sudo apt-get install oracle-java7-set-default -y && sudo apt-get install oracle-java7-installer -y && wget "http://download.jetbrains.com/python/pycharm-professional-3.0.2.tar.gz" && sudo mkdir /opt/Pycharm && sudo cp pycharm-professional-3.0.2.tar.gz /opt/Pycharm/ && cd /opt/Pycharm/ && sudo tar xvfz pycharm-professional-3.0.2.tar.gz && cd pycharm-3.0.2/bin && sudo sh pycharm.sh
Step 2:
This single command will download, extract, move, compile, and install 3.3.0, with all necessary prior dependencies, and place python3.3.0 in the /opt/ directory (the proper location).
Python3.3.0
sudo apt-get install build-essential libbz2-dev bzip2 zlib1g-dev sqlite3 libsqlite3-dev -y && wget http://python.org/ftp/python/3.3.0/Python-3.3.0.tgz && tar xvfz Python-3.3.0.tgz && cd Python-3.3.0 && ./configure --prefix=/opt/python3.3.0 && make && sudo make install
Step 3:
This single command will download, extract, move, compile, and install 3.2.5, with all necessary dependencies, and place python3.2.5 in the /opt/ directory.
Python3.2.5
wget http://www.python.org/ftp/python/3.2.5/Python-3.2.5.tgz && tar xvfz Python-3.2.5.tgz && cd Python-3.2.5 && ./configure --prefix=/opt/python3.2.5 && make && sudo make install
We now have PyCharm and two side-by-side installations of python3.3.0 and python3.2.5 that are built from source, installed in the /opt/ directory, and will not interfere with the system's python2.x.x interpreter or its dependencies/packages. Good so far, as this is a very clean setup... Now comes the final hurdle.
If I (or you) “sudo apt-get install python3-dev” from this point, along with a few other commands to set up and activate a virtualenv of python3.x.x, everything appears to work. That is, you can set up multiple Python3.x.x virtualenvs and run them with PyCharm, Eclipse, or from a terminal window, as either virtualenvs or non-virtualenvs. PyCharm makes it stupidly easy to manage virtually any configuration you want with its built-in virtualenv manager. The problem, though, is that doing “sudo apt-get install python3-dev” defeats the whole purpose of keeping python3.x.x as separate installations, and it runs the risk of 1) breaking python2.x.x packages, 2) installing pip packages meant for python3.x.x into python2.x.x directories, 3) limiting the user to python3.2 and lower, because you have to point whatever virtualenv you're using to the interpreter that came with running “sudo apt-get install python3-dev”, and 4) a plethora of other problems scattered throughout the forums I have investigated this week in trying to figure this all out. Therefore “sudo apt-get install python3-dev”, or any other apt-get of python3.x.x, is not a solution, as it leads to too many issues.
At this point I have a master VirtualBox image with all the above steps completed, which I keep cloning and retrying in order to get the compiled interpreters from /opt/ to function without doing a “sudo apt-get python3.xxx”. The 'key problem' indicated in the screenshot is this issue. Nothing I do seems to allow me to point it to the /opt/python3.xx/bin/pythonX interpreter, whether using an IDE like PyCharm or Eclipse, or a terminal. As soon as I run “sudo apt-get python3.xxx” it will work, but of course it inherits all the other nightmares that people scream about in forums when they go down that route. Any help is greatly appreciated...
screenshot http://www.pasteall.org/pic/show.php?id=65653
Every configuration I have tried for the interpreters that were compiled from source fails to let those python3.x.x installations function as virtualenvs, and thus use package managers like pip, either in a terminal window or with PyCharm/Eclipse. I have tried installing to home directories, changing permissions in /opt/, making symbolic links, practically everything that doesn't involve a “sudo apt-get install python3.xxxx”... This post (https://askubuntu.com/questions/406756/how-to-install-python-3-x-x-properly#406762), at step two, works, but only if you revert to doing a “sudo apt-get install python3”.
What you're looking for is pyenv. It manages your Python installations and lets you install new versions of Python without hampering the existing ones, and it works fine with PyCharm. It's written entirely in bash, so it does not need Python to be installed first.
I have finally figured out what I was doing wrong. I was not reading the make report and fixing the additional missing dependencies before installing. The main problem was the missing dependencies for the _ssl module, which is required for pip to work with python3.
I now have my "ultimate python setup"

Resources