Issue itself
Got an Azure Container Registry used as both image and chart storage. Assume it is myacr.azurecr.io, with 8 different charts pushed. As far as I've read, Azure ACR is capable of storing charts and is compatible with Helm 3 (version 3.5.2).
The following steps to reproduce are simple.
helm repo add myacr https://myacr.azurecr.io/helm/v1/repo --username myusername --password admin123 - repo added. OK.
helm chart save ./my-chart/ myacr.azurecr.io/helm/my-chart:1.0.0 - chart saved. OK
helm push ./my-chart/ myacr.azurecr.io/helm/my-chart:1.0.0 - pushed. Available in Azure portal. OK.
helm repo update - what could go wrong here? As expected. OK
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "ingress-nginx" chart repository
...Successfully got an update from the "jetstack" chart repository
...Successfully got an update from the "myacr" chart repository
Update Complete. ⎈Happy Helming!⎈
helm search repo -l - I see everything from ingress-nginx and jetstack but nothing from myacr in the list.
Yet if I do a pull and export, everything works fine - the chart is in place.
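That is, roughly (exact commands from memory, flags may differ):
helm chart pull myacr.azurecr.io/helm/my-chart:1.0.0
helm chart export myacr.azurecr.io/helm/my-chart:1.0.0 --destination ./charts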
What I tried
renaming the repo to helm/{app} according to some theories on the web - fail
reconfiguring the chart with full descriptions etc. according to ingress-nginx - fail
executing helm search repo -l --devel to see all possible chart versions - no luck
"Switching it off and on again" - removing and adding the repo again with different combinations - fail
explicit slang language on every attempt - warms up a bit but doesn't solve the issue
The questions are
Is Azure ACR fully compatible with Helm 3?
Is there any specific workaround to make it compatible with Helm 3?
Does search functionality have any requirements to chart structure or version?
Is Azure ACR fully compatible with Helm 3?
Yes, it's fully compatible with Helm 3.
Is there any specific workaround to make it compatible with Helm 3?
Nothing needs to be done, because the answer to the first question is yes.
Does search functionality have any requirements to chart structure or version?
You first need to add the repo to your local helm with the command az acr helm repo add --name myacr or helm repo add myacr https://myacr.azurecr.io/helm/v1/repo --username xxxxx --password xxxxxx, and then you get output like this when running the command helm search repo -l:
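For example (chart names and descriptions here are illustrative):
NAME            CHART VERSION   APP VERSION     DESCRIPTION
myacr/my-chart  1.0.0           1.0.0           A Helm chart for my application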
And the local repo looks like this:
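(Illustratively, from helm repo list:)
NAME    URL
myacr   https://myacr.azurecr.io/helm/v1/repo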
I've tried the latest Ubuntu and Python images - they successfully extract and mount. However, as soon as I install further dependencies, add my app, and push the image to Azure ACR, this error happens.
What is happening on my local machine: I have the Ubuntu image running, for example, I install pip3, and "docker commit" the changes locally, then tag the image and push it to ACR. This image will then fail to load with the above error. I can see that the layers from the previous image are already in the registry and only the latest layer is actually pushed. So the error appears to occur with the latest change to the image.
The full error message is:
2020-06-25T03:14:43.517Z ERROR - failed to register layer: Error processing tar file(exit status 1): Container ID 197609 cannot be mapped to a host IDErr: 0, Message: failed to register layer: Error processing tar file(exit status 1): Container ID 197609 cannot be mapped to a host ID
2020-06-25T03:14:43.589Z INFO - Pull Image failed, Time taken: 0 Minutes and 46 Seconds
2020-06-25T03:14:43.590Z ERROR - Pulling docker image *******.azurecr.io/seistech-1:v1.0.0.15 failed:
2020-06-25T03:14:43.590Z INFO - Pulling image from Docker hub: .azurecr.io/seistech-1:v1.0.0.15
2020-06-25T03:14:43.987Z ERROR - DockerApiException: Docker API responded with status code=InternalServerError, response={"message":"Get https://.azurecr.io/v2/*******-1/manifests/v1.0.0.15: unauthorized: authentication required, visit https://aka.ms/acr/authorization for more information."}
2020-06-25T03:14:44.020Z ERROR - Image pull failed: Verify docker image configuration and credentials (if using private repository)
2020-06-25T03:14:46.491Z INFO - Stopping site *******-dev-container because it failed during startup.
Note regarding the authentication message(s): I have created a system-assigned identity in the web app and assigned image pull permissions in the ACR - so as far as I can tell, there should be no auth issue.
Suggestions appreciated - very little diagnostic info to work with.
Thanks
Andy, NZ
In my case, it was related to npm v9, and only when using Azure App Service. npm v9 may install modules into your node_modules directory with a high ID for the file owner/creator.
My Solution:
As a quick workaround, I reverted to an older npm version (8.x). That worked for me for the moment.
Found this article to be helpful
NPM-specific issues causing users remapping exceptions
The managed identity cannot be used to authenticate when pulling the Docker image from the ACR. You must set environment variables such as:
DOCKER_REGISTRY_SERVER_USERNAME - The username for the ACR server.
DOCKER_REGISTRY_SERVER_URL - The full URL to the ACR server. (For example, https://my-server.azurecr.io.)
DOCKER_REGISTRY_SERVER_PASSWORD - The password for the ACR server.
This is the only way to pull the image from a private registry, and so for ACR. The managed identity only works once the Web App is running, but the image is pulled before that.
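These can be set on the Web App, for example with the Azure CLI (the resource group, app, and registry names below are placeholders):
az webapp config appsettings set --resource-group my-rg --name my-webapp --settings \
  DOCKER_REGISTRY_SERVER_URL=https://myacr.azurecr.io \
  DOCKER_REGISTRY_SERVER_USERNAME=myacr \
  DOCKER_REGISTRY_SERVER_PASSWORD=<acr-password>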
For some reason, it took me several days to find this documentation page.
The key piece of info in the container error log is this:
...container id xxxxxx cannot be mapped...
And the shell command to find these files is this:
$ find / \( -uid 1000000 \) -ls 2>/dev/null
(where 1000000 is the high ID from the error message)
The solution then was simple, in the final step of my docker-compose Dockerfile I added:
&& chown -R root:root /home
which is where the find command showed all the problematic files were - they were created by a tar -x operation. I haven't dug further into whether this caused the permissions problem.
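For illustration, the end of that Dockerfile now looks roughly like this (base image and paths are placeholders; the important part is the final chown):
FROM ubuntu:20.04
COPY app.tar.gz /home/
# extract the app, then reset ownership so no file carries a high UID into the image
RUN tar -xzf /home/app.tar.gz -C /home \
    && chown -R root:root /home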
I am using an estimator step for a pipeline, using the Environment class in order to have a custom Docker image, as I need some apt-get packages to be able to install a specific pip package. It appears from the logs that, unlike the non-pipeline version of the estimator, it's completely ignoring the docker portion of the environment. Very simply, this seems broken:
I'm running SDK v1.0.65, and my dockerfile is completely ignored. I'm using
FROM mcr.microsoft.com/azureml/base:latest\nRUN apt-get update && apt-get -y install freetds-dev freetds-bin vim gcc
in the base_dockerfile property of my code.
Here's a snippet of my code:
from azureml.core import Environment
from azureml.core.environment import CondaDependencies
conda_dep = CondaDependencies()
conda_dep.add_pip_package('pymssql==2.1.1')
myenv = Environment(name="mssqlenv")
myenv.python.conda_dependencies=conda_dep
myenv.docker.enabled = True
myenv.docker.base_dockerfile = 'FROM mcr.microsoft.com/azureml/base:latest\nRUN apt-get update && apt-get -y install freetds-dev freetds-bin vim gcc'
myenv.docker.base_image = None
This works well when I use an Estimator by itself, but if I insert this estimator in a Pipeline, it fails. Here's my code to launch it from a Pipeline run:
from azureml.pipeline.steps import EstimatorStep

sql_est_step = EstimatorStep(name="sql_step",
                             estimator=est,
                             estimator_entry_script_arguments=[],
                             runconfig_pipeline_params=None,
                             compute_target=cpu_cluster)

from azureml.pipeline.core import Pipeline
from azureml.core import Experiment

pipeline = Pipeline(workspace=ws, steps=[sql_est_step])
pipeline_run = exp.submit(pipeline)
When launching this, the logs for the container building service reveal:
FROM continuumio/miniconda3:4.4.10... etc.
Which indicates it's ignoring my FROM mcr.... statement in the Environment class I've associated with this Estimator, and my pip install fails.
Am I missing something? Is there a workaround?
I can confirm that this is a bug on the AML Pipeline side. Specifically, the runconfig property environment.docker.base_dockerfile is not being passed through correctly in pipeline jobs. We are working on a fix. In the meantime, you can use the workaround from this thread of building the docker image first and specifying it with environment.docker.base_image (which is passed through correctly).
I found a workaround for now, which is to build your own Docker image. You can do this by using these options of the DockerSection of the Environment:
myenv.docker.base_image_registry.address = '<your_acr>.azurecr.io'
myenv.docker.base_image_registry.username = '<your_acr>'
myenv.docker.base_image_registry.password = '<your_acr_password>'
myenv.docker.base_image = '<your_acr>.azurecr.io/testimg:latest'
and obviously use whichever Docker image you built and pushed to the container registry linked to the Azure Machine Learning Workspace.
To create the image, you would run something like this at the command line of a machine that can build a Linux-based container (like a Notebook VM):
docker build . -t <your_image_name>
# Tag it for upload
docker tag <your_image_name>:latest <your_acr>.azurecr.io/<your_image_name>:latest
# Login to Azure
az login
# login to the container registry so that the push will work
az acr login --name <your_acr>
# push the image
docker push <your_acr>.azurecr.io/<your_image_name>:latest
Once the image is pushed, you should be able to get that working.
I also initially used EstimatorStep for custom images, but recently figured out how to successfully pass Environments first to RunConfigurations, then to PythonScriptSteps (example below).
Another workaround, similar to yours, would be to publish your custom Docker image to Docker Hub; then the docker_base_image param becomes the URI, in our case mmlspark:0.16.
def get_environment(env_name, yml_path, user_managed_dependencies, enable_docker, docker_base_image):
    env = Environment(env_name)
    cd = CondaDependencies(yml_path)
    env.python.conda_dependencies = cd
    env.python.user_managed_dependencies = user_managed_dependencies
    env.docker.enabled = enable_docker
    env.docker.base_image = docker_base_image
    return env

spark_env = f.get_environment(env_name='spark_env',
                              yml_path=os.path.join(os.getcwd(), 'compute/aml_config/spark_compute_dependencies.yml'),
                              user_managed_dependencies=False, enable_docker=True,
                              docker_base_image='microsoft/mmlspark:0.16')

# use pyspark framework
spark_run_config = RunConfiguration(framework="pyspark")
spark_run_config.environment = spark_env

roll_step = PythonScriptStep(
    name='rolling window',
    script_name='roll.py',
    arguments=['--input_dir', joined_data,
               '--output_dir', rolled_data,
               '--script_dir', ".",
               '--min_date', '2015-06-30',
               '--pct_rank', 'True'],
    compute_target=compute_target_spark,
    inputs=[joined_data],
    outputs=[rolled_data],
    runconfig=spark_run_config,
    source_directory=os.path.join(os.getcwd(), 'compute', 'roll'),
    allow_reuse=pipeline_reuse
)
A couple of other points (that may be wrong):
PythonScriptStep is effectively a wrapper for ScriptRunConfig, which takes run_config as an argument
Estimator is a wrapper for ScriptRunConfig where RunConfig settings are made available as parameters
IMHO EstimatorStep shouldn't exist because it is better to define Env's and Steps separately instead of at the same time in one call.
I'm trying to change the tag of a Docker image using a Docker task on an Azure DevOps pipeline, without success.
Consider the following Docker image hosted on an Azure container registry:
My task is configured as follows:
$(DockerImageName) value is agents/standard-linux-docker2:310851
I'm trying to change the Docker image tag (e.g. to latest), but so far I haven't been able to make it work. I've also tried setting the arguments, without success.
Task fails with the following error message:
Error response from daemon: No such image: agents/standard-linux-docker2:310851
/usr/bin/docker failed with return code: 1
What am I missing here?
Try using an Azure CLI task. Run the following command and select the options shown in the image.
az acr import --name xxxxxacr --source xxxxxacr.azurecr.io/xxx/xxx-api:stage --image xxxxyyyyyyy/yyyyyyyy-api:prod --force
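In YAML, this would look roughly like the following (the service connection name is a placeholder):
- task: AzureCLI@2
  inputs:
    azureSubscription: 'my-service-connection'
    scriptType: 'bash'
    scriptLocation: 'inlineScript'
    inlineScript: |
      az acr import --name xxxxxacr \
        --source xxxxxacr.azurecr.io/xxx/xxx-api:stage \
        --image xxxxyyyyyyy/yyyyyyyy-api:prod --force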
I've been experimenting with Docker recently on building some services to play around with and one thing that keeps nagging me has been putting passwords in a Dockerfile. I'm a developer so storing passwords in source feels like a punch in the face. Should this even be a concern? Are there any good conventions on how to handle passwords in Dockerfiles?
Definitely it is a concern. Dockerfiles are commonly checked in to repositories and shared with other people. An alternative is to provide any credentials (usernames, passwords, tokens, anything sensitive) as environment variables at runtime. This is possible via the -e argument (for individual vars on the CLI) or the --env-file argument (for multiple variables in a file) to docker run. Read this for using environment variables with docker-compose.
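For example (image, file, and variable names are placeholders):
docker run -e DB_PASSWORD=mysecret myapp:latest
docker run --env-file ./app.env myapp:latest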
Using --env-file is definitely a safer option since this protects against the secrets showing up in ps or in logs if one uses set -x.
However, env vars are not particularly secure either. They are visible via docker inspect, and hence they are available to any user that can run docker commands. (Of course, any user that has access to docker on the host also has root anyway.)
My preferred pattern is to use a wrapper script as the ENTRYPOINT or CMD. The wrapper script can first import secrets from an outside location in to the container at run time, then execute the application, providing the secrets. The exact mechanics of this vary based on your run time environment. In AWS, you can use a combination of IAM roles, the Key Management Service, and S3 to store encrypted secrets in an S3 bucket. Something like HashiCorp Vault or credstash is another option.
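A minimal sketch of such a wrapper, assuming the secret is delivered via a mounted file (a Vault or KMS lookup would go in its place):
#!/bin/sh
# entrypoint.sh - pull in secrets at run time, then hand off to the application
set -e

# Read the secret from a mounted volume (replace with a Vault/KMS/S3 fetch as needed)
export DB_PASSWORD="$(cat /secrets/db_password)"

# exec so the application becomes PID 1 and receives signals correctly
exec "$@"
The Dockerfile would then set ENTRYPOINT ["/entrypoint.sh"] and pass the real command via CMD, with the secrets directory mounted at run time.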
AFAIK there is no optimal pattern for using sensitive data as part of the build process. In fact, I have an SO question on this topic. You can use docker-squash to remove layers from an image. But there's no native functionality in Docker for this purpose.
You may find shykes comments on config in containers useful.
Our team avoids putting credentials in repositories, so that means they're not allowed in Dockerfiles. Our best practice within applications is to use creds from environment variables.
We solve for this using docker-compose.
Within docker-compose.yml, you can specify a file that contains the environment variables for the container:
env_file:
- .env
Make sure to add .env to .gitignore, then set the credentials within the .env file like:
SOME_USERNAME=myUser
SOME_PWD_VAR=myPwd
Store the .env file locally or in a secure location where the rest of the team can grab it.
See: https://docs.docker.com/compose/environment-variables/#/the-env-file
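Putting it together, a minimal docker-compose.yml sketch (service and image names are placeholders):
version: '3'
services:
  app:
    image: myapp:latest
    env_file:
      - .env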
Docker now (version 1.13 or 17.06 and higher) has support for managing secret information. Here's an overview and more detailed documentation
A similar feature exists in Kubernetes and DC/OS.
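For instance, in Kubernetes a secret can be created and then injected as an environment variable (names below are placeholders):
kubectl create secret generic db-creds --from-literal=password=myPwd

# pod.yaml - consumes the secret as an environment variable
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: myapp:latest
      env:
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-creds
              key: password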
You should never add credentials to a container unless you're OK broadcasting the creds to whoever can download the image. In particular, doing an ADD creds and later a RUN rm creds is not secure because the creds file remains in the final image in an intermediate filesystem layer. It's easy for anyone with access to the image to extract it.
The typical solution I've seen when you need creds to checkout dependencies and such is to use one container to build another. I.e., typically you have some build environment in your base container and you need to invoke that to build your app container. So the simple solution is to add your app source and then RUN the build commands. This is insecure if you need creds in that RUN. Instead what you do is put your source into a local directory, run (as in docker run) the container to perform the build step with the local source directory mounted as volume and the creds either injected or mounted as another volume. Once the build step is complete you build your final container by simply ADDing the local source directory which now contains the built artifacts.
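A rough sketch of that flow (image names, paths, and the mounted credentials are placeholders):
# Build inside a throwaway container, mounting the source and the creds as volumes
docker run --rm \
  -v "$(pwd)/src:/src" \
  -v "$HOME/.ssh:/root/.ssh:ro" \
  my-build-env sh -c "cd /src && make build"

# The final Dockerfile only ADDs the already-built artifacts from ./src, no creds involved
docker build -t my-app:latest .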
I'm hoping Docker adds some features to simplify all this!
Update: looks like the method going forward will be to have nested builds. In short, the Dockerfile would describe a first container that is used to build the run-time environment, and then a second nested container build that can assemble all the pieces into the final container. This way the build-time stuff isn't in the second container. Think of a Java app where you need the JDK for building the app but only the JRE for running it. There are a number of proposals being discussed; best to start from https://github.com/docker/docker/issues/7115 and follow some of the links for alternate proposals.
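For the Java example, a multi-stage Dockerfile along these lines captures the idea (image tags and paths are illustrative):
# Build stage: needs the JDK and build tooling
FROM maven:3-openjdk-11 AS build
WORKDIR /app
COPY . .
RUN mvn -q package

# Run stage: only the JRE and the built artifact
FROM openjdk:11-jre-slim
COPY --from=build /app/target/app.jar /app.jar
CMD ["java", "-jar", "/app.jar"]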
An alternative to using environment variables, which can get messy if you have a lot of them, is to use volumes to make a directory on the host accessible in the container.
If you put all your credentials as files in that folder, then the container can read the files and use them as it pleases.
For example:
$ echo "secret" > /root/configs/password.txt
$ docker run -v /root/configs:/cfg ...
In the Docker container:
# echo Password is `cat /cfg/password.txt`
Password is secret
Many programs can read their credentials from a separate file, so this way you can just point the program to one of the files.
run-time only solution
docker-compose also provides a non-swarm-mode solution (since v1.11: Secrets using bind mounts).
The secrets are mounted as files below /run/secrets/ by docker-compose. This solves the problem at run-time (running the container), but not at build-time (building the image), because /run/secrets/ is not mounted at build-time. Furthermore this behavior depends on running the container with docker-compose.
Example:
Dockerfile
FROM alpine
CMD cat /run/secrets/password
docker-compose.yml
version: '3.1'

services:
  app:
    build: .
    secrets:
      - password

secrets:
  password:
    file: password.txt
To build, execute:
docker-compose up -d
Further reading:
mikesir87's blog - Using Docker Secrets during Development
My approach seems to work, but is probably naive. Tell me why it is wrong.
ARGs set during docker build are exposed by the history subcommand, so no go there. However, when running a container, environment variables given in the run command are available to the container, but are not part of the image.
So, in the Dockerfile, do setup that does not involve secret data. Set a CMD of something like /root/finish.sh. In the run command, use environment variables to send secret data into the container. finish.sh uses the variables essentially to finish build tasks.
To make managing the secret data easier, put it into a file that is loaded by docker run with the --env-file switch. Of course, keep the file secret. .gitignore and such.
For me, finish.sh runs a Python program. It checks to make sure it hasn't run before, then finishes the setup (e.g., copies the database name into Django's settings.py).
There is a new docker command for "secrets" management. But that only works for swarm clusters.
docker service create \
  --name my-iis \
  --publish published=8000,target=8000 \
  --secret src=homepage,target="\inetpub\wwwroot\index.html" \
  microsoft/iis:nanoserver
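The secret referenced as src=homepage has to be created in the swarm first, e.g. from a local file (the file name is a placeholder):
docker secret create homepage index.html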
The issue 13490 "Secrets: write-up best practices, do's and don'ts, roadmap" just got a new update in Sept. 2020, from Sebastiaan van Stijn:
Build time secrets are now possible when using buildkit as builder; see the blog post "Build secrets and SSH forwarding in Docker 18.09", Nov. 2018, from Tõnis Tiigi.
The documentation is updated: "Build images with BuildKit"
The RUN --mount option used for secrets will graduate to the default (stable) Dockerfile syntax soon.
That last part is new (Sept. 2020)
New Docker Build secret information
The new --secret flag for docker build allows the user to pass secret information to be used in the Dockerfile for building docker images in a safe way that will not end up stored in the final image.
id is the identifier to pass into the docker build --secret.
This identifier is associated with the RUN --mount identifier to use in the Dockerfile.
Docker does not use the filename of where the secret is kept outside of the Dockerfile, since this may be sensitive information.
dst renames the secret file to a specific file in the Dockerfile RUN command to use.
For example, with a secret piece of information stored in a text file:
$ echo 'WARMACHINEROX' > mysecret.txt
And with a Dockerfile that specifies use of a BuildKit frontend docker/dockerfile:1.0-experimental, the secret can be accessed.
For example:
# syntax = docker/dockerfile:1.0-experimental
FROM alpine
# shows secret from default secret location:
RUN --mount=type=secret,id=mysecret cat /run/secrets/mysecret
# shows secret from custom secret location:
RUN --mount=type=secret,id=mysecret,dst=/foobar cat /foobar
This Dockerfile is only to demonstrate that the secret can be accessed. As you can see, the secret is printed in the build output. The final image built will not have the secret file:
$ docker build --no-cache --progress=plain --secret id=mysecret,src=mysecret.txt .
...
#8 [2/3] RUN --mount=type=secret,id=mysecret cat /run/secrets/mysecret
#8 digest: sha256:5d8cbaeb66183993700828632bfbde246cae8feded11aad40e524f54ce7438d6
#8 name: "[2/3] RUN --mount=type=secret,id=mysecret cat /run/secrets/mysecret"
#8 started: 2018-08-31 21:03:30.703550864 +0000 UTC
#8 1.081 WARMACHINEROX
#8 completed: 2018-08-31 21:03:32.051053831 +0000 UTC
#8 duration: 1.347502967s
#9 [3/3] RUN --mount=type=secret,id=mysecret,dst=/foobar cat /foobar
#9 digest: sha256:6c7ebda4599ec6acb40358017e51ccb4c5471dc434573b9b7188143757459efa
#9 name: "[3/3] RUN --mount=type=secret,id=mysecret,dst=/foobar cat /foobar"
#9 started: 2018-08-31 21:03:32.052880985 +0000 UTC
#9 1.216 WARMACHINEROX
#9 completed: 2018-08-31 21:03:33.523282118 +0000 UTC
#9 duration: 1.470401133s
...
The 12-Factor app methodology says that any configuration should be stored in environment variables.
Docker Compose can do variable substitution in its configuration, so that can be used to pass passwords from the host to Docker.
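For example, in docker-compose.yml (service and variable names are placeholders), ${DB_PASSWORD} is substituted from the host environment:
version: '3'
services:
  db:
    image: postgres
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}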
Starting from version 20.10, besides using a secret file, you can also provide secrets directly via an environment variable.
buildkit: secrets: allow providing secrets with env moby/moby#41234 docker/cli#2656 moby/buildkit#1534
Support --secret id=foo,env=MY_ENV as an alternative for storing a secret value to a file.
--secret id=GIT_AUTH_TOKEN will load env if it exists and the file does not.
secret-file:
THIS IS SECRET
Dockerfile:
# syntax = docker/dockerfile:1.3
FROM python:3.8-slim-buster
COPY build-script.sh .
RUN --mount=type=secret,id=mysecret ./build-script.sh
build-script.sh:
cat /run/secrets/mysecret
Execution:
$ export MYSECRET=theverysecretpassword
$ export DOCKER_BUILDKIT=1
$ docker build --progress=plain --secret id=mysecret,env=MYSECRET -t abc:1 . --no-cache
......
#9 [stage-0 3/3] RUN --mount=type=secret,id=mysecret ./build-script.sh
#9 sha256:e32137e3eeb0fe2e4b515862f4cd6df4b73019567ae0f49eb5896a10e3f7c94e
#9 0.931 theverysecretpassword#9 DONE 1.5s
......
With Docker v1.9 you can use the ARG instruction to fetch arguments passed on the command line to the image at build time. Simply use the --build-arg flag. This way you can avoid keeping explicit passwords (or other sensitive information) in the Dockerfile and pass them on the fly.
source: https://docs.docker.com/engine/reference/commandline/build/ http://docs.docker.com/engine/reference/builder/#arg
Example:
Dockerfile
FROM busybox
ARG user
RUN echo "user is $user"
build image command
docker build --build-arg user=capuccino -t test_arguments -f path/to/dockerfile .
during the build it prints:
$ docker build --build-arg user=capuccino -t test_arguments -f ./test_args.Dockerfile .
Sending build context to Docker daemon 2.048 kB
Step 1 : FROM busybox
---> c51f86c28340
Step 2 : ARG user
---> Running in 43a4aa0e421d
---> f0359070fc8f
Removing intermediate container 43a4aa0e421d
Step 3 : RUN echo "user is $user"
---> Running in 4360fb10d46a
**user is capuccino**
---> 1408147c1cb9
Removing intermediate container 4360fb10d46a
Successfully built 1408147c1cb9
Hope it helps! Bye.
Something simple like this will work, I guess, if it is a bash shell:
read -sp "db_password: " db_password && docker build --build-arg mysql_db_password=$db_password -t <image_name> .
Simply read it silently and pass it as a build argument to the Docker image. You need to accept the variable as an ARG in the Dockerfile.
While I totally agree there is no simple solution, there continues to be a single point of failure: the Dockerfile, etcd, and so on. Apcera has a plan that looks like sidekick - dual authentication. In other words, two containers cannot talk unless there is an Apcera configuration rule. In their demo the uid/pwd was in the clear and could not be reused until the admin configured the linkage. For this to work, however, it probably meant patching Docker or at least the network plugin (if there is such a thing).