I followed the pip section of the Azure documentation on pipeline caching to speed up my Azure DevOps CI pipeline (in particular the dependency installation step). However, the packages are still installed from scratch every time I execute the pipeline, and I would ideally like to cache the installation as well. How can I achieve this?
The Azure DevOps documentation is a bit lackluster here: following the pip section only caches the downloaded wheels, not the installation itself (which you can also cache to further reduce pipeline execution time). To enable this, you need to work with a virtual environment (such as venv or a conda environment) and cache the entire environment; a venv-based variant is sketched after the conda example below.
Below is a code example with conda showing how to cache an entire installed environment:
variables:
  CONDA_ENV_NAME: "unit_test"
  # $(CONDA) is the conda installation path (pre-populated on 'ubuntu-latest' VMs)
  CONDA_ENV_DIR: $(CONDA)/envs/$(CONDA_ENV_NAME)

steps:
- script: echo "##vso[task.prependpath]$CONDA/bin"
  displayName: Add conda to PATH

- task: Cache@2
  displayName: Use cached Anaconda environment
  inputs:
    key: 'conda | "$(Agent.OS)" | requirements.txt'
    path: $(CONDA_ENV_DIR)
    cacheHitVar: CONDA_CACHE_RESTORED

- bash: conda create --yes --quiet --name $(CONDA_ENV_NAME)
  displayName: Create Anaconda environment
  condition: eq(variables.CONDA_CACHE_RESTORED, 'false')

- bash: |
    source activate $(CONDA_ENV_NAME)
    pip install -r requirements.txt
  displayName: Install dependencies
  condition: eq(variables.CONDA_CACHE_RESTORED, 'false')

# Optional step: install your package and run the tests (do not cache this step)
- bash: |
    source activate $(CONDA_ENV_NAME)
    pip install --no-deps .
    pytest .
  displayName: Install package and execute unit tests
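If you prefer a plain venv over conda, the same pattern applies: cache the virtual environment directory keyed on requirements.txt and skip the creation/install step on a cache hit. A minimal sketch, assuming a requirements.txt at the repository root (the VENV_DIR location and variable names are illustrative, not taken from the Azure docs):

variables:
  VENV_DIR: $(Pipeline.Workspace)/.venv  # illustrative location for the cached environment

steps:
- task: Cache@2
  displayName: Use cached virtual environment
  inputs:
    key: 'venv | "$(Agent.OS)" | requirements.txt'
    path: $(VENV_DIR)
    cacheHitVar: VENV_CACHE_RESTORED

- bash: |
    python -m venv $(VENV_DIR)
    source $(VENV_DIR)/bin/activate
    pip install -r requirements.txt
  displayName: Create venv and install dependencies
  condition: eq(variables.VENV_CACHE_RESTORED, 'false')

- bash: |
    source $(VENV_DIR)/bin/activate
    pip install --no-deps .
    pytest .
  displayName: Install package and execute unit tests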
I have two repositories.
The first is built by Azure DevOps Pipelines into a .whl file and published to an Azure DevOps Artifact feed. (This works.)
The second should also be built by Azure DevOps Pipelines and published to Azure DevOps Artifacts, but it depends on the first one and needs to install it from the Azure DevOps Artifact feed during the build process. (This does not work.)
I can install it locally, but the pipeline of the second package fails with the following error:
401 Client Error: Unauthorized for url:
https://pkgs.dev.azure.com/<company>/<some-numbers>/_packaging/<some-numbers>/pypi/download/<mypackage>/0.0.1.9/<mypackage>-0.0.1.9-py3-none-any.whl#sha256=<some-numbers>
---------------------------------- SETUP ----------------------------------
I added the feed as a secondary source to the pyproject.toml of my second repository; this allows me to successfully install the first package with poetry add <firstpackage> and poetry install in my local IDE:
[[tool.poetry.source]]
name = "azure"
url = "https://pkgs.dev.azure.com/<company>/<some-numbers>/_packaging/<feed-name>/pypi/simple/"
secondary = true
YAML script to install packages via poetry - this works for the first repository, but not for the second repository, which needs to install the first package from the Azure DevOps Artifacts feed (the first repository installs everything from pypi.org):
- script: |
    python -m pip install -U pip
    pip install poetry==1.1.3  # Install poetry via pip to pin the version
    poetry install
  displayName: Install software
YAML script to publish a package to an Azure DevOps Artifact feed (using a personal access token for authentication) - works:
- script: |
    poetry config repositories.azure https://pkgs.dev.azure.com/<company>/<somenumbers>/_packaging/<feed-name>/pypi/upload/
    poetry config http-basic.azure usernamedoesnotmatter $(pat)
    poetry publish --repository azure
    exit 0
  displayName: Publish package
I am not checking my Personal Access Token (PAT) into my repository.
I added the PipAuthenticate@1 task, which sets the PIP_EXTRA_INDEX_URL environment variable containing a PAT. In the script, I extract the PAT and use it to configure poetry.
azure-pipelines.yaml (partial):
- task: PipAuthenticate@1
  displayName: 'Pip Authenticate'
  inputs:
    artifactFeeds: '<some-numbers>/<feed-name>'
    onlyAddExtraIndex: True

- script: |
    python -m pip install --upgrade pip
    pip install poetry
    export PAT=$(echo "$PIP_EXTRA_INDEX_URL" | sed 's/.*build:\(.*\)@pkgs.*/\1/')
    poetry config http-basic.azure build "$PAT"
    poetry install
  displayName: "Install dependencies"
Turns out, I just needed to configure poetry in the pipeline before the install for the second repository - the same as I did locally a long time ago (and forgot about it).
- script: |
    python -m pip install -U pip
    pip install poetry==1.1.3  # Install poetry via pip to pin the version
    # Configure the feed as a secondary source for poetry
    poetry config repositories.azure https://pkgs.dev.azure.com/<company>/<some-numbers>/_packaging/<feed-name>/pypi/simple/
    poetry config http-basic.azure userNameDoesntMatter $(pat)
    poetry install
  displayName: Install software
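If you would rather not store the PAT as a separate pipeline secret, the same configuration can be fed from the PipAuthenticate@1 task shown earlier in the question: the task writes a short-lived token into PIP_EXTRA_INDEX_URL, which can then be handed to poetry. This is only a sketch combining the two snippets above, and it assumes the generated URL has the usual https://build:<token>@pkgs.dev.azure.com/... shape:

- task: PipAuthenticate@1
  displayName: 'Pip Authenticate'
  inputs:
    artifactFeeds: '<some-numbers>/<feed-name>'
    onlyAddExtraIndex: true

- script: |
    python -m pip install -U pip
    pip install poetry==1.1.3
    # Extract the token that PipAuthenticate placed into PIP_EXTRA_INDEX_URL
    export PAT=$(echo "$PIP_EXTRA_INDEX_URL" | sed 's/.*build:\(.*\)@pkgs.*/\1/')
    poetry config repositories.azure https://pkgs.dev.azure.com/<company>/<some-numbers>/_packaging/<feed-name>/pypi/simple/
    poetry config http-basic.azure build "$PAT"
    poetry install
  displayName: Install software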
Requesting your help with connecting to Azure Artifacts from Azure Pipelines.
My Azure Pipeline builds an image from a Dockerfile, and a 'requirements' file has the list of packages to be pip installed. In the pipeline I authenticate to my Azure Artifacts feed using the PipAuthenticate@1 task; the authentication is successful and the URL is passed as an argument to the Dockerfile.
However, I can see that the packages are installed from external links but are not saved to my artifact feed.
The artifact feed 'testartifact' is currently empty, so it is correct that pip goes to the external link to download the packages. But I was expecting the packages to then be saved in the 'testartifact' feed, so that on the next docker build they would be taken directly from the testartifact feed. Is my assumption correct?
If so, could you help me figure out what I am missing in the code that prevents the packages from being saved to my artifact feed?
Here is the Azure Pipeline YAML file and the Dockerfile. Also attached is the log of the package download.
Thanks for your time!
pool:
  vmImage: 'ubuntu-latest'

# Set variables
variables:
  imageversion: 1.0
  artifactFeed: testartifact

stages:
- stage: DevDeploy
  jobs:
  - job: DevBuildandPushImage
    steps:
    - bash: echo DevDeploy

    - task: PipAuthenticate@1
      displayName: 'Pip Authenticate'
      inputs:
        artifactFeeds: $(artifactFeed)
        onlyAddExtraIndex: true

    - bash: echo "##vso[task.setvariable variable=artifactoryUrl;]$PIP_EXTRA_INDEX_URL"
    - bash: echo $PIP_EXTRA_INDEX_URL

    - task: Docker@2
      inputs:
        containerRegistry: 'testcontaineregistry'
        repository: 'testrepository'
        command: 'build'
        Dockerfile: '**/dockerfile'
        arguments: '--build-arg PIP_EXTRA_URL=$(PIP_EXTRA_INDEX_URL)'
Part of the Dockerfile:
ARG PIP_EXTRA_URL
ENV PIP_EXTRA_INDEX_URL=$PIP_EXTRA_URL
RUN echo 'PIP_EXTRA_INDEX_URL'$PIP_EXTRA_INDEX_URL
# Install Python Packages & Requirements
COPY requirements requirements
RUN pip3 install -r requirements --extra-index-url $PIP_EXTRA_URL
Part of the log:
2020-07-16T17:39:05.0301632Z Step 8/28 : RUN echo 'PIP_EXTRA_INDEX_URL'$PIP_EXTRA_INDEX_URL
2020-07-16T17:39:05.4787725Z PIP_EXTRA_INDEX_URLhttps://build:***@XXXXXXX.pkgs.visualstudio.com/_packaging/testartifact/pypi/simple
2020-07-16T17:39:06.1264997Z Step 9/28 : COPY requirements requirements
2020-07-16T17:39:07.0309036Z Step 10/28 : RUN pip3 install -r requirements --extra-index-url $PIP_EXTRA_URL
2020-07-16T17:39:08.3873873Z Collecting pypyodbc (from -r requirements (line 1))
2020-07-16T17:39:08.7139882Z Downloading https://files.pythonhosted.org/packages/ea/48/bb5412846df5b8f97d42ac24ac36a6b77a802c2778e217adc0d3ec1ee7bf/pypyodbc-1.3.5.2.zip
2020-07-16T17:39:08.9900873Z Collecting pyodbc (from -r requirements (line 2))
2020-07-16T17:39:09.2421266Z Downloading https://files.pythonhosted.org/packages/81/0d/bb08bb16c97765244791c73e49de9fd4c24bb3ef00313aed82e5640dee5d/pyodbc-4.0.30.tar.gz (266kB)
2020-07-16T17:39:09.4960835Z Collecting xlrd (from -r requirements (line 3))
2020-07-16T17:39:09.6500787Z Downloading https://files.pythonhosted.org/packages/b0/16/63576a1a001752e34bf8ea62e367997530dc553b689356b9879339cf45a4/xlrd-1.2.0-py2.py3-none-any.whl (103kB)
2020-07-16T17:39:09.6782714Z Collecting pandas (from -r requirements (line 4))
2020-07-16T17:39:10.2506552Z Downloading https://files.pythonhosted.org/packages/c0/95/cb9820560a2713384ef49060b0087dfa2591c6db6f240215c2bce1f4211c/pandas-1.0.5-cp36-cp36m-manylinux1_x86_64.whl (10.1MB)
2020-07-16T17:39:11.4371150Z Collecting datetime (from -r requirements (line 5))
2020-07-16T17:39:11.6083120Z Downloading https://files.pythonhosted.org/packages/73/22/a5297f3a1f92468cc737f8ce7ba6e5f245fcfafeae810ba37bd1039ea01c/DateTime-4.3-py2.py3-none-any.whl (60kB)
2020-07-16T17:39:11.6289946Z Collecting azure-storage-blob (from -r requirements (line 6))
From the task log, the Python packages are restored from external links. You need to make sure that the packages are installed through the feed's upstream source; then the packages will exist in the feed after installation.
Here are the steps:
Step 1: Add the Python upstream source to the feed.
Step 2: Use the PipAuthenticate task to get $PIP_EXTRA_INDEX_URL.
Step 3: Use $PIP_EXTRA_INDEX_URL to install packages from the feed:
pip install -r requirements.txt --index-url $PIP_EXTRA_INDEX_URL
Note: Steps 2 and 3 already exist in your YAML file, but the pip install script seems to have an issue: you need to pass the URL via the --index-url parameter directly.
The packages are then installed through the feed's upstream source.
In this case, these packages will also exist in the feed.
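Applied to the Dockerfile from the question, the only change would be in the final RUN line: resolve everything through the authenticated feed index (which proxies pypi.org via the upstream source) instead of merely adding it as an extra index. A sketch of that line, reusing the PIP_EXTRA_URL build argument already defined above:

RUN pip3 install -r requirements --index-url $PIP_EXTRA_URL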
I'm setting up GitLab CI/CD to automate deployment to a Heroku app with every push.
Currently my .gitlab-ci.yml file looks like this:
production:
  type: deploy
  script:
    - apt-get update -qy
    - apt-get install -y ruby-dev
    - gem install dpl
    - dpl --provider=heroku --app=koober-production --api-key=$HEROKU_PRODUCTION_API_KEY
  only:
    - master
This works fine: the deployment is successful and the application is working.
However, I need to run a few commands after a successful deployment to migrate the database.
At present, I do this manually by running the following command from the terminal:
heroku run python manage.py migrate -a myapp
How can I automate running this command after deployment?
First, types are deprecated; you should use stages.
Back to the original question: I think you can use a new stage for this purpose.
Declare something like:
stages:
  - build
  - test
  - deploy
  - post_deploy

post_production:
  stage: post_deploy
  script:
    - heroku run python manage.py migrate -a myapp
  only:
    - master
This should then execute only if the deployment succeeds.
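Note that this assumes the heroku CLI is available and authenticated inside the job's image. If it is not, the post_deploy job has to install it and provide the API key itself; a rough sketch, assuming curl is present in the image and using the official Heroku CLI install script (the CLI reads the HEROKU_API_KEY environment variable for authentication):

post_production:
  stage: post_deploy
  script:
    # Install the Heroku CLI inside the job's container
    - curl https://cli-assets.heroku.com/install.sh | sh
    # The CLI authenticates via the HEROKU_API_KEY environment variable
    - export HEROKU_API_KEY=$HEROKU_PRODUCTION_API_KEY
    - heroku run python manage.py migrate -a myapp
  only:
    - master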
Solved using the --run flag to run the command via dpl:
stages:
  - deploy

production:
  stage: deploy
  script:
    - apt-get update -qy
    - apt-get install -y ruby-dev
    - gem install dpl
    - dpl --provider=heroku --app=koober-production --api-key=$HEROKU_PRODUCTION_API_KEY --run='python manage.py migrate && python manage.py create_initial_users'
  only:
    - master
I have the following configuration as my .gitlab-ci.yml, but I found out that after successfully passing the build stage (which creates a virtualenv called venv), the test stage starts with a brand new environment (there is no venv directory at all). So I wonder whether I should put the setup script in before_script, so that it runs in each phase (build/test/deploy). Is that the right way to do it?
before_script:
  - uname -r

types:
  - build
  - test
  - deploy

job_install:
  type: build
  script:
    - apt-get update
    - apt-get install -y libncurses5-dev
    - apt-get install -y libxml2-dev libxslt1-dev
    - apt-get install -y python-dev libffi-dev libssl-dev
    - apt-get install -y python-virtualenv
    - apt-get install -y python-pip
    - virtualenv --no-site-packages venv
    - source venv/bin/activate
    - pip install -q -r requirements.txt
    - ls -al
  only:
    - master

job_test:
  type: test
  script:
    - ls -al
    - source venv/bin/activate
    - cp crawler/settings.sample.py crawler/settings.py
    - cd crawler
    - py.test -s -v
  only:
    - master
GitLab CI jobs are supposed to be independent, because they could run on different runners, so this is not an issue. There are two ways to pass files between stages:
The right way: using artifacts.
The wrong way: using cache, with a cache-key "hack". This still requires the same runner.
So yes, the way GitLab intends it, everything your job depends on should be set up in before_script.
Artifacts example:
artifacts:
  when: on_success
  expire_in: 1 mos
  paths:
    - some_project_files/
Cache example:
cache:
  key: "$CI_BUILD_REF_NAME"
  untracked: true
  paths:
    - node_modules/
    - src/bower_components/
For a correct running environment I suggest using Docker with an image that already contains the apt-get dependencies, and using artifacts for passing job results between jobs. Note that artifacts are also uploaded to the GitLab web interface, where you can download them; so if they are quite heavy, use a small expire_in time so they are removed after all jobs are done.
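Applied to the configuration in the question, the cache variant would look roughly like this. This is only a sketch: as noted above it relies on both jobs landing on the same runner, and it caches the venv directory created inside the project directory so GitLab can pick it up:

cache:
  key: "$CI_BUILD_REF_NAME"
  paths:
    - venv/

job_install:
  type: build
  script:
    - virtualenv --no-site-packages venv
    - source venv/bin/activate
    - pip install -q -r requirements.txt
  only:
    - master

job_test:
  type: test
  script:
    - source venv/bin/activate
    - cp crawler/settings.sample.py crawler/settings.py
    - cd crawler
    - py.test -s -v
  only:
    - master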
I have just installed gitlab-ci-multi-runner by following the documentation https://gitlab.com/gitlab-org/gitlab-ci-multi-runner/blob/master/docs/install/linux-repository.md
I use the public server ci.gitlab.com and the registration of the runner seems OK (the runner appears with a green light).
With debug activated I can see that the runner regularly polls the CI server.
But when a new commit is pushed, no build is run.
Everything is green (https://ci.gitlab.com/projects/4656), but no test is run...
My .gitlab-ci.yml is pretty simple:
before_script:
  - apt install python3-pip
  - pip3 install -q -r requirements.txt

master:
  script: "make test"
  only:
    - master

script:
  - python setup.py test
By the way, I can't find any error message and I don't know where to look.
I am pretty new to CI and there is perhaps an obvious point I am missing.
Give this a try. This assumes your pyunit tests are in a file called runtests.py in the working directory.
before_script:
  - apt install python3-pip
  - pip3 install -q -r requirements.txt

master:
  script: "python runtests.py"
  only:
    - master