I am using the dbx CLI to deploy my workflow to Databricks. My .dbx/project.json is configured as below:
{
  "environments": {
    "default": {
      "profile": "test",
      "storage_type": "mlflow",
      "properties": {
        "workspace_directory": "/Shared/dbx/projects/test",
        "artifact_location": "dbfs:/dbx/test"
      }
    }
  },
  "inplace_jinja_support": false,
  "failsafe_cluster_reuse_with_assets": false,
  "context_based_upload_for_execute": false
}
Every time I run dbx deploy ..., it stores my task scripts in DBFS under a new hash-named folder. If I run dbx deploy ... 100 times, it creates 100 hash folders to store my artifacts.
Questions
How do I clean up the folders?
Is there any retention or rolling policy that keeps only the last X folders?
Is there a way to reuse the same folder every time we deploy?
As you can see, a lot of folders are generated whenever we run dbx deploy. We only want to use the latest one; the older ones are not needed any more.
Author of dbx here.
There is a built-in command that cleans up the workspace and the artifact location:
dbx destroy ...
Please carefully read the documentation before running this command.
I finally found a way to remove the old DBFS files: I just run dbfs rm -r dbfs:/dbx/test before running deploy. This method is not ideal, because if you have a running cluster, or a cluster pending start, the job will fail since the previous hash folder has been removed. Instead of depending on DBFS, I have configured my workflow to use Git; this way I can remove the DBFS data without worrying that any job is using it. It is strange that Databricks still generates a hash folder even though no artifacts are uploaded to the DBFS file system when using Git as the workspace.
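For reference, the manual cleanup described above boils down to two commands; a sketch using the paths from the question (adjust dbfs:/dbx/test to your own artifact_location):
# remove all previously uploaded hash folders (any job still reading them will fail)
dbfs rm -r dbfs:/dbx/test
# redeploy; dbx recreates the artifact location with a fresh hash folder
dbx deploy ...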
How to modify the repository / storage location for Test Results in Azure pipeline
For one of my projects, the TestResults.zip file is published at the URL https://dev.azure.com/MyProject/Project/_TestManagement/Runs?runId=8309822&_a=runCharts.
I want to change the storage location for the TestResults.zip file from the URL given above to a location in my own repository (like Myproject/target/surefire-reports.zip). How do I do that?
In my Azure pipeline the tests run, and when the TestResults zip is created it is stored at the URL above; I want to store it in one of my project's sub-modules under the target directory so that I can create a zip file.
Please help me resolve this issue.
Many thanks
It is not recommended to publish test results to the repo.
If you still want to do that, you could use the git command line to commit the test file to the repo in the pipeline after the test task, like:
cd $(System.DefaultWorkingDirectory)
copy Path\surefire-reports.zip target
cd target
git add surefire-reports.zip
git commit -m "Add a test file"
git push https://PAT@dev.azure.com/YourOrganization/YourProject/_git/xxx.test HEAD:master
You could check the similar thread for some more details.
I'm running Cypress in one of my release stages and it gives me this output:
Finished processing: D:\a\r1\a\_ClientWeb-Build-CI\ShellArtifact\tests\integration\cypress\videos\onboarding.spec.js.mp4 (0 seconds)
I have 2 questions:
Is the path name relative to the app service? If I have an app service called randomname and run the Cypress stage on that randomname app service, should I be able to find the Cypress output in randomname.scm.azurewebsites.net?
If I go into the scm debug console and I do cd D:\a\ I get:
cd : Cannot find path 'D:\a\' because it does not exist.
So how do I actually access my Cypress test results?
I've also tried archiving the files into a zip file:
In the output of the task step I see:
Creating archive: d:\home\testing\somefile.zip
But when I try to access the D:/home/testing folder on my appname.scm.azurewebsites.net I get:
cd : Cannot find path 'D:\home\testing' because it does not exist.
The path D:\a\r1\a is inside the hosted agent that runs the release pipeline; it is not in your application.
The same is true for the zip file: when you specify d:/home/..., it is on the agent.
After the release finishes, all the files are deleted, so you need to save the files somewhere else (maybe in Azure?) during the pipeline, for example with the "Azure File Copy" task.
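For example, a rough sketch of the "Azure File Copy" approach in pipeline YAML, copying the Cypress videos to a blob container before the agent is recycled (the service connection, storage account, and container names are placeholders, and the source path is assumed from the log output above):
- task: AzureFileCopy@4
  inputs:
    SourcePath: '$(System.DefaultWorkingDirectory)/_ClientWeb-Build-CI/ShellArtifact/tests/integration/cypress/videos'
    azureSubscription: 'my-subscription'      # placeholder service connection
    Destination: 'AzureBlob'
    storage: 'mystorageaccount'               # placeholder storage account
    ContainerName: 'cypress-results'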
So we are currently moving away from our current deployment provider, Beanstalk, which is great, but we are on the top tier and we keep running out of space or hitting our repository limits. Since we are moving away anyway, please do not suggest another SaaS provider.
I personally use GitLab for my own projects and a few company projects, and it's amazing; we use a self-hosted version on a local server in our company building.
We have CI set up and are currently using the following deployment code (I have trimmed it down to just the development deployment). It uses the shell executor, as we deploy to an existing Linux server.
variables:
  HOSTNAME: '<hostname>'
  USERNAME: '<username>'
  PASSWORD: '<password>'
  PATH_DEV: '/path/to/www'

# Define the stages (we can add as many as we want)
stages:
  # - build
  - deploy

# The code for development deployment
deploy_dev:
  stage: deploy
  script:
    - echo "Deploying to development environment..."
    - rm .gitlab-ci.yml
    - rsync -urltvz --filter=':- .gitignore' --exclude=".git" -e "sshpass -p $PASSWORD ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null" * $USERNAME@$HOSTNAME:$PATH_DEV
    - echo "Finished deploying."
  environment:
    name: Development
    url: http://dev.domain.com
  only:
    - envdev
The Problem:
When we use the above code to deploy, it works really well and deploys all the code after optimisation etc., but we have found a little bug.
When you delete a file, the rsync command will not delete it on the server. I did some searching and found the --delete flag; it worked, but it also deleted all the user-uploaded content. I then added the .gitignore to the filtering so rsync would ignore the files listed there (which are usually user-generated content, configuration files, and/or libraries such as npm packages). This was fine until users started uploading files through the media manager in our framework, which stores them in a folder that cannot go in .gitignore because it also contains files of our own that we ship so they are editable by the user. So now I am unsure how to manage this.
What we are looking for is a CI setup that uploads only file changes to the server: it would look through the latest commits, find the files that have changed, and push only those files up. I would like to keep doing this with GitLab CI, so any ideas, examples, or tutorials would be amazing.
Thanks in advance.
~ Danny
Maybe this helps: https://github.com/banago/PHPloy
It looks like this tool is designed for PHP projects, but I think it can be used for other web deployments too.
How it works:
PHPloy stores a file called .revision on your server. This file contains the hash of the commit that you have deployed to that server. When you run phploy, it downloads that file and compares the commit reference in it with the commit you are trying to deploy to find out which files to upload. PHPloy also stores a .revision file for each submodule in your repository.
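If you would rather stay with plain GitLab CI, the same idea can be scripted with git and rsync: keep a marker file holding the last deployed commit on the server and upload only what changed since. A rough sketch reusing the variables from the question (the .revision marker name is borrowed from PHPloy; filenames containing spaces would need extra care):
# fetch the last deployed commit hash from the server (empty on first deploy)
LAST=$(sshpass -p "$PASSWORD" ssh -o StrictHostKeyChecking=no $USERNAME@$HOSTNAME "cat $PATH_DEV/.revision" 2>/dev/null || true)
if [ -z "$LAST" ]; then
  git ls-files > /tmp/files                                        # first deploy: every tracked file
else
  git diff --name-only --diff-filter=d "$LAST" HEAD > /tmp/files   # only files changed since, deletions excluded
fi
# upload just that list, then record the commit we deployed
rsync -urltvz --files-from=/tmp/files -e "sshpass -p $PASSWORD ssh -o StrictHostKeyChecking=no" . $USERNAME@$HOSTNAME:$PATH_DEV
sshpass -p "$PASSWORD" ssh -o StrictHostKeyChecking=no $USERNAME@$HOSTNAME "echo $(git rev-parse HEAD) > $PATH_DEV/.revision"
Deleted files still need separate handling, e.g. a second list from --diff-filter=D that you rm on the server.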
I'm using Azure Functions with GitHub deployment. I would like to place the function in src/server/functionName/ within the repo, but the deployment works only if the function is placed directly in functionName/
How do I deploy functions that are placed in subdirectories?
The documentation states
your host.json file and function folders should be directly at the root of what you deploy.
but "should" doesn't mean "must", right?
What I tried:
Various combinations of locations of host.json and function.json
In host.json I set routePrefix but it seems to affect only the function's URL: "http": { "routePrefix": "src/server" },
There are a few ways you can customize the deployment process. One is by adding a custom deployment script to your repository root. When a .deployment file exists, Azure will run your script as part of the deployment process, as detailed here. For example, you can write a simple script that copies the files and directories from your subdirectory \src\server to the root:
@echo off
echo Deploying Functions ...
xcopy "%DEPLOYMENT_SOURCE%\src\server" %DEPLOYMENT_TARGET% /Y /S
If you don't want to commit a .deployment file to your repo and your requirements are relatively simple, you can also do this via app settings: add a PROJECT app setting to your function app with the value set to your subdirectory, e.g. src\server.
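If you go the app-setting route, the same can be done from the Azure CLI; a sketch where the function app and resource group names are placeholders:
az functionapp config appsettings set --name my-function-app --resource-group my-resource-group --settings "PROJECT=src\server"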
Try setting the AZURE_FUNCTIONAPP_PACKAGE_PATH variable in .github/workflows/<something>.yml:
env:
  AZURE_FUNCTIONAPP_PACKAGE_PATH: 'azure/api'   # set this to the path of your web app project, defaults to the repository root
  DOTNET_VERSION: '3.1.301'                     # set this to the dotnet version to use
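That variable is then consumed by the deploy step later in the same workflow; a minimal sketch using the official Azure/functions-action (the app name and the publish-profile secret name are placeholders):
- name: Run Azure Functions action
  uses: Azure/functions-action@v1
  with:
    app-name: 'my-function-app'                              # placeholder: your function app name
    package: '${{ env.AZURE_FUNCTIONAPP_PACKAGE_PATH }}'     # the subdirectory set above
    publish-profile: ${{ secrets.AZURE_FUNCTIONAPP_PUBLISH_PROFILE }}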
Is it possible to rename a folder without the whole process of making a new directory, copying the files into the new directory, and then removing the old directory?
This process takes several minutes to complete, so I am forced to use a batch script to rename the folders; I'd prefer it all to be handled by Grunt. Looking through the Node docs, it appears there is no way to rename folders in the way the mv or rename commands work.
The use-case is a faster deployment workflow with Grunt on an intranet site. I'd like minimal downtime; 2 minutes of downtime to copy files is not ideal.
I stage my website on the server in www/test.
I then rename www/prod to www/archived
Then rename www/test to www/prod making the new site live.
Using grunt-shell solved my problem; however, you have to warn future developers which platform the shell commands are meant for (in my case, Windows).
shell: {
  options: {
    stderr: false
  },
  // note: backslashes must be doubled in JS strings, otherwise they are swallowed
  'archiveToDelete': {
    command: 'rename <%= yeoman.winserver %>\\archived delete-this'
  },
  'liveToArchive': {
    command: 'rename <%= yeoman.winserver %>\\prod archived'
  },
  'deployToLive': {
    command: 'rename <%= yeoman.winserver %>\\test prod'
  }
}
I think you are looking for https://nodejs.org/api/fs.html#fs_fs_rename_oldpath_newpath_callback, which does the equivalent of the mv command.
Not a direct answer, but for this exact same task I prefer to use symbolic links rather than renames:
I upload my files as www/20150602-214412 (the exact timestamp of the build)
I remove the existing www/prod symlink and create a new symlink to my timestamped build
That way I have as many archives as I want, and I know exactly when the files were deployed.
In grunt I use grunt-contrib-symlink and grunt-contrib-clean
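Outside of Grunt, the whole swap is equally short in shell, which may help illustrate the idea (paths are examples):
STAMP=$(date +%Y%m%d-%H%M%S)
cp -r build "www/$STAMP"     # stage the new build into a timestamped folder
ln -sfn "$STAMP" www/prod    # repoint the live symlink in one step (-n replaces the old link itself)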