Automating Linux EBS snapshot backup and clean-up

Are there any good, up-to-date shell scripts for taking EBS snapshots (to S3) and cleaning up older snapshots?
I looked through SO, but most answers are from 2009 and refer to links that are either broken or outdated.
Thanks.

Try the following shell script; I use it to create snapshots for most of my projects and it works well.
https://github.com/rakesh-sankar/Tools/blob/master/AmazonAWS/EBS/EBS-Snapshot.sh
You can fork the project and send me a pull request to add clean-up of old snapshots. Also, watch this repo; when I find some time I will update the code to include clean-up functionality.
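In the meantime, here is a minimal clean-up sketch using the AWS CLI (not part of that repo; the volume ID and retention period below are placeholders to adjust, and it assumes the CLI is installed and configured with credentials and a region):
#!/bin/bash
# Delete snapshots of one volume that are older than RETENTION_DAYS.
VOLUME_ID="vol-xxxxxxxx"   # the volume whose snapshots should be pruned
RETENTION_DAYS=10
cutoff=$(date -d "-${RETENTION_DAYS} days" +%Y-%m-%d)

for snapshot_id in $(aws ec2 describe-snapshots \
        --owner-ids self \
        --filters "Name=volume-id,Values=${VOLUME_ID}" \
        --query "Snapshots[?StartTime<'${cutoff}'].SnapshotId" \
        --output text); do
    echo "Deleting ${snapshot_id} (started before ${cutoff})"
    aws ec2 delete-snapshot --snapshot-id "${snapshot_id}"
done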

If it is OK to use PHP as a shell script, you can use my latest script with the latest AWS PHP SDK. It is much simpler because you do not need to set up an environment; just feed the script your API keys.
How to set it up
Open an SSH connection to your server.
Navigate to the folder:
$ cd /usr/local/
Clone this gist into an ec2 folder:
$ git clone https://gist.github.com/9738785.git ec2
Go to that folder:
$ cd ec2
Make backup.php executable:
$ chmod +x backup.php
Open the releases page of the AWS PHP SDK GitHub project and copy the URL of the aws.zip download. Then download it onto your server:
$ wget https://github.com/aws/aws-sdk-php/releases/download/2.6.0/aws.zip
Unzip this file into the aws directory:
$ unzip aws.zip -d aws
Edit the backup.php file and set all the settings on lines 5-12:
$dryrun = FALSE;
$interval = '24 hours';
$keep_for = '10 Days';
$volumes = array('vol-********');
$api_key = '*********************';
$api_secret = '****************************************';
$ec2_region = 'us-east-1';
$snap_descr = "Daily backup";
Test it by running the script:
$ ./backup.php
Check that the snapshot was created.
If everything is OK, just add a cron job:
* 23 * * * /usr/local/ec2/backup.php

Related

Execute shell script for database backup

I have a ReactJS-neo4j application, deployed on a cloud server. Currently, I create backups of my databases manually.
Now I want to automate this process. I want to automatically execute the above query every day.
Can anyone tell me how to automate the above process?
You need to change your neo4j configuration file, found in <HOME_neo4j>/conf/neo4j.conf, as below. The location of the file may differ depending on your installation (for example, a Debian package install rather than a plain Linux tarball).
apoc.export.file.enabled=true
apoc.import.file.use_neo4j_config=false
The second line lets you save the JSON file to any folder you want, instead of the default "import" folder.
Then open a terminal (or SSH session) connected to your cloud server, go to the <HOME_neo4j> directory where cypher-shell is installed, and copy and run the one-liner below.
echo "CALL apoc.export.json.all(\"/home/backups/deploymentName/backup_mydeployment.json\", { useTypes: true } )" | bin/cypher-shell -u neo4j -p <awesome_psw> --format plain
This will save the JSON file in /home/backups/deploymentName, just like what you do in your Neo4j browser.
I will leave it up to you to 1) add the YYMMDD0000_ timestamp to the filename via a Linux command and 2) schedule the job every midnight via crontab (a rough sketch of both follows). Good luck!
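For example, assuming the same placeholder paths and password as above, and that the script runs from the <HOME_neo4j> directory:
#!/bin/bash
# backup-neo4j.sh - timestamped APOC export, suitable for running from cron
stamp=$(date +%y%m%d0000)   # YYMMDD0000 prefix as suggested above
echo "CALL apoc.export.json.all(\"/home/backups/deploymentName/${stamp}_backup_mydeployment.json\", { useTypes: true } )" | bin/cypher-shell -u neo4j -p <awesome_psw> --format plain
and a crontab entry (crontab -e) to run it every midnight:
0 0 * * * cd <HOME_neo4j> && ./backup-neo4j.sh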

Databricks init scripts not working sometimes

OK, it is very strange. I have some init scripts that I would like to run when a cluster starts.
The cluster has the init script, which is in a file (in DBFS), basically this:
dbfs:/databricks/init-scripts/custom-cert.sh
Now, when I create the init script like this, it works (no SSL errors for my endpoints). Also, the event log for the cluster shows the duration as 1 second for the init script:
dbutils.fs.put("/databricks/init-scripts/custom-cert.sh", """#!/bin/bash
cp /dbfs/orgcertificates/orgcerts.crt /usr/local/share/ca-certificates/
sudo update-ca-certificates
echo "export REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt" >> /databricks/spark/conf/spark-env.sh
""")
However, if I just put the init script in a bash script and upload it to DBFS through a pipeline, the init script does not do anything. It executes, as per the event log, but the execution duration is 0 seconds.
I have the sh script in a file named
custom-cert.sh
with the same contents as above, i.e.
#!/bin/bash
cp /dbfs/orgcertificates/orgcerts.crt /usr/local/share/ca-certificates/
sudo update-ca-certificates
echo "export REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt"
but when I check /usr/local/share/ca-certificates/, it does not contain orgcerts.crt, even though the cluster init script has run.
Also, I have compared the contents of the init script in both cases and, at least to the naked eye, I can't see any difference,
i.e.
%sh
cat /dbfs/databricks/init-scripts/custom-cert.sh
shows the same contents in both scenarios. What is the problem with the second case?
EDIT: I read a bit more about init scripts and found that their logs are written here:
%sh
ls /databricks/init_scripts/
Looking at the err file in that location, it seems there is an error
sudo: update-ca-certificates
: command not found
Why is update-ca-certificates found in the first case but not when I put the same script in an sh file and upload it to DBFS (instead of executing dbutils.fs.put within a notebook)?
EDIT 2: In response to the first answer: after running the command
dbutils.fs.put("/databricks/init-scripts/custom-cert.sh", """#!/bin/bash
cp /dbfs/orgcertificates/orgcerts.crt /usr/local/share/ca-certificates/
sudo update-ca-certificates
echo "export REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt" >> /databricks/spark/conf/spark-env.sh
""")
the output is the file custom-cert.sh, and when I restart the cluster with the init script location set to dbfs:/databricks/init-scripts/custom-cert.sh, it works. So it is essentially the same content that the init script is reading (the generated sh script). Why can't it read it if I do not use dbutils.fs.put, but instead put the contents in a bash file and upload it during the CI/CD process?
As we are aware, an init script is a shell script that runs during the startup of each cluster node, before the Apache Spark driver or worker JVM starts. In case 2, when you run a bash command using the %sh magic command, you execute it only on the local driver node, so the worker nodes cannot access the result. In case 1, by using dbutils.fs.put (the %fs magic command) you copy the file to the DBFS root, so the worker nodes can access that path along with the driver node.
Ref : https://docs.databricks.com/data/databricks-file-system.html#summary-table-and-diagram
It seems that the observations I made in the comments section of my question are the way to go.
I now create the init script using a databricks job that I run during the CI/CD pipeline from Azure DevOps.
The notebook has the commands
dbutils.fs.rm("/databricks/init-scripts/custom-cert.sh")
dbutils.fs.put("/databricks/init-scripts/custom-cert.sh", """#!/bin/bash
cp /dbfs/internal-certificates/certs.crt /usr/local/share/ca-certificates/
sudo update-ca-certificates
echo "export REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt" >> /databricks/spark/conf/spark-env.sh
""")
I then create a Databricks job (pointing to this notebook); the cluster is a job cluster, which is just temporary. Of course, in my case even this job creation is automated using a PowerShell script.
I then call this Databricks job in the release pipeline, again using a PowerShell script.
This creates the file
/databricks/init-scripts/custom-cert.sh
I then use this file in any other cluster that accesses my org's endpoints (without certificate errors).
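As an aside, the step that calls the job from the release pipeline does not have to be PowerShell; a rough sketch of the same call with the Databricks CLI (assuming the CLI is installed and configured with a host and token; 123 is a hypothetical job id):
# Trigger the job that runs the init-script-creating notebook from a CI/CD step.
# The CLI reads DATABRICKS_HOST / DATABRICKS_TOKEN from the environment.
databricks jobs run-now --job-id 123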
I do not know (or still don't understand) why the same script file can't simply be part of a repo and be uploaded during the release process (instead of having this Databricks job call a notebook). I would love to know the reason. The other answer on this question does not hold true, as you can see: the init script is created by a job cluster and then used from another cluster as part of its init script.
It simply boils down to how the init script gets created.
But I get my job done, and perhaps this helps someone get theirs done too.
I have raised a support case though to understand the reason.

NPM package `pem` doesn't seem to work in AWS lambda NodeJS 10.x (results in OpenSSL error)

When I run the function locally on Node.js 11.7.0 it works, and when I run it in AWS Lambda Node.js 8.10 it works, but I've recently tried to run it in AWS Lambda Node.js 10.x and I get this response and this error in CloudWatch.
Any thoughts on how to correct this?
Response
{
  "success": false,
  "error": "Error: Could not find openssl on your system on this path: openssl"
}
Cloudwatch Error
ERROR (node:8) [DEP0005] DeprecationWarning: Buffer() is deprecated due to security and usability issues. Please use the Buffer.alloc(), Buffer.allocUnsafe(), or Buffer.from() methods instead.
Function
...
const util = require('util');
const pem = require('pem');
...
return new Promise((fulfill) => {
  require('./certs').get(req, res, () => {
    return fulfill();
  });
}).then(() => {
  const createCSR = util.promisify(pem.createCSR);
  // This seems to be where the issue is coming from
  return createCSR({
    keyBitsize: 1024,
    hash: HASH,
    commonName: id.toString(),
    country: 'US',
    state: 'Maryland',
    organization: 'ABC', // Obfuscated
    organizationUnit: 'XYZ', // Obfuscated
  });
}).then(({ csr, clientKey }) => {
  ...
}).then(async ({ certificate, clientKey }) => {
  ...
}, (err) => {
  return res.status(404).json({
    success: false,
    error: err,
  });
});
...
I've tried with
"pem": "^1.14.3", and "pem": "^1.14.2",
I tried the answer documented by @Kris White, but I was not able to get it to work. Each execution resulted in the error Could not find openssl on your system on this path: /opt/openssl. I tried several different paths and approaches, but none worked well. It's entirely possible that I simply didn't copy the OpenSSL executable correctly.
Since I needed a working solution, I used the answer provided by @Wilfred Dittmer. I modified it slightly since I wasn't using Docker. I launched an Amazon Linux 2 server, built OpenSSL on it, transferred the package to my local machine, and deployed it via Serverless.
Create a file named create-openssl-zip.sh with the following contents. The script will create the Lambda Layer OpenSSL package.
#!/bin/bash -x
# This file should be copied to and run inside the /tmp folder
yum update -y
yum install autoconf bison gcc gcc-c++ libcurl-devel libxml2-devel -y
curl -sL http://www.openssl.org/source/openssl-1.1.1d.tar.gz | tar -xvz
cd openssl-1.1.1d
./config --prefix=/tmp/nodejs/openssl --openssldir=/tmp/nodejs/openssl && make && make install
cd /tmp
rm -rf nodejs/openssl/share nodejs/openssl/include
zip -r lambda-layer-openssl.zip nodejs
rm -rf nodejs openssl-1.1.1d
Then, follow these steps:
Open a terminal session in this project's root folder.
Run the following command to upload the Linux bash script.
curl -F "file=#create-openssl-zip.sh" https://file.io
Note: The command above uses the popular tool File.io to copy the script to the cloud temporarily so it can be securely retrieved from the build server.
Note: If curl is not installed on your dev machine, you can also upload the script manually using the File.io website.
Copy the URL for the uploaded file from either the terminal session or the File.io website.
Note: The url will look similar to this example: https://file.io/a1B2c3
Open the AWS Console to the EC2 Instances list.
Launch a new instance with these attributes:
AMI: Amazon Linux 2 AMI (HVM), SSD Volume Type (id: ami-0a887e401f7654935)
Instance Type: t2.micro
Instance Details: (use all defaults)
Storage: (use all defaults)
Tags: Name - 'build-lambda-layer-openssl'
Security Group: 'Create new security group' (use all defaults to ensure Instance will be publicly accessible via SSH over the internet)
When launching the instance and selecting a key pair, be sure to choose a Key Pair from the list to which you have access.
Launch the instance and wait for it to be accessible.
Once the instance is running, use an SSH Client to connect to the instance.
More details on how to open an SSH connection can be found here.
In the SSH terminal session, navigate to the tmp directory by running cd /tmp.
Download the bash script uploaded earlier by running curl {FILE_IO_URL} --output create-openssl-zip.sh.
Note: In the script above, replace FILE_IO_URL with the URL returned from File.io and copied in step 3.
Execute the bash script by running sudo bash ./create-openssl-zip.sh. The script may take a while to complete. You may need to confirm one or more package install prompts.
When the script completes, run the following command to upload the package to File.io: curl -F "file=@lambda-layer-openssl.zip" https://file.io.
Copy the URL for the uploaded file from the terminal session.
In the terminal session on the local development machine, run the following command to download the file: curl {FILE_IO_URL} --output lambda-layer-openssl.zip.
Note: In the script above, replace FILE_IO_URL with the URL returned from File.io and copied in step 13.
Note: If curl is not installed on your dev machine, you can also download the file manually by pasting the copied URL in the address bar of your favorite browser.
Close the SSH session.
In the EC2 Instances list, terminate the build-lambda-layer-openssl EC2 instance since it is not needed any longer.
The OpenSSL Lambda Layer is now ready to be deployed.
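As an alternative to the File.io steps above, if you have the instance's key pair, scp can move the script and the resulting zip directly (a sketch; the key path and host are placeholders):
# Copy the build script up to the instance...
scp -i ~/.ssh/my-key.pem create-openssl-zip.sh ec2-user@<instance-public-dns>:/tmp/
# ...and, after it has run, copy the layer package back down.
scp -i ~/.ssh/my-key.pem ec2-user@<instance-public-dns>:/tmp/lambda-layer-openssl.zip .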
For completeness, here is a portion of my serverless.yml file:
functions:
  functionName:
    # ...
    layers:
      - { Ref: OpensslLambdaLayer }

layers:
  openssl:
    name: ${self:provider.stage}-openssl
    description: Contains openssl command line utility for lambdas that need it
    package:
      artifact: 'path\to\lambda-layer-openssl.zip'
    compatibleRuntimes:
      - nodejs10.x
      - nodejs12.x
    retain: false
...and here is how I configured PEM in the code file:
import * as pem from 'pem';
process.env.LD_LIBRARY_PATH = '/opt/nodejs/openssl/lib';
pem.config({
pathOpenSSL: '/opt/nodejs/openssl/bin/openssl',
});
// other code...
I contacted AWS Support about this and it turns out that the openssl library is still on the Node10x image, just not the command line utility. However, it's pretty easy to just grab it off a standard AMI and use it as a Lambda layer.
Steps:
Launch an Amazon Linux 2 AMI as an EC2
SSH into the box, or use an SFTP utility to connect to the box
Copy the command line utility for openssl at /usr/bin/openssl somewhere you can work with it locally. In my case I downloaded it to my Mac even though it is a Linux file.
Verify that it's still marked as executable (chmod a+x openssl if necessary, e.g. if you've downloaded it elsewhere)
Zip up the file
Optional: Upload it to an S3 bucket you can get to
Go to Lambda Layers in the AWS console
Create a new lambda layer. I named mine openssl and used the S3 pointer to the file on S3. You can also upload the zip directly if you have it on a local file system.
Attach the arn provided for the layer to your Lambda function. I use serverless so it was defined in the function setup per their documentation.
In your code, reference openssl as /opt/openssl, or you can avoid pathing it in your code (you may not have the option if it's a package you don't control) by adding /opt to your path, i.e.
process.env['PATH'] = process.env['PATH'] + ':' + process.env['LAMBDA_TASK_ROOT'] + ':/opt';
The layer will have been unzipped for you and because you set it to be executable beforehand, it should just work. The underlying openssl libraries are there, so just copying the cli works just fine.
What you can do is create a lambda layer with the openssl library.
Using the lambci/lambda:build-nodejs10.x image you can compile the openssl library and create a zip file from the install. You can then use the zip file as a layer for your lambda.
Create a file called create-openssl-zip.sh and make sure to chmod u+x it.
#!/bin/bash -x
# This file should be run inside the lambci/lambda:build-nodejs10.x container
yum update -y
yum install autoconf bison gcc gcc-c++ libcurl-devel libxml2-devel -y
curl -sL http://www.openssl.org/source/openssl-1.1.1d.tar.gz | tar -xvz
cd openssl-1.1.1d
./config --prefix=/var/task/nodejs/openssl --openssldir=/var/task/nodejs/openssl && make && make install
cd /var/task/
rm -rf nodejs/openssl/share
rm -rf nodejs/openssl/include
zip -r lambda-openssl-layer.zip nodejs
cp lambda-openssl-layer.zip /opt/layer/
Then run:
docker run -it -v `pwd`:/opt/layer lambci/lambda:build-nodejs10.x /opt/layer/create-openssl-zip.sh
This will run the script inside the docker container, and when it is done you will have a file called lambda-openssl-layer.zip in your current directory.
Upload this zip to an S3 bucket and create a lambda layer.
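If you prefer not to click through the console for that step, a sketch of the same thing with the AWS CLI (bucket and key names are placeholders):
# Upload the layer package and publish it as a Lambda layer.
aws s3 cp lambda-openssl-layer.zip s3://my-layer-bucket/lambda-openssl-layer.zip
aws lambda publish-layer-version \
    --layer-name openssl \
    --description "openssl for nodejs10.x lambdas" \
    --content S3Bucket=my-layer-bucket,S3Key=lambda-openssl-layer.zip \
    --compatible-runtimes nodejs10.x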
On your original lambda, add this layer and modify your code so that the PEM library knows where to look for the OpenSSL library as follows:
PEM.config({
pathOpenSSL: '/opt/nodejs/openssl/bin/openssl'
})
And finally add an extra environment variable to your lambda called LD_LIBRARY_PATH with value /opt/nodejs/openssl/lib
Otherwise it will fail with:
/opt/nodejs/openssl/bin/openssl: error while loading shared libraries: libssl.so.1.1: cannot open shared object file: No such file or directory
The pem NPM docs say:
Setting openssl location
In some systems the openssl executable might not be available by the default name or it is not included in $PATH. In this case you can define the location of the executable yourself as a one time action after you have loaded the pem module:
So I think it is not able to find the OpenSSL path on your system; you can try configuring it programmatically:
var pem = require('pem')
pem.config({
pathOpenSSL: '/usr/local/bin/openssl'
})
As you are using AWS Lambda, just try printing process.env.PATH; you will get an idea of whether OpenSSL is included in the PATH environment variable or not.
You can also check for OpenSSL by running the code below:
const exec = require('child_process').exec;
exec('which openssl', function (err, stdout, stderr) {
  console.log(err ? err : stdout);
});
UPDATE
As @hoangdv mentioned in his answer, openssl seems to have been removed from the node10.x runtime, and I think he is right. Also, we have read-only access to the file system, so we can't do much.
@Seth McClaine, you can give the node-forge npm module a try. One of the modules built on top of it is https://github.com/jfromaniello/selfsigned, which will make your task easier.
https://github.com/lambci/git-lambda-layer/issues/13#issue-444697784 (announcement email)
It seems openssl has been removed in the nodejs10.x runtime.
I have checked again on the lambci/lambda:build-nodejs10.x docker image and confirmed that. Maybe you need to change your runtime version or find another way to call createCSR.
which: no openssl in (/var/lang/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/bin)

gcloud app deploy does not remove previous versions

I am running a Node.js app on Google App Engine, using the following command to deploy my code:
gcloud app deploy --stop-previous-version
My desired behavior is for all instances running previous versions to be terminated, but they always seem to stick around. Is there something I'm missing?
I realize they are not receiving traffic, but I am still paying for them and they cause some background telemetry noise. Is there a better way of running this command?
Example output of gcloud app instances list (screenshot not reproduced here):
As you can see, I have two different versions running.
We accidentally blew through our free Google App Engine credit in less than 30 days because of an errant flexible instance that wasn't cleared by subsequent deployments. When we pinpointed it as the cause, it had scaled up to four simultaneous instances that were basically idling away.
tl;dr: Use the --version flag when deploying to specify a version name. An existing instance with the same version will then be replaced the next time you deploy.
That led me down the rabbit hole that is --stop-previous-version. Here's what I've found out so far:
--stop-previous-version doesn't seem to be supported anymore. It's mentioned under Flags on the gcloud app deploy reference page, but if you look at the top of the page where all the flags are listed, it's nowhere to be found.
I tried deploying with that flag set to see what would happen but it seemingly had no effect. A new version was still created, and I still had to go in and manually delete the old instance.
There's an open Github issue on the gcloud-maven-plugin repo that specifically calls this out as an issue with that plugin but the issue has been seemingly ignored.
At this point our best bet is to add --version=staging or whatever to gcloud app deploy. The reference docs for that flag seem to indicate that it'll replace an existing instance that shares that "version":
--version=VERSION, -v VERSION
The version of the app that will be created or replaced by this deployment. If you do not specify a version, one will be generated for you.
(emphasis mine)
Additionally, Google's own reference documentation on app.yaml (the link's for the Python docs but it's still relevant) specifically calls out the --version flag as the "preferred" way to specify a version when deploying:
The recommended approach is to remove the version element from your app.yaml file and instead, use a command-line flag to specify your version ID
As far as I can tell, for Standard Environment with automatic scaling at least, it is normal for old versions to remain "serving", though they should hopefully have zero instances (even if your scaling configuration specifies a nonzero minimum). At least that's what I've seen. I think (I hope) that those old "serving" instances won't result in any charges, since billing is per instance.
I know most of the above answers are for Flexible Environment, but I thought I'd include this here for people who are wondering.
(And it would be great if someone from Google could confirm.)
I had the same problem as the OP. Using the flex environment (some of this also applies to the standard environment) with Docker (runtime: custom in app.yaml), I've finally solved this! I tried a lot of things and I'm not sure which one fixed it (or whether it was a combination), so I'll list the things I did here, with the most likely solutions listed first.
SOLUTION 1) Ensure that cloud storage deletes old versions
What does cloud storage have to do with anything? (I hear you ask)
Well there's a little tooltip (Google Cloud Platform Web UI (GCP) > App Engine > Versions > Size) that when you hover over it says:
(Google App Engine) Flexible environment code is stored and billed from Google Cloud Storage ... yada yada yada
So based on this info and this answer I visited GCP > Cloud Storage > Browser and found my storage bucket AND a load of other storage buckets I didn't know existed. It turns out that some of the buckets store cached cloud functions code, some store cached docker images and some store other cached code/stuff (you can tell which is which by browsing the buckets).
So I added a deletion policy to all the buckets (except the cloud functions bucket) as follows:
Go to GCP > Cloud Storage > Browser and click the link (for the relevant bucket) in the Lifecycle Rules column > Click ADD A RULE > THEN:
For SELECT ACTION choose "Delete Object" and click continue
For SELECT OBJECT choose "Number of newer versions" and enter 1 in the input
Click CREATE
This will return you to the table view and you should now see the rule in the lifecycle rules column.
REPEAT this process for all relevant buckets (the relevant buckets were described earlier).
THEN delete the contents of the relevant buckets. WARNING: Some buckets warn you NOT to delete the bucket itself, only the contents!
Now re-deploy; your latest version should get deployed and hopefully you will never have this problem again!
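If you prefer the command line to the web UI for the rule above, the equivalent can be applied with gsutil (a sketch; the bucket name is a placeholder):
# lifecycle.json - delete objects once 1 newer version of them exists
cat > lifecycle.json <<'EOF'
{
  "rule": [
    {
      "action": {"type": "Delete"},
      "condition": {"numNewerVersions": 1}
    }
  ]
}
EOF
gsutil lifecycle set lifecycle.json gs://YOUR_STAGING_BUCKET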
SOLUTION 2) Use deploy flags
I added these flags
gcloud app deploy --quiet --promote --stop-previous-version
This probably doesn't help since these flags seem to be the default, but they are worth adding just in case.
Note that for the standard environment only (or so I heard on the grapevine) you can also use the --no-cache flag, which might help; with flex, this flag caused the deployment to fail (when I tried it).
SOLUTION 3)
This probably does not help at all, but I added:
COPY app.yaml .
to the Dockerfile
TIP 1)
This is probably more of a helpful / useful debug approach than a fix.
Visit GCP > App Engine > Versions
This shows all versions of your app (1 per deployment) and it also shows which version each instance is running (instances are configured in app.yaml).
Make sure all instances are running the latest version. This should happen by default. Probably worth deleting old versions.
You can determine your version from the gcloud app deploy logs (at the start of the logs) but it seems that the versions are listed by order of deployment anyway (most recent at top).
TIP 2)
Visit GCP > App Engine > Instances
SSH into an instance. This is just a matter of clicking a few buttons in the console. Once you have SSH'd in, run:
docker exec -it gaeapp /bin/bash
Which will get you into the docker container running your code. Now you can browse around to make sure it has your latest code.
Well I think my answer is long enough now. If this helps, don't thank me, J-ES-US is the one you should thank ;) I belong to Him ^^
Google may have updated their documentation cited in @IAmKale's answer:
Note that if the version is running on an instance of an auto-scaled service, using --stop-previous-version will not work and the previous version will continue to run because auto-scaled service instances are always running.
Seems like that flag only works with manually scaled services.
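Given that, a pragmatic workaround is to clean up zero-traffic versions yourself after each deploy; a rough sketch (assuming gcloud is authenticated against the right project, and that the filter/format field names match your gcloud version):
#!/bin/bash
# Delete every version of a service that is currently receiving no traffic.
SERVICE=default
for v in $(gcloud app versions list --service="${SERVICE}" \
             --filter="traffic_split=0" \
             --format="value(version.id)"); do
    gcloud app versions delete "${v}" --service="${SERVICE}" --quiet
done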
This is a supplementary and optional answer in addition to my other main answer.
I am now, in addition to my other answer, auto-incrementing the version on every deploy using a script.
My script contents are below.
Basically, the script auto increments version every time you deploy. I am using node.js so the script uses npm version to bump the version but this line could easily be tweaked to whatever language you use.
The script requires a clean git working directory for deployment.
The script assumes that when the version is bumped, this will result in file changes (e.g. changes to package.json version) that need pushing.
The script essentially tries to find your SSH key; if it finds one, it starts an SSH agent and uses the key to git commit and git push the file changes. Otherwise it just does a git commit without a push.
It then does a deploy using the --version flag ... --version="${deployVer}"
Thought this might help someone, especially since the top answer talks a lot about using the --version flag on a deploy.
#!/usr/bin/env bash
projectName="vehicle-damage-inspector-app-engine"
# Find SSH key
sshFile1=~/.ssh/id_ed25519
sshFile2=~/Desktop/.ssh/id_ed25519
sshFile3=~/.ssh/id_rsa
sshFile4=~/Desktop/.ssh/id_rsa
if [ -f "${sshFile1}" ]; then
  sshFile="${sshFile1}"
elif [ -f "${sshFile2}" ]; then
  sshFile="${sshFile2}"
elif [ -f "${sshFile3}" ]; then
  sshFile="${sshFile3}"
elif [ -f "${sshFile4}" ]; then
  sshFile="${sshFile4}"
fi
# If SSH key found then fire up SSH agent
if [ -n "${sshFile}" ]; then
  pub=$(cat "${sshFile}.pub")
  for i in ${pub}; do email="${i}"; done
  name="Auto Deploy ${projectName}"
  git config --global user.email "${email}"
  git config --global user.name "${name}"
  echo "Git SSH key = ${sshFile}"
  echo "Git email = ${email}"
  echo "Git name = ${name}"
  eval "$(ssh-agent -s)"
  ssh-add "${sshFile}" &>/dev/null
  sshKeyAdded=true
fi
# Bump version and git commit (and git push if SSH key added) and deploy
if [ -z "$(git status --porcelain)" ]; then
  echo "Working directory clean"
  echo "Bumping patch version"
  ver=$(npm version patch --no-git-tag-version)
  git add -A
  git commit -m "${projectName} version ${ver}"
  if [ -n "${sshKeyAdded}" ]; then
    echo ">>>>> Bumped patch version to ${ver} with git commit and git push"
    git push
  else
    echo ">>>>> Bumped patch version to ${ver} with git commit only, please git push manually"
  fi
  deployVer="${ver//"."/"-"}"
  gcloud app deploy --quiet --promote --stop-previous-version --version="${deployVer}"
else
  echo "Working directory unclean, please commit changes"
fi
For node.js users: if you call the script deploy.sh, you should add
"deploy": "sh deploy.sh"
to your package.json scripts and deploy with npm run deploy.

Azure batch job start tasks failed

I'm using the Azure Batch Python API. When I create a new job, I see exit code 128 (image attached). How can I find out the reason for that?
I'm creating a new job using this code :
def wrap_commands_in_shell(commands):
    return "/bin/bash -c 'set -e; set -o pipefail; {}; wait'".format(';'.join(commands))

job_tasks = ['cd /mnt/batch/tasks/shared/ && git clone https://github.com/cryptobiu/OSPSI.git',
             'cd /mnt/batch/tasks/shared/OSPSI && git checkout cloud',
             'cd /mnt/batch/tasks/shared/OSPSI && cmake CMake',
             'cd /mnt/batch/tasks/shared/OSPSI && mkdir -p assets'
             ]

job_creation_information = batch.models.JobAddParameter(
    job_id,
    batch.models.PoolInformation(pool_id=pool_id),
    job_preparation_task=batch.models.JobPreparationTask(
        command_line=wrap_commands_in_shell(job_tasks),
        run_elevated=True,
        wait_for_success=True
    )
)
To diagnose, you can look at the stderr.txt and stdout.txt for the failed Job Preparation task in the Azure Portal, using Azure Batch Explorer, or via code with an SDK. Look at which node ran the job prep task, navigate to that node, and then to the job directory. Under the job directory you should see a jobpreparation directory, which contains the stderr.txt and stdout.txt files.
With regard to the exit code, there are a few potential problems that could cause this:
Did you install git, cmake and any other dependencies as part of a start task?
I get a 404 when I try to navigate to: https://github.com/cryptobiu/OSPSI. Does this repo exist? If it's a private repository, are you providing the correct credentials?
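On the first point, the start task (or the job preparation task itself) needs to install the build tooling before the clone/cmake steps can work. A rough sketch of such a command line, assuming an Ubuntu-based pool image (use yum instead of apt-get on CentOS-based images):
/bin/bash -c 'set -e; apt-get update && apt-get install -y git cmake g++ make'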
A few notes about your job_tasks array:
You should not hardcode the path /mnt/batch/tasks/shared. The path to the "shared" directory may not be the same between Linux distributions. You should use the environment variable $AZ_BATCH_NODE_SHARED_DIR instead. You can view a full list of Azure Batch pre-filled environment variables here.
You do not need to cd into the directory for each command; you only need to do it once. You can rewrite job_tasks as:
['cd $AZ_BATCH_NODE_SHARED_DIR',
'TODO: INSERT YOUR COMMANDS TO SETUP AUTH WITH GITHUB FOR PRIVATE REPO',
'git clone https://github.com/cryptobiu/OSPSI.git',
'cd OSPSI',
'cmake CMake',
'mkdir -p assets']
