Groovy script fails to call Slack notification parameter from Jenkins DSL job - groovy

I'm attempting to use the Jenkins Job DSL plugin for the first time to create some basic job "templates" before getting into more complex stuff.
Jenkins is running on a Windows 2012 server. The Jenkins version is 1.650 and we are using the Job DSL plugin version 1.51.
Ideally what I would like is for the seed job to be parameterised so that when it is being run the user can enter four things: the Job DSL script location, the name of the generated job, a Slack channel for failure notifications, and an email address for failure notifications.
The first two are fine: I can call the parameters in the groovy script, for example the script understands job("${JOB_NAME}") and takes the name I enter for the job when I run the seed job.
However when I try to do the same thing with a Slack channel the groovy script doesn't seem to want to play. Note that if I specify a Slack channel rather than trying to call a parameter it works fine.
My Job DSL script is here:
job("${JOB_NAME}") {
    triggers {
        cron("@daily")
    }
    steps {
        shell("echo 'Hello World'")
    }
    publishers {
        slackNotifier {
            room("${SLACK_CHANNEL}")
            notifyAborted(true)
            notifyFailure(true)
            notifyNotBuilt(false)
            notifyUnstable(true)
            notifyBackToNormal(true)
            notifySuccess(false)
            notifyRepeatedFailure(false)
            startNotification(false)
            includeTestSummary(false)
            includeCustomMessage(false)
            customMessage(null)
            buildServerUrl(null)
            sendAs(null)
            commitInfoChoice('NONE')
            teamDomain(null)
            authToken(null)
        }
    }
    logRotator {
        numToKeep(3)
        artifactNumToKeep(3)
        publishers {
            extendedEmail {
                recipientList('me@mydomain.com')
                defaultSubject('Seed job failed')
                defaultContent('Something broken')
                contentType('text/html')
                triggers {
                    failure ()
                    fixed ()
                    unstable ()
                    stillUnstable {
                        subject('Subject')
                        content('Body')
                        sendTo {
                            developers()
                            requester()
                            culprits()
                        }
                    }
                }
            }
        }
    }
}
But starting the seed job fails and gives me this output:
Started by user
Building on master in workspace D:\data\jenkins\workspace\tutorial-job-dsl-2
Disk space threshold is set to :5Gb
Checking disk space Now
Total Disk Space Available is: 28Gb
Node Name: master
Running Prebuild steps
Processing DSL script jobBuilder.groovy
ERROR: (jobBuilder.groovy, line 10) No signature of method: javaposse.jobdsl.plugin.structs.DescribableContext.room() is applicable for argument types: (org.codehaus.groovy.runtime.GStringImpl) values: [#dev]
Possible solutions: wait(), find(), dump(), grep(), any(), wait(long)
[BFA] Scanning build for known causes...
[BFA] No failure causes found
[BFA] Done. 0s
Started calculate disk usage of build
Finished Calculation of disk usage of build in 0 seconds
Started calculate disk usage of workspace
Finished Calculation of disk usage of workspace in 0 seconds
Finished: FAILURE
This is the first time I have tried to do anything with Groovy and I'm sure it's a basic error but would appreciate any help.

Hm, that's a bug in Job DSL, see JENKINS-39153.
You actually do not need to use the template string syntax "${FOO}" if you just want to use the value of FOO. All parameters are string variables which can be used directly:
job(JOB_NAME) {
    // ...
    publishers {
        slackNotifier {
            room(SLACK_CHANNEL)
            notifyAborted(true)
            notifyFailure(true)
            notifyNotBuilt(false)
            notifyUnstable(true)
            notifyBackToNormal(true)
            notifySuccess(false)
            notifyRepeatedFailure(false)
            startNotification(false)
            includeTestSummary(false)
            includeCustomMessage(false)
            customMessage(null)
            buildServerUrl(null)
            sendAs(null)
            commitInfoChoice('NONE')
            teamDomain(null)
            authToken(null)
        }
    }
    // ...
}
This syntax is more concise and does not trigger the bug.

Related

Cancel remaining stages in Jenkins if a prior stage condition is true

I have a piece of Groovy code in Jenkins that looks up new software versions. If a new version is found, it downloads it; if there are no new versions available, the job stops.
I was wondering if there is a less messy way to stop all remaining stages if the condition that compares versions is true.
Here is my code:
pipeline {
    ...
    stages {
        ...
        stage('Compare Versions') {
            steps {
                script {
                    // these two variables are the ones that I use as a boolean.
                    if (CURRENT_VERSION == NEW_VERSION) {
                        echo "Current Version $CURRENT_VERSION, Available Version $NEW_VERSION"
                        echo 'The current version is already the latest one. Cancelling update.'
                        currentBuild.result = 'SUCCESS'
                        error 'Cancelled Job'
                    }
                }
            }
        }
    }
}
The problem with the code above is that it causes a train wreck of Java errors because the remaining stages still try, unsuccessfully, to run. I need to improve this code without using any plugins.
Any assistance would be helpful.

Want to deploy to servers in parallel using Groovy

I am trying to deploy to a list of servers in parallel to save some time. The names of the servers are listed in a collection: serverNames
The original code was:
serverNames.each({
    def server = new Server([steps: steps, hostname: it, domain: "test"])
    server.stopTomcat()
    server.ssh("rm -rf ${WEB_APPS_DIR}/pc*")
    PLMScriptUtils.secureCopy(steps, warFileLocation, it, WEB_APPS_DIR)
})
Basically I want to stop Tomcat, remove the old files, and then copy a WAR file to a location using the following lines:
server.stopTomcat()
server.ssh("rm -rf ${WEB_APPS_DIR}/pc*")
PLMScriptUtils.secureCopy(steps, warFileLocation, it, WEB_APPS_DIR)
The original code worked properly: it took one server at a time from the collection serverNames and ran the 3 lines to do the deploy.
But now I have a requirement to run the deployment to the servers listed in serverNames in parallel.
Below is my modified code:
def threads = []
def th
serverNames.each({
    def server = new Server([steps: steps, hostname: it, domain: "test"])
    th = new Thread({
        steps.echo "doing deployment"
        server.stopTomcat()
        server.ssh("rm -rf ${WEB_APPS_DIR}/pc*")
        PLMScriptUtils.secureCopy(steps, warFileLocation, it, WEB_APPS_DIR)
    })
    threads << th
})
threads.each {
    steps.echo "joining thread"
    it.join()
}
threads.each {
    steps.echo "starting thread"
    it.start()
}
The echo statements were added to visualize the flow.
With this, the output is:
joining thread
joining thread
joining thread
joining thread
starting thread
starting thread
starting thread
starting thread
The collection contained 4 servers, so 4 threads are added and started. However, the 3 lines I want to run in parallel are not executed: "doing deployment" is never printed, and the build later fails with an exception.
Note that I am running this Groovy code as a pipeline through Jenkins. This whole piece of code is actually a function called deploy of the class deployment; my Jenkins pipeline creates an object of the class deployment and then calls the deploy function.
Can anyone help me with this? I am completely stuck on this one. :-(
Have a look at the parallel step. In scripted pipelines (which you seem to be using), you can pass it a map of thread name to action (as a Groovy closure) which is then run in parallel.
deployActions = [
    Server1: {
        // stop tomcat etc.
    },
    Server2: {
        ...
    }
]
parallel deployActions
It is much simpler and the recommended way of doing it.

Cypress: interrupt all tests on first failure

How to interrupt all Cypress tests on the first test failure?
We are using Semaphore to launch the complete e2e tests with Cypress for each PR, but it takes too much time.
I'd like to interrupt all tests on the first test failure.
Getting the complete errors is each developer's business when they develop. I just want to be informed ASAP if there is anything wrong prior to deploy, and don't have to wait for the full tests to complete.
So far the only solution I came up with was interrupting the tests on the current spec file with Cypress.
afterEach(() => {
    if (this.currentTest.state === 'failed') {
        Cypress.runner.end();
    }
});
But this is not enough since it only interrupts the tests located on the spec file, not ALL the other files. I've done some intensive search on this matter today and it doesn't seem like this is a thing on Cypress.
So I'm trying other solutions.
1: with Semaphore
fail_fast:
  stop:
    when: "true"
It is supposed to interrupt the script on error. But it doesn't work: tests keep running after error. My guess is that Cypress will throw an error only when all tests are complete.
2: maybe with the script launching Cypress, but I'm out of ideas
Right now here are my scripts
"cy:run": "npx cypress run",
"cy:run:dev": "CYPRESS_env=dev npx cypress run",
"cy:test": "start-server-and-test start http-get://localhost:4202 cy:run"
EDIT: It seems this feature has since been introduced, but it requires a paid version of Cypress (Business Plan). More about it: Docs, comment in the thread
Original answer:
This has been a long-requested feature in Cypress that, for some reason, has still not been introduced. There are some workarounds proposed by the community, but they are not guaranteed to work. Check this thread on Cypress' GitHub for more details; maybe you will find a workaround that works for your case.
The solution by @user3504541 is excellent! Thanks a ton. I had already started giving up on using Cypress since these issues keep popping up. But in any case, here's my config:
support/index.ts
declare global {
    // eslint-disable-next-line
    namespace Cypress {
        interface Chainable {
            interrupt: () => void
        }
    }
}
function abortEarly() {
    if (this.currentTest.state === 'failed') {
        return cy.task('shouldSkip', true)
    }
    cy.task('shouldSkip').then(value => {
        if (value) return cy.interrupt()
    })
}
commands/index.ts
Cypress.Commands.add('interrupt', () => {
    eval("window.top.document.body.querySelector('header button.stop').click()")
})
In my case the Cypress tests were left pending indefinitely on the CI (Github action workflow) but with this fix they interrupt properly.
A little hack that worked for me
Cypress.Commands.add('interrupt', () => {
    eval("window.top.document.body.querySelector('header button.stop').click()");
});
This is available as the Auto Cancellation feature, which is part of Smart Orchestration, but it is only available on the Business Plan. From the Auto Cancellation docs:
Continuous Integration (CI) pipelines are typically costly processes that can demand significant compute time. When a test failure occurs in CI, it often does not make sense to continue running the remainder of a test suite since the process has to start again upon merging of subsequent fixes and other code changes. When Auto Cancellation is enabled, once the number of failed tests goes over a preset threshold, the entire test run is canceled. Note that any in-progress specs will continue to run to completion.

Amazon EMR - how to set a timeout for a step

Is there a way to set a timeout for a step in Amazon AWS EMR?
I'm running a batch Apache Spark job on EMR and I would like the job to stop with a timeout if it doesn't end within 3 hours.
I cannot find a way to set a timeout in Spark, in YARN, or in the EMR configuration.
Thanks for your help!
I would like to offer an alternative approach that avoids adding timeout/shutdown logic to the application and making it more complex than needed, although I am obviously quite late to the party. Maybe it proves useful for someone in the future.
You can:
write a Python script and use it as a wrapper around regular Yarn commands
execute those Yarn commands via subprocess lib
parse their output according to your will
decide which Yarn applications should be killed
More details about what I am talking about follow...
Python wrapper script and running the Yarn commands via subprocess lib
import subprocess
running_apps = subprocess.check_output(['yarn', 'application', '--list', '--appStates', 'RUNNING'], universal_newlines=True)
This snippet would give you an output similar to something like this:
Total number of applications (application-types: [] and states: [RUNNING]):1
Application-Id Application-Name Application-Type User Queue State Final-State Progress Tracking-URL
application_1554703852869_0066 HIVE-645b9a64-cb51-471b-9a98-85649ee4b86f TEZ hadoop default RUNNING UNDEFINED 0% http://ip-xx-xxx-xxx-xx.eu-west-1.compute.internal:45941/ui/
You can then parse this output (beware, there might be more than one app running) and extract the application-id values.
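The parsing step could look roughly like this. It is only a sketch, and it assumes that application IDs always have the form application_<digits>_<digits>, as in the sample listing above. The name extract_app_ids is made up for this example.

```python
import re
from typing import List

# Assumption: YARN application IDs look like "application_<digits>_<digits>",
# as in the sample listing above.
APP_ID_PATTERN = re.compile(r"\bapplication_\d+_\d+\b")


def extract_app_ids(listing: str) -> List[str]:
    """Return the application IDs found in the listing, in order, without duplicates."""
    seen: List[str] = []
    for app_id in APP_ID_PATTERN.findall(listing):
        if app_id not in seen:
            seen.append(app_id)
    return seen


# Sample taken from the listing shown above.
sample_listing = (
    "Total number of applications (application-types: [] and states: [RUNNING]):1\n"
    "Application-Id Application-Name Application-Type User Queue State Final-State Progress Tracking-URL\n"
    "application_1554703852869_0066 HIVE-645b9a64-cb51-471b-9a98-85649ee4b86f TEZ hadoop default "
    "RUNNING UNDEFINED 0% http://ip-xx-xxx-xxx-xx.eu-west-1.compute.internal:45941/ui/\n"
)
app_ids = extract_app_ids(sample_listing)
```

Matching on the ID pattern rather than splitting columns means the header lines and application names containing spaces do not need special handling.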
Then, for each of those application ids, you can invoke another yarn command to get more details about the specific application:
app_status_string = subprocess.check_output(['yarn', 'application', '--status', app_id], universal_newlines=True)
Output of this command should be something like this:
Application Report :
Application-Id : application_1554703852869_0070
Application-Name : com.organization.YourApp
Application-Type : HIVE
User : hadoop
Queue : default
Application Priority : 0
Start-Time : 1554718311926
Finish-Time : 0
Progress : 10%
State : RUNNING
Final-State : UNDEFINED
Tracking-URL : http://ip-xx-xxx-xxx-xx.eu-west-1.compute.internal:40817
RPC Port : 36203
AM Host : ip-xx-xxx-xxx-xx.eu-west-1.compute.internal
Aggregate Resource Allocation : 51134436 MB-seconds, 9284 vcore-seconds
Aggregate Resource Preempted : 0 MB-seconds, 0 vcore-seconds
Log Aggregation Status : NOT_START
Diagnostics :
Unmanaged Application : false
Application Node Label Expression : <Not set>
AM container Node Label Expression : CORE
With this report you can also extract the application's start time, compare it with the current time, and see how long it has been running.
If it has been running for longer than some threshold number of minutes, you kill it.
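A rough sketch of that age check follows. It assumes the Start-Time field is a Unix timestamp in milliseconds, which matches the sample value above. The helper names and the 180-minute default are choices made for this example only.

```python
import re
from typing import Optional


def parse_start_time_ms(status_report: str) -> Optional[int]:
    """Extract the Start-Time value (assumed to be epoch milliseconds) from a status report."""
    match = re.search(r"^\s*Start-Time\s*:\s*(\d+)\s*$", status_report, re.MULTILINE)
    return int(match.group(1)) if match else None


def running_minutes(status_report: str, now_ms: int) -> Optional[float]:
    """Return how many minutes the application has been running, or None if unknown."""
    start_ms = parse_start_time_ms(status_report)
    if start_ms is None:
        return None
    return (now_ms - start_ms) / 60000.0


def should_kill(status_report: str, now_ms: int, threshold_minutes: float = 180) -> bool:
    """True when the application has been running longer than the threshold."""
    minutes = running_minutes(status_report, now_ms)
    return minutes is not None and minutes > threshold_minutes


# Fragment of the sample report shown above.
sample_report = (
    "Application Report :\n"
    "Application-Id : application_1554703852869_0070\n"
    "Start-Time : 1554718311926\n"
    "Finish-Time : 0\n"
    "State : RUNNING\n"
)
```

In the real script, now_ms would come from int(time.time() * 1000). Passing it in as a parameter keeps the decision easy to test.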
How do you kill it?
Easy.
kill_output = subprocess.check_output(['yarn', 'application', '--kill', app_id], universal_newlines=True)
This should be it, from the killing of the step/application perspective.
Automating the approach
AWS EMR has a wonderful feature called "bootstrap actions".
It runs a set of actions on EMR cluster creation and can be utilized for automating this approach.
Add a bash script to bootstrap actions which is going to:
download the python script you just wrote to the cluster (master node)
add the python script to a crontab
That should be it.
P.S.
I assumed Python3 is at our disposal for this purpose.
Well, as many have already answered, an EMR step cannot be killed/stopped/terminated via an API call at this moment.
But to achieve your goals, you can introduce a timeout as part of your application code itself. When you submit EMR steps, a child process is created to run your application - be it MapReduce Application, Spark Application, etc. and the step completion is determined by the exit code this child process (which is your application) returns.
For example, if you are submitting a MapReduce Application, you can use something like below :
// Needs Callable, ExecutorService, Executors, Future, TimeUnit,
// TimeoutException and ExecutionException from java.util.concurrent.
FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
// Run the job in a worker thread; waitForCompletion blocks until the job ends.
final Callable<Boolean> stuffToDo = new Callable<Boolean>() {
    @Override
    public Boolean call() throws Exception {
        return job.waitForCompletion(true);
    }
};
final ExecutorService executor = Executors.newSingleThreadExecutor();
final Future<Boolean> future = executor.submit(stuffToDo);
executor.shutdown(); // This does not cancel the already-scheduled task.
boolean succeeded = false;
try {
    succeeded = future.get(180, TimeUnit.MINUTES);
}
catch (InterruptedException ie) {
    /* Handle the interruption. Or ignore it. */
}
catch (ExecutionException ee) {
    /* Handle the error. Or ignore it. */
}
catch (TimeoutException te) {
    /* Timed out: kill the job so that the step can end. */
    job.killJob();
}
System.exit(succeeded ? 0 : 1);
Reference - Java: set timeout on a certain block of code?.
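For a Spark job, a similar effect could be sketched with a small Python launcher that runs the submit command as a child process and gives up after a timeout. This is an illustration, not an EMR feature. The function name and the exit code 124 (borrowed from the GNU timeout convention) are assumptions made for this example. Note that when the application runs in YARN cluster mode, killing the local submit process may leave the application running on YARN, in which case it would still have to be killed separately.

```python
import subprocess
from typing import List

# Arbitrary non-zero code used to signal a timeout (same value GNU timeout uses).
TIMEOUT_EXIT_CODE = 124


def run_with_timeout(command: List[str], timeout_seconds: float) -> int:
    """Run the command and return its exit code, or TIMEOUT_EXIT_CODE on timeout.

    subprocess.run kills the child process itself when the timeout expires
    and then raises TimeoutExpired, which is translated into an exit code here.
    """
    try:
        completed = subprocess.run(command, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        return TIMEOUT_EXIT_CODE
    return completed.returncode


# Hypothetical usage as the step's entry point (the script name is made up):
#   sys.exit(run_with_timeout(["spark-submit", "my_job.py"], 3 * 60 * 60))
```

Because the step's outcome is determined by the exit code of the child process, returning a non-zero code on timeout marks the step as failed.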
Hope this helps.

Have Jenkins Fail Fast When Node Is Offline

I have a MultiJob Project (made with the Jenkins Multijob plugin), with a series of MultiJob Phases. Let's say one of these jobs is called SubJob01. The jobs that are built are each configured with the "Restrict where this project can be run" option to be tied to one node. SubJob01 is tied to Slave01.
I would like it if these jobs would fail fast when the node is offline, instead of saying "(pending—slave01 is offline)". Specifically, I want there to be a record of the build attempt in SubJob01, with the build being marked as failed. This way, I can configure my MultiJob project to handle the situation as I'd like, instead of using the Jenkins build timeout plugin to abort the whole thing.
Does anyone know of a way to fail-fast a build if all nodes are offline? I could intersperse the MultiJob project with system Groovy scripts to check whether the desired nodes are offline, but that seems like it'd be reinventing, in the wrong place, what should already be a feature.
I ended up creating this solution which has worked well. The first build step of SubJob01 is an Execute system Groovy script, and this is the script:
import java.util.regex.Matcher
import java.util.regex.Pattern
int exitcode = 0
println("Looking for Offline Slaves:");
for (slave in hudson.model.Hudson.instance.slaves) {
    if (slave.getComputer().isOffline().toString() == "true"){
        println(' * Slave ' + slave.name + " is offline!");
        if (slave.name == "Slave01") {
            println(' !!!! This is Slave01 !!!!');
            exitcode++;
        } // if slave.name
    } // if slave offline
} // for slave in slaves
println("\n\n");
println "Slave01 is offline: " + hudson.model.Hudson.instance.getNode("Slave01").getComputer().isOffline().toString();
println("\n\n");
if (exitcode > 0){
    println("The Slave01 slave is offline - we can not possibly continue....");
    println("Please contact IT to resolve the slave down issue before retrying the build.");
    return 1;
} // if
println("\n\n");
In a declarative Jenkins pipeline, setting 'beforeAgent true' inside a when block makes Jenkins evaluate the when condition before entering the agent, so a stage whose condition is false is skipped without waiting for a node.
stage('Windows') {
    when {
        beforeAgent true
        expression { return ("${TARGET_NODES}".contains("windows")) }
    }
    agent { label 'win10' }
    steps {
        cleanWs()
        ...
    }
}
Ref:
https://www.jenkins.io/doc/book/pipeline/syntax/
https://www.jenkins.io/blog/2018/04/09/whats-in-declarative/
