I have a little issue. I'm creating a WPF GUI with PowerShell code. I have a function that performs a task on multiple computers (using parallel workflows). The issue is that while the task is running, my UI freezes until the task is complete.
I would like to work around this with jobs, but I am unable to receive my job's output when the task has ended.
Here is my simplified code:
parallelPingComputer -ips $ip_list | Select-Object date, Computer, result | Out-GridView
The function:
workflow parallelPingComputer {
    Param($ips)
    foreach -parallel ($ip in $ips)
    {
        PingComputer $ip
    }
}
And finally, "PingComputer($ip)" is just a ping plus another task on multiple targets.
I tried adding -AsJob after the parallel ping, but I'm not able to get the job result back when it ends (and not before...).
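For reference, the pattern being attempted looks roughly like this - a minimal sketch, not the original code, assuming the workflow and $ip_list above - using the job's StateChanged event so the result is received only once the job has ended, without blocking the WPF dispatcher thread:
#Sketch: run the workflow as a job and collect its output on completion
$job = parallelPingComputer -ips $ip_list -AsJob
Register-ObjectEvent -InputObject $job -EventName StateChanged -Action {
    if ($Sender.State -eq 'Completed') {
        $Sender | Receive-Job | Select-Object date, Computer, result | Out-GridView
        Unregister-Event -SourceIdentifier $EventSubscriber.SourceIdentifier
        $Sender | Remove-Job
    }
} | Out-Null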
Can you please help me? :)
Thanks a lot
I am trying to deploy to a list of servers in parallel to save some time. The names of the servers are listed in a collection: serverNames
The original code was:
serverNames.each({
    def server = new Server([steps: steps, hostname: it, domain: "test"])
    server.stopTomcat()
    server.ssh("rm -rf ${WEB_APPS_DIR}/pc*")
    PLMScriptUtils.secureCopy(steps, warFileLocation, it, WEB_APPS_DIR)
})
Basically I want to stop Tomcat, remove the old files, and then copy a WAR file to a location, using the following lines:
server.stopTomcat()
server.ssh("rm -rf ${WEB_APPS_DIR}/pc*")
PLMScriptUtils.secureCopy(steps, warFileLocation, it, WEB_APPS_DIR)
The original code worked properly: it took 1 server from the collection serverNames and performed the 3 lines to do the deploy.
But now I have a requirement to run the deployment to the servers listed in serverNames in parallel.
Below is my new modified code:
def threads = []
def th
serverNames.each({
    def server = new Server([steps: steps, hostname: it, domain: "test"])
    th = new Thread({
        steps.echo "doing deployment"
        server.stopTomcat()
        server.ssh("rm -rf ${WEB_APPS_DIR}/pc*")
        PLMScriptUtils.secureCopy(steps, warFileLocation, it, WEB_APPS_DIR)
    })
    threads << th
})
threads.each {
    steps.echo "joining thread"
    it.join()
}
threads.each {
    steps.echo "starting thread"
    it.start()
}
The echo statements were added to visualize the flow.
With this, the output comes out as:
joining thread
joining thread
joining thread
joining thread
starting thread
starting thread
starting thread
starting thread
There are 4 servers in the collection, hence the thread is added and started 4 times. But it is not executing the 3 lines I want to run in parallel, which means "doing deployment" is never printed, and later the build fails with an exception.
Note that I am running this Groovy code as a pipeline through Jenkins. This whole piece of code is actually a function called deploy of the class deployment, and my pipeline in Jenkins creates an object of the class deployment and then calls the deploy function.
Can anyone help me with this? I am stuck like hell with this one. :-(
Have a look at the parallel step. In scripted pipelines (which you seem to be using), you can pass it a map of thread name to action (as a Groovy closure), which is then run in parallel. (As an aside, your code joins the threads before starting them, and raw Java threads generally don't work inside CPS-transformed pipeline code anyway.)
deployActions = [
    Server1: {
        // stop tomcat etc.
    },
    Server2: {
        // ...
    }
]
parallel deployActions
It is much simpler and the recommended way of doing it.
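Applied to the question, the map can be built from the serverNames collection. Here is a sketch, assuming Server, PLMScriptUtils, steps, warFileLocation, and WEB_APPS_DIR as defined in the question (a named closure parameter is used because an inner closure would otherwise see its own implicit it):
def deployActions = [:]
serverNames.each { name ->
    deployActions[name] = {
        def server = new Server([steps: steps, hostname: name, domain: "test"])
        server.stopTomcat()
        server.ssh("rm -rf ${WEB_APPS_DIR}/pc*")
        PLMScriptUtils.secureCopy(steps, warFileLocation, name, WEB_APPS_DIR)
    }
}
parallel deployActions
Note that if this runs inside your class rather than directly in the Jenkinsfile, you would invoke the step through the steps object, i.e. steps.parallel deployActions.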
I'm currently learning about runspaces in PowerShell (my end goal is to set up a job scheduling system). To do this, I wrote a basic script in order to learn and use runspaces.
What I Expected To Happen:
I expected that when I run the code up to the commented line, it would queue up the 8 jobs and run them within the RunspacePool, running a maximum of 2 at a time.
Running the single line $JobList.AsynchronousObject a few times should then show more and more IsCompleted flags turning from false to true as the jobs complete, since they take 20 seconds each due to the Start-Sleep command.
The BeginInvoke command apparently returns an object implementing the IAsyncResult interface:
https://learn.microsoft.com/en-us/dotnet/api/system.iasyncresult?redirectedfrom=MSDN&view=netframework-4.8#examples
The IAsyncResult remarks mention polling the IsCompleted property to see whether an asynchronous operation has completed, which, although not ideal, is what I was trying to do below for learning purposes.
Actual:
All the IsCompleted flags are true a second after running the top portion of the code, which is not what I expected.
Question:
Does the IsCompleted flag just represent whether the script has started executing, and maybe that is why they're all true a second after queuing up?
I'm grateful for any assistance or references to further reading anyone is able to provide.
Many Thanks
Nick
#Set up the runspace pool
$RunspacePool = [runspacefactory]::CreateRunspacePool()
[void]$RunspacePool.SetMinRunspaces(1)
[void]$RunspacePool.SetMaxRunspaces(2)

#Create an ArrayList to hold references to all the instances running jobs
$JobList = New-Object System.Collections.ArrayList

#Queue up 8 jobs that will take 20 seconds each to complete
#Add the job details to the list so I can poll its IsCompleted property
$RunspacePool.Open()
1..8 | ForEach {
    Write-Verbose "Counter: $_" -Verbose
    $PowershellInstance = [powershell]::Create()
    $PowershellInstance.RunspacePool = $RunspacePool
    [void]$PowershellInstance.AddScript({
        Start-Sleep -Seconds 20
        $ThreadID = [appdomain]::GetCurrentThreadId()
        Write-Verbose "$ThreadID thread completed" -Verbose
    })
    $AsynchronousObject = $PowershellInstance.BeginInvoke()
    [void]$JobList.Add([PSCustomObject]@{
        Id = $_
        PowerShellInstance = $PowershellInstance
        AsynchronousObject = $AsynchronousObject
    })
}

#----------------------------------------------
#Listing IsCompleted should show true as jobs become complete
$JobList.AsynchronousObject

#Clean up
$RunspacePool.Close()
$RunspacePool.Dispose()
There is no issue. You are forgetting what Asynchronous really means.
When you launch asynchronous jobs, they don't block the current thread (i.e. your current PowerShell prompt); instead, they create a new thread and run from there. The whole point of asynchronous jobs is that you can run multiple things at once.
So what happens is that the runspace pool is created, everything gets set up, the jobs are queued and start to run on new threads, and the script keeps going (everything is async and running on separate threads). It then goes right on to execute the last three lines:
#Listing IsCompleted should show true as jobs become complete
$JobList.AsynchronousObject
#Clean up
$RunspacePool.Close()
$RunspacePool.Dispose()
This kills the runspace pool and disposes of it, thereby "completing" the jobs.
If you run everything up to the commented line first and then start watching $JobList.AsynchronousObject from the PowerShell prompt, you will see it stepping through the jobs as expected.
Once they are complete, you can execute the final two lines to close and dispose of your runspace pool.
You will have to look at the wait functions (for example, EndInvoke or the AsyncWaitHandle) if you want things to block until the jobs are done.
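A minimal sketch of that collection step, based on the $JobList built above:
#Wait until every handle reports IsCompleted, then collect the output via EndInvoke
while ($JobList | Where-Object { -not $_.AsynchronousObject.IsCompleted }) {
    Start-Sleep -Milliseconds 500
}
foreach ($job in $JobList) {
    $job.PowerShellInstance.EndInvoke($job.AsynchronousObject) # blocks and returns that job's output
    $job.PowerShellInstance.Dispose()
}
$RunspacePool.Close()
$RunspacePool.Dispose()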
I've been working with Node for the first time in a while and stumbled upon node-schedule, which for the most part has been a breeze. However, I've found resuming a scheduled task after canceling it via job.cancel() pretty difficult.
For the record, I'm using node-schedule to perform specific actions at a specific date (non-recurring); under some circumstances I cancel the task, but I would later like to resume it.
I tried using job.cancel(true) after cancelling it via plain job.cancel() first, as the documentation states that this would reschedule the task, but it has not worked for me. Using job.reschedule() after having cancelled the job first yields the same result.
I could probably come up with an inelegant solution, but I thought I'd ask if anyone knows of an elegant one first.
It took me a while to understand the node-schedule documentation ^^
To un-cancel a job, you have to pass reschedule some options.
If you don't pass anything to reschedule, the function returns false (an error occurred).
For example, you can declare the options and pass that variable like this:
const schedule = require('node-schedule');
let options = {rule: '*/1 * * * * *'}; // Declare schedule rules
let job = schedule.scheduleJob(options, () => {
    console.log('Job processing!');
});
job.cancel();             // Cancel the job
job.reschedule(options);  // Reschedule the job
Hope it helps.
Is there a way to set a timeout for a step in Amazon AWS EMR?
I'm running a batch Apache Spark job on EMR and I would like the job to stop with a timeout if it doesn't end within 3 hours.
I cannot find a way to set a timeout in Spark, in YARN, or in the EMR configuration.
Thanks for your help!
I would like to offer an alternative approach, without any timeout/shutdown logic making the application itself more complex than needed - although I am obviously quite late to the party. Maybe it will prove useful for someone in the future.
You can:
write a Python script and use it as a wrapper around regular Yarn commands
execute those Yarn commands via subprocess lib
parse their output according to your will
decide which Yarn applications should be killed
More details about what I am talking about follow...
Python wrapper script and running the Yarn commands via subprocess lib
import subprocess
running_apps = subprocess.check_output(['yarn', 'application', '--list', '--appStates', 'RUNNING'], universal_newlines=True)
This snippet would give you output similar to this:
Total number of applications (application-types: [] and states: [RUNNING]):1
Application-Id Application-Name Application-Type User Queue State Final-State Progress Tracking-URL
application_1554703852869_0066 HIVE-645b9a64-cb51-471b-9a98-85649ee4b86f TEZ hadoop default RUNNING UNDEFINED 0% http://ip-xx-xxx-xxx-xx.eu-west-1.compute.internal:45941/ui/
You can then parse this output (beware there might be more than one app running) and extract the application-id values.
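For instance, a minimal sketch of that parsing, given the listing format above:
# Application ids are the first column of rows starting with "application_"
app_ids = [line.split()[0] for line in running_apps.splitlines()
           if line.startswith('application_')]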
Then, for each of those application ids, you can invoke another yarn command to get more details about the specific application:
app_status_string = subprocess.check_output(['yarn', 'application', '--status', app_id], universal_newlines=True)
Output of this command should be something like this:
Application Report :
Application-Id : application_1554703852869_0070
Application-Name : com.organization.YourApp
Application-Type : HIVE
User : hadoop
Queue : default
Application Priority : 0
Start-Time : 1554718311926
Finish-Time : 0
Progress : 10%
State : RUNNING
Final-State : UNDEFINED
Tracking-URL : http://ip-xx-xxx-xxx-xx.eu-west-1.compute.internal:40817
RPC Port : 36203
AM Host : ip-xx-xxx-xxx-xx.eu-west-1.compute.internal
Aggregate Resource Allocation : 51134436 MB-seconds, 9284 vcore-seconds
Aggregate Resource Preempted : 0 MB-seconds, 0 vcore-seconds
Log Aggregation Status : NOT_START
Diagnostics :
Unmanaged Application : false
Application Node Label Expression : <Not set>
AM container Node Label Expression : CORE
Having this, you can also extract the application's start time, compare it with the current time, and see how long it has been running. If it has been running for more than some threshold number of minutes, you kill it.
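For example, a sketch of that check (Start-Time in the report above is in epoch milliseconds):
import re
import time

# Extract the epoch-millisecond start time from the status report
start_ms = int(re.search(r'Start-Time\s*:\s*(\d+)', app_status_string).group(1))
running_minutes = (time.time() - start_ms / 1000) / 60
should_kill = running_minutes > 180  # e.g. the 3-hour limit from the question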
How do you kill it?
Easy.
kill_output = subprocess.check_output(['yarn', 'application', '--kill', app_id], universal_newlines=True)
This should be it, from the killing of the step/application perspective.
Automating the approach
AWS EMR has a wonderful feature called "bootstrap actions".
It runs a set of actions on EMR cluster creation and can be utilized for automating this approach.
Add a bash script to the bootstrap actions (a sketch follows below) which is going to:
download the python script you just wrote to the cluster (master node)
add the python script to a crontab
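A minimal sketch of such a bootstrap script; the S3 location, script name, and cron schedule are placeholders:
#!/bin/bash
# Fetch the watchdog script written above and run it every 5 minutes via cron
aws s3 cp s3://your-bucket/yarn_watchdog.py /home/hadoop/yarn_watchdog.py
(crontab -l 2>/dev/null; echo "*/5 * * * * python3 /home/hadoop/yarn_watchdog.py") | crontab -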
That should be it.
P.S.
I assumed Python3 is at our disposal for this purpose.
Well, as many have already answered, an EMR step cannot be killed/stopped/terminated via an API call at this moment.
But to achieve your goal, you can introduce a timeout as part of your application code itself. When you submit EMR steps, a child process is created to run your application - be it a MapReduce application, a Spark application, etc. - and the step's completion is determined by the exit code that this child process (which is your application) returns.
For example, if you are submitting a MapReduce application, you can use something like the below:
FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));

final Runnable stuffToDo = new Thread() {
    @Override
    public void run() {
        try {
            job.submit();
        } catch (Exception e) {
            // job.submit() throws checked exceptions, which run() cannot declare
            throw new RuntimeException(e);
        }
    }
};

final ExecutorService executor = Executors.newSingleThreadExecutor();
final Future<?> future = executor.submit(stuffToDo);
executor.shutdown(); // This does not cancel the already-scheduled task.

try {
    future.get(180, TimeUnit.MINUTES);
}
catch (InterruptedException ie) {
    /* Handle the interruption. Or ignore it. */
}
catch (ExecutionException ee) {
    /* Handle the error. Or ignore it. */
}
catch (TimeoutException te) {
    /* Handle the timeout. Or ignore it. */
}
System.exit(job.waitForCompletion(true) ? 0 : 1);
Reference - Java: set timeout on a certain block of code?.
Hope this helps.
I have a list on which I have an ItemUpdated handler.
When I edit using the datasheet view and modify every item, the ItemUpdated event will obviously run for every single item.
In my ItemUpdated event, I want it to check if there is a Timer Job scheduled to run. If there is, then extend the SPOneTimeSchedule schedule of this job to delay it by 5 seconds. If there isn't, then create the Timer Job and schedule it for 5 seconds from now.
I've tried looking to see if the job definition exists in the handler; if it does exist, I extend the schedule by 5 seconds. If it doesn't exist, I create the job definition to run in a minute's time.
MyTimerJob rollupJob = null;
foreach (SPJobDefinition job in web.Site.WebApplication.JobDefinitions)
{
    if (job.Name == Constants.JOB_ROLLUP_NAME)
    {
        rollupJob = (MyTimerJob)job;
    }
}
if (rollupJob == null)
{
    rollupJob = new MyTimerJob(Constants.JOB_ROLLUP_NAME, web.Site.WebApplication);
}
SPOneTimeSchedule schedule = new SPOneTimeSchedule(DateTime.Now.AddSeconds(5));
rollupJob.Schedule = schedule;
rollupJob.Update();
When I try this out on the server, I get a lot of errors:
"An update conflict has occurred, and you must re-try this action. The object MyTimerJob Name=MyTimerJobName Parent=SPWebApplication Name=SharePoint -80 is being updated by NT AUTHORITY\NETWORK SERVICE in the w3wp process."
I think the job is probably running for the first time, and once it is running, the other ItemUpdated events come in and find the existing job definition. The handler then tries to update this definition even though it is currently in use. Should I use a new job definition name so that it doesn't step on top of the first? Or raise the time to a minute?
I solved this myself by just setting the delay to a minute's time from now, regardless of whether a definition is found. This way, while it is busy, it keeps pushing back the scheduling of the job until it is done processing.
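In terms of the code above, the fix is just the schedule line (a sketch):
// Always push the job out a full minute, whether or not it already existed
SPOneTimeSchedule schedule = new SPOneTimeSchedule(DateTime.Now.AddMinutes(1));
rollupJob.Schedule = schedule;
rollupJob.Update();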
This is because the event is asynchronous. You'll need to rethink exactly what you're trying to solve with this code and potentially re-factor it.
Maybe you should try using "lock" on the timer job object?