How can I create jobs that yield results as they complete? - multithreading

The Problem
Suppose you have four machines:
Machine A is slow.
Machine B is medium speed.
Machine C is fast.
LocalHost is ultra-fast.
On each remote machine, you want to sum the first 1 million prime numbers. You can do this from the local host with:
$servers = @("MachineA","MachineB","MachineC")
Invoke-Command -ComputerName $servers -ScriptBlock {
Sum-FirstMillionPrimes
}
As this is written, results will not be displayed (yielded) until the slowest machine is finished.
To speed this up, you try to perform this as a job:
$servers = @("MachineA","MachineB","MachineC")
Invoke-Command -ComputerName $servers -ScriptBlock {
Sum-FirstMillionPrimes
} -AsJob
while ($null -ne (Get-Job)) {
$doneChildJob = Get-Job | Wait-Job -Any
$processResult = $doneChildJob | Receive-Job -AutoRemoveJob -Wait
$processResult
}
This still has the same problem, because according to the documentation (example 8):
The command uses the AsJob parameter to run the command as a background
job. This command returns a job object that contains two child job
objects, one for each of the jobs run on the two remote computers.
For us, this means that we are running three child jobs, but the parent job will not return until all child jobs are completed.
How can you write this in a way that the results from the child jobs will be yielded back as they finish?
What I've Tried
We have come up with a solution that appears to work, but this problem seems common enough that there should be a PowerShell way to handle this.
# Create a HashSet of jobs that have already been processed. This is important
# because child jobs cannot be removed via Remove-Job. There doesn't seem to be
# a way to determine if the job has been received
[System.Collections.Generic.HashSet[int]]$processedJobIds = @()
while ($null -ne (Get-Job)) {
# We only want to attempt to process jobs that have no children that we
# haven't seen. The -IncludeChildJob parameter allows us to see the nested
# children jobs from Invoke-Command -AsJob. Because we can't determine if a
# child job has already been received, we filter based on our above hashset.
$doneChildJob = Get-Job -IncludeChildJob | Where-Object { $_.ChildJobs.Count -eq 0 -and (-not ($processedJobIds.Contains($_.Id))) } | Wait-Job -Any
if ($null -eq $doneChildJob) {
# The $doneChildJob filter will exclude the parent job created by
# Invoke-Command -AsJob. However, we still need to eventually remove
# this job, otherwise we'd hit an infinite loop.
# The assumption is that the only way that $doneChildJob will evaluate to
# $null is if all child jobs have completed. If all child jobs are
# completed, the remaining job(s) should be safe to remove as they are
# expected to be parent jobs.
Get-Job | Remove-Job
}
else {
# We need to process the child jobs
$processResult = $doneChildJob | Receive-Job -Wait
$processResult
$processedJobIds.Add($doneChildJob.Id) | Out-Null
# By default, Get-Job does not return children jobs (i.e they are
# parents and can be removed by Remove-Job). Based on this behavior,
# if $processedJobIds contains any of these jobs, they are safe to
# remove, and should also be removed from our $processedJobIds list.
Get-Job | Where-Object { $processedJobIds.Contains($_.Id) } | ForEach-Object {
$processedJobIds.Remove($_.Id) | Out-Null
Remove-Job $_
}
}
}
We have run the code above against the following examples, and it appears to work:
Import-Module ThreadJob
$servers = @("MachineA", "MachineB", "MachineC")
$sessions = New-PSSession -ComputerName $servers
Invoke-Command -Session $sessions -ScriptBlock {
$computerName = [System.Net.Dns]::GetHostName()
$firstMillionPrimes = Sum-FirstMillionPrimes
Write-Output "$computerName - $firstMillionPrimes"
} -AsJob | Out-Null
# It should also handle when one of the child jobs fails but not all
Invoke-Command -ComputerName $servers -ScriptBlock {
$computerName = [System.Net.Dns]::GetHostName()
if ($computerName -eq "MachineA") {
Throw "This is a remote invoke FAILURE on $computerName"
}
else{
$computerName = [System.Net.Dns]::GetHostName()
$firstMillionPrimes = Sum-FirstMillionPrimes
Write-Output "$computerName - $firstMillionPrimes"
}
} -AsJob | Out-Null
# In addition to the jobs started on multiple sessions, this also needs
# to be robust enough to handle other jobs running locally.
Start-Job -ScriptBlock { Sum-FirstMillionPrimes } | Out-Null
# It also needs to handle jobs created by Start-ThreadJob
Start-ThreadJob -ScriptBlock { Sum-FirstMillionPrimes } | Out-Null
# It also needs to handle jobs that have a state of Failed
Start-ThreadJob -ScriptBlock { throw "My job State will be Failed" } | Out-Null
# It should handle nested jobs that are successful
Start-Job -ScriptBlock { Start-ThreadJob -ScriptBlock { Sum-FirstMillionPrimes } | Receive-Job -Wait} | Out-Null
Start-Job -ScriptBlock { Start-Job -ScriptBlock { Sum-FirstMillionPrimes } | Receive-Job -Wait} | Out-Null
Start-ThreadJob -ScriptBlock { Start-ThreadJob -ScriptBlock { Sum-FirstMillionPrimes } | Receive-Job -Wait} | Out-Null
# It should handle nested jobs that are failures
Start-Job -ScriptBlock { Start-ThreadJob -ScriptBlock { throw "Handles nested thread jobs that fail" } | Receive-Job -Wait} | Out-Null
Start-Job -ScriptBlock { Start-Job -ScriptBlock { throw "Handles nested jobs that fail" } | Receive-Job -Wait} | Out-Null
Start-ThreadJob -ScriptBlock { Start-ThreadJob -ScriptBlock { throw "Handles nested thread jobs in thread jobs that fail" } | Receive-Job -Wait} | Out-Null
Expected output (simulated). This is yielded back to the terminal as processing finishes. Exceptions appear almost instantaneously, while the results of long calculations may be interspersed as they complete:
This is a remote invoke FAILURE on MachineA
+ CategoryInfo : OperationStopped: (This is a remote invoke FAILURE on MachineA:String) [], RuntimeException
+ FullyQualifiedErrorId : This is a remote invoke FAILURE on MachineA
+ PSComputerName : MachineA
My job State will be Failed
+ CategoryInfo : InvalidResult: (:) [], RuntimeException
+ FullyQualifiedErrorId : JobStateFailed
Handles nested thread jobs that fail
+ CategoryInfo : InvalidResult: (:) [], RuntimeException
+ FullyQualifiedErrorId : JobStateFailed
Handles nested jobs that fail
+ CategoryInfo : InvalidResult: (:) [], RuntimeException
+ FullyQualifiedErrorId : JobStateFailed
Handles nested thread jobs in thread jobs that fail
+ CategoryInfo : InvalidResult: (:) [], RuntimeException
+ FullyQualifiedErrorId : JobStateFailed
Localhost - (FirstMillionPrimes)
MachineC - (FirstMillionPrimes)
Localhost - (FirstMillionPrimes)
Localhost - (FirstMillionPrimes)
MachineC - (FirstMillionPrimes)
Localhost - (FirstMillionPrimes)
MachineB - (FirstMillionPrimes)
Localhost - (FirstMillionPrimes)
MachineB - (FirstMillionPrimes)
MachineA - (FirstMillionPrimes)
The solution we've come up with appears to work, but it seems really heavy-handed. Is there a better way or pattern in PowerShell to yield the results as they complete?

Sounds like the PSRemotingJob.StateChanged Event might work for you. Something like this:
$global:results = [System.Collections.ArrayList]::new()
# create action scriptblock for eventhandling
$onJobFinish = {
# only run action if job has terminated
if ($Event.Sender.State -in @('Completed', 'Failed', 'Stopped', 'Suspended', 'Disconnected')) {
$localResults = $Event.Sender | Receive-Job
# immediately send output to screen
$localResults | Out-Host
# also add output to collection to work with later
$global:results.Add($localResults) | Out-Null
}
}
Invoke-Command -Session $sessions -ScriptBlock {
$computerName = [System.Net.Dns]::GetHostName()
$firstMillionPrimes = Sum-FirstMillionPrimes
Write-Output "$computerName - $firstMillionPrimes"
} -AsJob |
Select-Object -ExpandProperty ChildJobs | ForEach-Object {
# Register our action to run whenever a child job's state changes
Register-ObjectEvent -InputObject $_ -EventName 'StateChanged' -Action $onJobFinish
}
Start-Job -ScriptBlock { Sum-FirstMillionPrimes } | Select-Object -ExpandProperty ChildJobs | ForEach-Object {
# Register our action to run whenever a child job's state changes
Register-ObjectEvent -InputObject $_ -EventName 'StateChanged' -Action $onJobFinish
}
# access all results that have been received thus far
$global:results | Format-Table
Update
You can also do something like this where you just add all the jobs to a single collection and perform a loop while they are running/have data. You can output data as it is available this way instead of having to wait for job completion.
$jobs = @()
$jobs += Invoke-Command -ScriptBlock $sb -ComputerName $computers -AsJob
$jobs += Start-Job -ScriptBlock $sb2
$jobs += Start-ThreadJob -ScriptBlock $sb3
$results = [System.Collections.ArrayList]::new()
while ($jobs | Where-Object {
$_.State -notin @('Completed', 'Failed', 'Stopped', 'Suspended', 'Disconnected')
}) {
$localData = $jobs | Receive-Job
$localData | Format-Table
$results.Add($localData) | Out-Null
Start-Sleep -Seconds 1
}
# Add one more collection of data for good measure
$localData = $jobs | Receive-Job
$localData | Format-Table
$results.Add($localData) | Out-Null
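The "yield as they complete" behaviour can also be tried without any remote machines. The following is a minimal sketch, not part of the answer above: it uses local Start-Job jobs with made-up names and delays (SlowJob, MediumJob, FastJob) to stand in for the three machines, and loops on Wait-Job -Any so that each result is emitted as soon as its job finishes.

```powershell
# Hypothetical stand-ins for MachineA/B/C: local jobs that sleep for
# different lengths of time. Names and delays are made up for illustration.
$delays = [ordered]@{ SlowJob = 7; MediumJob = 4; FastJob = 1 }

$pending = @(
    foreach ($name in $delays.Keys) {
        Start-Job -Name $name -ArgumentList $name, $delays[$name] -ScriptBlock {
            param($name, $seconds)
            Start-Sleep -Seconds $seconds
            "$name finished after $seconds s"
        }
    }
)

# Records the order in which the jobs actually completed.
$completionOrder = [System.Collections.Generic.List[string]]::new()

while ($pending.Count -gt 0) {
    # Wait-Job -Any returns as soon as one of the pending jobs is done.
    $done = @($pending | Wait-Job -Any)
    foreach ($job in $done) {
        $completionOrder.Add($job.Name)
        Receive-Job -Job $job   # emit this job's output right away
        Remove-Job -Job $job
    }
    # Keep only the jobs that have not been handled yet.
    $pending = @($pending | Where-Object { $done.Id -notcontains $_.Id })
}
```

Because the loop tracks its own $pending list instead of calling Get-Job, unrelated jobs in the session are left alone. For Invoke-Command -AsJob, the parent's ChildJobs would presumably take the place of $pending; that variation is an assumption and is not shown here.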

Related

Need to run app on N remote hosts and handle them all

I'm using this code to run an app on N remote hosts, and I need to wait until all of them have finished and get their results. But execution just passes through: all jobs are marked Completed and the code exits.
How can I make it wait until the apps have finished executing?
$procArray = [System.Collections.ArrayList]@()
foreach ($key in $simulatorServers.keys) {
$unitHost = $simulatorServers[$key][0]
$EXE="C:\app.exe"
Wr-DebugReg "Running $EXE on $unitHost "
$ScriptString="Start-Process -FilePath $EXE "
$ScriptBlock=[System.Management.Automation.ScriptBlock]::Create($ScriptString)
$Session=New-PSSession -ComputerName $unitHost -EnableNetworkAccess -Name "session$counter" -Credential $crNonProd
$rez2 = Invoke-Command -Session $Session -ScriptBlock $ScriptBlock -AsJob
$rez00=$procArray.Add($rez2)
Wr-DebugReg "Running process id=$($rez2.id) name=$($proc.Name)on $unitHost"
Wr-DebugReg ""
}
$procArray | Wait-Job
$procArray | Receive-Job
These jobs go to status Completed even if the launched processes are still running.
Let Invoke-Command handle the number of jobs/sessions to open in parallel. You will receive one job with child jobs:
$scriptBlock = {
start-process -FilePath 'C:\app.exe'
}
$sessions = @(
foreach ($key in $simulatorServers.keys) {
$unitHost = $simulatorServers[$key][0]
New-PSSession -ComputerName $unitHost -EnableNetworkAccess -Name "session$counter" -Credential $crNonProd
}
)
$job = Invoke-Command -Session $Sessions -ScriptBlock $ScriptBlock -AsJob
$result = receive-job -Wait -Job $job
By the way, based on this sample I do not see what you want to receive. You want to execute "start-process -FilePath 'C:\app.exe'" on the target machines, but this won't give you anything back.
To get the information back, modify the script block like this:
$scriptBlock = {
$result = start-process -FilePath 'C:\app.exe' -wait -PassThru
return $result
}
This code works. -Wait is the key: it keeps each job running until the launched process has exited.
$procArray = [System.Collections.ArrayList]@()
foreach ($key in $hosts.keys) {
$unitHost = $hosts[$key][0]
$EXE="c:\app.exe"
$Session=New-PSSession -ComputerName $unitHost -EnableNetworkAccess -Name "session$counter" -Credential $crNonProd
$rez2 = Invoke-Command -Session $Session -ScriptBlock {Start-Process -FilePath $args[0] -Wait} -ArgumentList $EXE -AsJob
$rez00=$procArray.Add($rez2)
Wr-DebugReg "Starting process id=$($rez2.id) name=$($proc.Name)on $unitHost"
Wr-DebugReg ""
}
while ($procArray.State -contains 'Running')
{
Start-Sleep -Seconds 1
}
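The effect of -Wait and -PassThru can be checked locally. This is a minimal sketch under some assumptions: a local Start-Job stands in for the remote Invoke-Command -AsJob, and the PowerShell executable that runs the script stands in for C:\app.exe, with a made-up exit code of 3.

```powershell
# Path of the PowerShell executable running this script; a stand-in for
# C:\app.exe so that the sketch can run anywhere.
$exe = (Get-Process -Id $PID).Path

$job = Start-Job -ArgumentList $exe -ScriptBlock {
    param($exe)
    # -Wait blocks until the child process exits, so the job stays Running
    # for as long as the process does. -PassThru returns the Process
    # object, which lets the job report the exit code back to the caller.
    $proc = Start-Process -FilePath $exe `
        -ArgumentList '-NoProfile', '-Command', 'Start-Sleep 2; exit 3' `
        -Wait -PassThru
    $proc.ExitCode
}

# Blocks until the job is done, returns its output and removes the job.
$exitCode = $job | Receive-Job -Wait -AutoRemoveJob
```

Without -Wait, Start-Process returns as soon as the process has been launched, which is why the jobs in the question were marked Completed while the apps were still running.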

Register trackingevent for all background jobs?

Good afternoon,
I've been trying to register an event that fires when all jobs are completed. I'm able to successfully register one, but I'd like to get a message pop-up once all background jobs are completed. Is anyone familiar with how to do so?
I attempted the following, but it errors out saying jobs is null:
1..10 | ForEach-Object -Process {
Start-Job { Start-Sleep $_ } -Name "$_" | Out-Null} -OutVariable $jobs
Register-ObjectEvent $jobs StateChanged -Action {
[System.Windows.MessageBox]::Show('Done')
$eventSubscriber | Unregister-Event
$eventSubscriber.Action | Remove-Job
} | Out-Null
I feel like a Do{}Until() loop could do it, but I'm not sure how to register that to check until the job has completed. I also tried to follow along with how other people have done it in different languages, but I can't pick it up.
I don't want to post everything I've tried, so that this post doesn't bore anyone. I searched on Google as well, but I couldn't find much on registering an object event for multiple jobs.
EDIT
Here's what does work:
$job = Start-Job -Name GetLogFiles { Start-Sleep 10 }
Register-ObjectEvent $job StateChanged -Action {
[System.Windows.MessageBox]::Show('Done')
$eventSubscriber | Unregister-Event
$eventSubscriber.Action | Remove-Job
} | Out-Null
This is what I'd like to happen, but evaluating all jobs, not just one.
This is what I personally use when monitoring running jobs:
$jobs= 1..10 | ForEach-Object -Process {
Start-Job { Start-Sleep $using:_ ; "job {0} done" -f $using:_ } -Name "$_"
}
do{
$i = (Get-Job -State Completed).count
$progress = @{
Activity = 'Jobs completed'
Status = "$i of {0}" -f $jobs.Count
PercentComplete = $i / $jobs.count * 100
}
Write-Progress @progress
Start-Sleep -Milliseconds 10
}
until($i -eq $jobs.Count)
$result = Get-Job | Receive-Job
$jobs | Remove-Job
Of course, under certain scenarios where I know some jobs might fail I change the until(...) condition for something different and the do {...} contains the logic for restarting failing jobs.
Edit 1:
It's worth mentioning that Start-Job is not worth your time if you're interested in multithreading; it has been proven to be slower than a linear loop in many scenarios. You should be looking at the ThreadJob module.
Edit 2:
After some testing, this worked for me:
# Clear the Event queue
Get-EventSubscriber|Unregister-Event
# Clear the Job queue
Get-Job|Remove-Job
1..10 | ForEach-Object -Process {
$job = Start-Job { Sleep -Seconds (1..20|Get-Random) } -Name "$_"
Register-ObjectEvent -InputObject $job -EventName StateChanged -Action {
$eventSubscriber | Unregister-Event
$eventSubscriber.Action | Remove-Job
if(-not (Get-EventSubscriber))
{
[System.Windows.MessageBox]::Show('Done')
}
} | Out-Null
}
At first I didn't even know this was possible so thanks for pointing this out. Great question :)

Wait until all threads complete before running next task

I would wrap everything inside foreach($computer in $computers) in a Start-Job to make them run simultaneously. The only problem is, I need to wait for all the jobs to complete before I do the ConvertTo-Json at the bottom.
$sb = "OU=some,OU=ou,DC=some,DC=domain"
$computers = Get-ADComputer -Filter {(Enabled -eq $true)} -SearchBase "$sb" -Properties *
$hasmanufacturer = New-Object System.Collections.Generic.List[System.Object]
foreach($computer in $computers)
{
$drives = try{@(Get-WMIObject -Class Win32_CDROMDrive -Property * -ComputerName $computer.Name -ErrorAction Stop)} catch {$null}
foreach($drive in $drives)
{
if($drive.Manufacturer)
{
$hasmanufacturer.Add($computer)
continue
}
} # inner foreach
}
ConvertTo-Json $hasmanufacturer
Use a Get-Job | Wait-Job before executing the ConvertTo-Json
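As a rough illustration of that suggestion, here is a minimal sketch with local jobs. The computer names and the HasManufacturer flag are made up and stand in for the real WMI query; the jobs are also kept in a variable rather than fetched with Get-Job, which is a small deviation from the one-liner above.

```powershell
# Hypothetical stand-in for the per-computer WMI query: each job simply
# returns an object describing one made-up computer.
$computerNames = 'PC01', 'PC02', 'PC03'

$jobs = foreach ($name in $computerNames) {
    Start-Job -ArgumentList $name -ScriptBlock {
        param($name)
        [pscustomobject]@{ Name = $name; HasManufacturer = ($name -ne 'PC02') }
    }
}

# Block until every job has finished, then collect all output in one go.
$results = $jobs | Wait-Job | Receive-Job
$jobs | Remove-Job

# Keep only the matching computers. Select-Object also drops the extra
# remoting properties (such as RunspaceId) that job output carries.
$hasManufacturer = @(
    $results |
        Where-Object HasManufacturer |
        Sort-Object Name |
        Select-Object Name, HasManufacturer
)

# Only now, with all jobs done, convert the combined result.
$json = ConvertTo-Json $hasManufacturer
```

Holding on to the job objects in $jobs means that other jobs already present in the session are neither waited on nor received.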
How about passing the array of computer names as a parameter to Invoke-Command? By default, it runs up to 32 concurrent remote sessions. The number can be changed with the -ThrottleLimit parameter.
$computers = Get-ADComputer -Filter {(Enabled -eq $true)} -SearchBase "OU=Servers,DC=xxx,DC=com" -Properties Name |
Where-Object { $_.Name -like 'LAX_*' } |
ForEach-Object { $_.Name }
$computers
$j = Invoke-Command `
-ComputerName $computers `
-ScriptBlock { Get-WMIObject -Class Win32_CDROMDrive -Property * -ErrorAction Stop } `
-AsJob
# Job objects expose State, not Status, and Wait-Job already blocks until
# the job has finished, so no polling loop is needed.
Get-Job -Id $j.Id | Wait-Job
$results = Receive-Job -Id $j.Id
$results

Need to Multithread this script and log the results of each [duplicate]

I'm trying to run a virus scan on a list of servers in our environment. There are hundreds of machines, so we'd like to run the scan (using a command line prompt that we already have) around 10 at a time. We're totally new to PowerShell so any help would be really appreciated. We have a general idea of what commands we need to use -- here's how we think it might work for now:
$server = Get-Content "serverlist.txt"
$server | % {
$VirusScan = { Scan32.exe }
Invoke-Command -ScriptBlock { $VirusScan } -computerName $server -ThrottleLimit 10 -Authentication domain/admin
}
Does anyone have any suggestions on how we might orchestrate this?
I'm using something like this for running tasks in parallel on remote hosts:
$maxSlots = 10
$hosts = "foo", "bar", "baz", ...
$job = {
Invoke-Command -ScriptBlock { Scan32.exe } -ComputerName $args[0] -ThrottleLimit 10 -Authentication domain/admin
}
$queue = [System.Collections.Queue]::Synchronized((New-Object System.Collections.Queue))
$hosts | ForEach-Object { $queue.Enqueue($_) }
while ( $queue.Count -gt 0 -or @(Get-Job -State Running).Count -gt 0 ) {
$freeSlots = $maxSlots - @(Get-Job -State Running).Count
for ( $i = $freeSlots; $i -gt 0 -and $queue.Count -gt 0; $i-- ) {
Start-Job -ScriptBlock $job -ArgumentList $queue.Dequeue() | Out-Null
}
Get-Job -State Completed | ForEach-Object {
Receive-Job -Id $_.Id
Remove-Job -Id $_.Id
}
Sleep -Milliseconds 100
}
# Remove all remaining jobs.
Get-Job | ForEach-Object {
Receive-Job -Id $_.Id
Remove-Job -Id $_.Id
}
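The queue-and-slots pattern can be exercised locally. The sketch below is a variation on the code above, not a copy of it: the work items and the "scanned" message are made up, a short sleep stands in for Scan32.exe, and the loop tracks its own $running list instead of querying Get-Job, so that other jobs in the session do not count against the slots.

```powershell
$maxSlots = 2

# Made-up work items standing in for the server list.
$queue = [System.Collections.Queue]::new()
1..5 | ForEach-Object { $queue.Enqueue("item$_") }

$running = @()   # jobs currently occupying a slot
$results = @()   # output collected from finished jobs
$peak    = 0     # highest number of jobs seen running at once

while ($queue.Count -gt 0 -or $running.Count -gt 0) {
    # Fill any free slots from the queue.
    while ($running.Count -lt $maxSlots -and $queue.Count -gt 0) {
        $running += Start-Job -ArgumentList $queue.Dequeue() -ScriptBlock {
            param($item)
            Start-Sleep -Milliseconds 500   # placeholder for the real work
            "scanned $item"
        }
    }
    $peak = [Math]::Max($peak, $running.Count)

    # Harvest every job that is no longer starting or running.
    $finished = @($running | Where-Object { $_.State -notin 'NotStarted', 'Running' })
    foreach ($job in $finished) {
        $results += Receive-Job -Job $job
        Remove-Job -Job $job
    }
    $running = @($running | Where-Object { $finished.Id -notcontains $_.Id })

    Start-Sleep -Milliseconds 100
}
```

After the loop, $results holds one line per work item and $peak never exceeds $maxSlots.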

What is the best way to collect and transform output from multiple PowerShell threads?

I am new to PowerShell scripting and would like to do the following:
Given a list of config names and servers, return the values for the configs from each server.
Transform them in such a way to group them by config name, and not server.
Currently, I have a script that spawns one job per server and calls a script remotely on the server to return the list of configs for that server.
However, I do not know how to aggregate and transform the output from these jobs so that instead of getting config names by server, I would like to sort them by config name first, then server.
Current output:
Server1:
Config1 = 'abc'
Config2 = 'def'
Server2:
Config1 = 'xyz'
Config2 = '123'
Desired output:
Config1:
Server1 : 'abc'
Server2 : 'xyz'
Config2:
Server1 : 'def'
Server2 : '123'
I don't want to iterate over the config names because that would waste time in connecting to the server for every call. Therefore I'd like to iterate over the servers and do some kind of transformation.
I'm wondering if this is a matter of having each job return some kind of dictionary, then iterate over them after all the threads finish to transform?
Here is the code that calls the jobs:
$all_servers = @('server1', 'server2')
$config_names = @('config1', 'config2')
foreach($servername in $all_servers) {
Start-Job -FilePath C:\scripts\get_config_from_servers.ps1 `
-ArgumentList $servername,$config_names
}
Get-Job | Wait-Job
Get-Job | Receive-Job | Out-GridView
Here is the job script:
Param($servername,$config_names)
$session = Get-Session -computername $servername
-username $$$$
-pwd ####
try {
$sb = {
param($servername,$config_names)
$output = @{}
foreach ($cfg in $config_names) {
$config_value = Get-Config -configname $cfg
$output.Add("$servername : $cfg", "($config_value)")
}
write-host $output | Out-String
return $output | Out-String
}
$out = Invoke-Command -session $session `
-ScriptBlock $sb `
-ArgumentList $servername,$config_names
write-host $out
return $out
}
finally {
Remove-PSSession $session
}
Instead of making a hash table and converting it to a string, you could create a custom object in your job script, as in this SO question.
Instead of this:
$output = @{}
foreach ($cfg in $config_names) {
$config_value = Get-Config -configname $cfg
$output.Add("$servername : $cfg", "($config_value)")
}
write-host $output | Out-String
return $output | Out-String
You could try something like this:
$output = New-Object System.Object
Add-Member -MemberType NoteProperty -Name Server -Value $servername -InputObject $output
foreach ($cfg in $config_names) {
$config_value = Get-Config -configname $cfg
Add-Member -MemberType NoteProperty -Name "Config$cfg" -Value $config_value -InputObject $output
}
write-host $output
return $output
I can't test this accurately as I'm not sure what Get-Config is, but hopefully it is enough to get you thinking.
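Once each job returns an object shaped like that, regrouping by config name is a plain loop on the caller's side. The following is a minimal sketch with hard-coded sample objects in place of real job output; the values are taken from the question's example, and the property names follow the "Config$cfg" convention used above.

```powershell
$config_names = @('config1', 'config2')

# Sample data standing in for what Receive-Job would return:
# one object per server, with one property per config.
$perServer = @(
    [pscustomobject]@{ Server = 'Server1'; Configconfig1 = 'abc'; Configconfig2 = 'def' }
    [pscustomobject]@{ Server = 'Server2'; Configconfig1 = 'xyz'; Configconfig2 = '123' }
)

# Outer loop over config names, inner loop over servers, so the output
# is grouped by config first and by server second.
$lines = foreach ($cfg in $config_names) {
    "${cfg}:"
    foreach ($item in $perServer) {
        "    {0} : '{1}'" -f $item.Server, $item."Config$cfg"
    }
}

$lines
```

The servers are still contacted only once each; the regrouping happens locally after all jobs have returned.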

Resources