In Node.js you can kick off several asynchronous functions at once and have them return their results to a callback when they're all complete: understanding-node-js-async-parallel
In PowerShell I've seen similar functionality using Start-Job, but when I actually run the tasks they lag too much to be non-blocking and started in parallel:
Write-Host "Running jobs $($start) - $($end)"
for($i = [int]$start; $i -le $end; $i++) {
$jobResults += Start-Job -ScriptBlock $someFunc | wait-job | receive-job
}
I believe this stems from Wait-Job, which waits for the job that was just kicked off to complete before its output is received.
Is there a way to wait for all of them and receive them into an array like node async does? Or am I looking at apples and oranges here?
Your snippet is actually running the jobs synchronously. To run them asynchronously, do something like this:
$jobs = {sleep 5; Write-Output "job1"}, {sleep 6; Write-Output "job2"}, {sleep 7; Write-Output "job3"}
$result = $jobs | foreach { start-job $_ } | wait-job | receive-job
Note, though, that Start-Job creates a new process for each job (probably not the best option if your jobs run for just a few seconds). If you need to run jobs in separate threads, you'll need to use runspaces. You can code that manually, but there are tools for it as well (like Invoke-Parallel).
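As a rough illustration of coding the runspace approach by hand, here is a minimal sketch that runs three script blocks on threads inside the current process and gathers their output into one array. The task bodies and the pool size of 3 are assumptions chosen for the example, not part of the original answer.

```powershell
# Three short tasks, mirroring the $jobs array in the answer above (assumed bodies).
$tasks = { Start-Sleep 1; 'job1' }, { Start-Sleep 1; 'job2' }, { Start-Sleep 1; 'job3' }

# A pool of up to 3 runspaces (threads inside the current process).
$pool = [runspacefactory]::CreateRunspacePool(1, 3)
$pool.Open()

# Start every task without waiting; keep the handle needed to collect it later.
$running = foreach ($task in $tasks) {
    $ps = [powershell]::Create()
    $ps.RunspacePool = $pool
    [void]$ps.AddScript($task.ToString())
    [pscustomobject]@{ PowerShell = $ps; Handle = $ps.BeginInvoke() }
}

# EndInvoke blocks until that task has finished, then returns its output.
$result = foreach ($r in $running) {
    $r.PowerShell.EndInvoke($r.Handle)
    $r.PowerShell.Dispose()
}

$pool.Close()
$pool.Dispose()

$result   # output is collected in the order the tasks were started
```

Because the results are collected in start order, the array layout is predictable even when the tasks finish at different times.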
I have access to a single-core single-processor VM with which to do logging for my team. I have the following code:
$sb = {
Param($_)
if($_.CONTROLLER -ne ".xx" ){
$posIP = "10." + $_.IP + $_.CONTROLLER
if (Test-Connection -ComputerName $posIP -Count 1 -Quiet) {
$mapPath = "\\" + $posIP + "\c$"
net use $mapPath $password /user:$userName | Out-Null
if(Test-Path $mapPath$dataFile) {
[xml]$periods = Get-Content $mapPath$dataFile
$endDate = $periods.IndataDbf.ingredient.PeriodDetail.PeriodEndDate | select -last 1
$output = "$($_.STORE);$endDate" }
else {
$outPut = $_.STORE + ';' + "$dataFile Not Found" }
net use $mapPath /de | Out-Null
}
else {
$outPut = $_.STORE + ';' + "Map FAILED" }
Write-Output $OutPut
}
}
Import-Csv $inFile | ForEach-Object {
while ((Get-Job -State Running).Count -ge 100) {
Start-Sleep -Seconds 5;
}
Write-Output $_.STORE
Start-Job -Scriptblock $sb -ArgumentList $_ | Write-Verbose
Get-Job -State Completed -HasMoreData 1 | Receive-Job | Out-File -Append -FilePath $outLog
}
Get-Job | Wait-Job | Receive-Job | Out-File -Append -FilePath $outLog
This runs well, but takes the same amount of time as running the same code in a plain loop without Start-Job. However, the previous logging command used batch files that automatically opened a couple dozen child command windows to process the data and return, and it ran in under half the time. The code is the same, so I don't understand why adding more threads didn't make the script run faster. Can anyone tell me why a batch-file program with a couple dozen child windows runs so much faster with arguably the same code? And why does Start-Job not improve the speed at all? I would have thought it would execute multiple threads simultaneously.
Because there is a lot of overhead when using Start-Job and whenever you use the pipeline.
If you use runspaces instead, it may be faster. Take a look at http://newsqlblog.com/2012/05/22/concurrency-in-powershell-multi-threading-with-runspaces/
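To get a feel for the overhead mentioned above, a minimal sketch like the following times one trivial piece of work run through Start-Job and then through an in-process runspace. The work item (1 + 1) is an assumption for illustration, and the numbers will vary by machine, so no expected timings are shown.

```powershell
# Trivial work item (assumed for illustration) so that startup cost dominates.
$work = { 1 + 1 }

# Background job: starts a separate PowerShell process.
$sw = [System.Diagnostics.Stopwatch]::StartNew()
$jobOut = Start-Job -ScriptBlock $work | Receive-Job -Wait -AutoRemoveJob
$jobMs = $sw.ElapsedMilliseconds

# Runspace: runs inside the current process.
$sw.Restart()
$ps = [powershell]::Create()
$rsOut = $ps.AddScript($work.ToString()).Invoke()
$ps.Dispose()
$rsMs = $sw.ElapsedMilliseconds

"Start-Job: $jobMs ms; runspace: $rsMs ms"
```

This only measures startup cost. For network-bound work such as the script in the question, the time spent waiting on each host may matter more than either figure.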
I have a small performance issue in my script, so I would like to implement some sort of worker threads, but so far I have not been able to find a solution.
What I'm hoping for is something like this:
Start a pool of worker threads. These threads take "commands" from a queue and process them.
The main script writes "commands" to the queue as it runs.
Once complete, the main script tells each thread to stop.
The main script waits for all workers to end before exiting.
Does anybody have an idea how to do this?
You can do this with Powershell workflows.
From Windows PowerShell: What is Windows PowerShell Workflow?
Workflows can also execute things in parallel, if you like. For
example, if you have a set of tasks that can run in any order, with no
interdependencies, then you can have them all run at more or less the
same time
Just do a search on "Powershell workflows" and you will find a good amount of documentation to get you started.
The basic approach to using a job is this:
$task1 = { ls c:\windows\system32 -r *.dll -ea 0 | where LastWriteTime -gt (Get-Date).AddDays(-21) }
$task2 = { ls E:\Symbols -r *.dll | where LastWriteTime -gt (Get-Date).AddDays(-21) }
$task3 = { Invoke-WebRequest -Uri http://blogs.msdn.com/b/mainfeed.aspx?Type=BlogsOnly | % Content }
$job1 = Start-Job $task1; $job2 = Start-Job $task2; $job3 = Start-Job $task3
Wait-Job $job1,$job2,$job3
$job1Data = Receive-Job $job1
$job2Data = Receive-Job $job2
$job3Data = Receive-Job $job3
If you need to have those background jobs waiting in a loop to do work as the main script dictates have a look at this SO answer to see how to use MSMQ to do this.
With some help from the pointers made by Keith Hill, I got it working. Thanks a bunch...
Here is a snippet of the code from my proof of concept:
function New-Task([int]$Index,[scriptblock]$ScriptBlock) {
$ps = [Management.Automation.PowerShell]::Create()
$res = New-Object PSObject -Property @{
Index = $Index
Powershell = $ps
StartTime = Get-Date
Busy = $true
Data = $null
async = $null
}
[Void] $ps.AddScript($ScriptBlock)
[Void] $ps.AddParameter("TaskInfo",$Res)
$res.async = $ps.BeginInvoke()
$res
}
$ScriptBlock = {
param([Object]$TaskInfo)
$TaskInfo.Busy = $false
Start-Sleep -Seconds 1
$TaskInfo.Data = "test $($TaskInfo.Data)"
}
$a = New-Task -Index 1 -ScriptBlock $ScriptBlock
$a.Data = "i was here"
Start-Sleep -Seconds 5
$a
And here is the result proving that the data was communicated into the thread and back again:
Data : test i was here
Busy : False
Powershell : System.Management.Automation.PowerShell
Index : 1
StartTime : 11/25/2013 7:37:07 AM
async : System.Management.Automation.PowerShellAsyncResult
As you can see, $a.Data now has "test" in front.
So thanks a lot...
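The queue-driven worker pool described in the question can be sketched along the same lines. This is only one possible arrangement, assuming PowerShell 5 or later (for ::new()), three workers, and integer work items. The BlockingCollection queue and the result bag are choices made for this example, not part of the code above.

```powershell
# Thread-safe queue of work items and a thread-safe bag for results (assumed design).
$queue   = [System.Collections.Concurrent.BlockingCollection[int]]::new()
$results = [System.Collections.Concurrent.ConcurrentBag[string]]::new()

$worker = {
    param($Queue, $Results, $Id)
    # GetConsumingEnumerable blocks until an item arrives and ends once
    # CompleteAdding() has been called and the queue is empty.
    foreach ($item in $Queue.GetConsumingEnumerable()) {
        $Results.Add("worker $Id processed $item")
    }
}

# Start three workers, each on its own runspace.
$workers = foreach ($id in 1..3) {
    $ps = [powershell]::Create()
    [void]$ps.AddScript($worker.ToString()).AddArgument($queue).AddArgument($results).AddArgument($id)
    [pscustomobject]@{ PowerShell = $ps; Handle = $ps.BeginInvoke() }
}

# The main script writes "commands" to the queue as it runs...
1..10 | ForEach-Object { $queue.Add($_) }

# ...then signals that no more work is coming, which lets the workers exit.
$queue.CompleteAdding()

# Wait for all workers to end before exiting.
foreach ($w in $workers) {
    $null = $w.PowerShell.EndInvoke($w.Handle)
    $w.PowerShell.Dispose()
}

$results.Count
```

Calling CompleteAdding() plays the role of "tell each thread to stop": each worker's loop ends once the queue drains.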
I have one URL for which its query changes. The queries are stored in an array so changing the URLs isn't a problem within a loop (I'm not interested in any particular query).
I'm having a hard time creating jobs for each URL and starting a group of jobs at the same time and monitoring them.
I figure to start to iterate through the array of queries 5 at a time, I'd be calling 5 new URLs so every iteration needs to have an array of jobs whose elements are the URLs for that iteration.
Is my approach right? Any pointers will be appreciated!
This is sample code to demonstrate my approach:
$queries = 1..10
$jobs = @()
foreach ($i in $queries) {
if ($jobs.Count -lt 5) {
$ScriptBlock = {
$query = $queries[$i]
$path = "http://mywebsite.com/$query"
Invoke-WebRequest -Uri $path
}
$jobs += Start-Job -ScriptBlock $ScriptBlock
} else {
$jobs | Wait-Job -Any
}
}
You will run into a couple of issues with the code above. The scriptblock is transferred to a different PowerShell.exe process to execute, so it won't have access to $queries. You will need to pass it in like so:
...
$scriptblock = {param($queries)
...
}
...
$jobs += Start-Job $scriptblock -Arg $queries
The other issue is that you never remove a completed job from $jobs, so once the expression $jobs.Count -lt 5 evaluates to false because the count has reached 5, you'll never add any more jobs. Try something like this:
$jobs | Wait-Job -Any
$jobs = $jobs | Where {$_.State -eq 'Running'}
Then you'll wind up with only the running jobs in $jobs which will allow you to start more jobs as previous jobs complete (or fail).
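Putting both fixes together, a throttled loop might look like the following minimal sketch. The limit of 5 comes from the question. The script block returns a placeholder string instead of calling Invoke-WebRequest, which is an assumption made so the example runs without network access.

```powershell
$queries = 1..10
$maxJobs = 5          # at most this many jobs run at once
$jobs    = @()

$scriptBlock = {
    param($query)
    # Placeholder for: Invoke-WebRequest -Uri "http://mywebsite.com/$query"
    "result for query $query"
}

foreach ($query in $queries) {
    # Count only the jobs that are still running.
    $running = @($jobs | Where-Object { $_.State -eq 'Running' })
    if ($running.Count -ge $maxJobs) {
        # Block until at least one running job finishes, freeing a slot.
        $null = $running | Wait-Job -Any
    }
    # Pass the current query across the process boundary as an argument.
    $jobs += Start-Job -ScriptBlock $scriptBlock -ArgumentList $query
}

# Wait for the stragglers and collect everything.
$results = $jobs | Wait-Job | Receive-Job
$jobs | Remove-Job

$results.Count
```

Keeping every job in $jobs and filtering by state, rather than discarding finished jobs, means the output of all ten jobs can still be received at the end.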
I have a powershell script to do some batch processing on a bunch of images and I'd like to do some parallel processing. Powershell seems to have some background processing options such as start-job, wait-job, etc, but the only good resource I found for doing parallel work was writing the text of a script out and running those (PowerShell Multithreading)
Ideally, I'd like something akin to parallel foreach in .net 4.
Something pretty seamless, like:
foreach-parallel -threads 4 ($file in (Get-ChildItem $dir))
{
.. Do Work
}
Maybe I'd be better off just dropping down to c#...
You can execute parallel jobs in Powershell 2 using Background Jobs. Check out Start-Job and the other job cmdlets.
# Loop through the server list
Get-Content "ServerList.txt" | %{
# Define what each job does
$ScriptBlock = {
param($pipelinePassIn)
Test-Path "\\$pipelinePassIn\c`$\Something"
Start-Sleep 60
}
# Execute the jobs in parallel
Start-Job $ScriptBlock -ArgumentList $_
}
Get-Job
# Wait for it all to complete
While (Get-Job -State "Running")
{
Start-Sleep 10
}
# Getting the information back from the jobs
Get-Job | Receive-Job
The answer from Steve Townsend is correct in theory but not in practice, as @likwid pointed out. My revised code takes into account the job-context barrier: nothing crosses that barrier by default! The automatic $_ variable can thus be used in the loop but cannot be used directly within the script block, because the script block runs in a separate context created by the job.
To pass variables from the parent context to the child context, use the -ArgumentList parameter on Start-Job to send it and use param inside the script block to receive it.
cls
# Send in two root directory names, one that exists and one that does not.
# Should then get a "True" and a "False" result out the end.
"temp", "foo" | %{
$ScriptBlock = {
# accept the loop variable across the job-context barrier
param($name)
# Show the loop variable has made it through!
Write-Host "[processing '$name' inside the job]"
# Execute a command
Test-Path "\$name"
# Just wait for a bit...
Start-Sleep 5
}
# Show the loop variable here is correct
Write-Host "processing $_..."
# pass the loop variable across the job-context barrier
Start-Job $ScriptBlock -ArgumentList $_
}
# Wait for all to complete
While (Get-Job -State "Running") { Start-Sleep 2 }
# Display output from all jobs
Get-Job | Receive-Job
# Cleanup
Remove-Job *
(I generally like to provide a reference to the PowerShell documentation as supporting evidence but, alas, my search has been fruitless. If you happen to know where context separation is documented, post a comment here to let me know!)
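As an alternative to -ArgumentList, PowerShell 3.0 and later also accept the $using: scope modifier inside a Start-Job script block to read a variable from the caller's context. A minimal sketch, where the variable name and value are assumptions for illustration:

```powershell
# Variable defined in the parent context.
$name = 'temp'

$job = Start-Job -ScriptBlock {
    # $using:name copies the caller's value across the job-context barrier.
    $n = $using:name
    "processing '$n' inside the job"
}

# Wait for the job, fetch its output, and remove it in one step.
$output = $job | Receive-Job -Wait -AutoRemoveJob
$output
```

The value is copied into the job when it starts, so later changes to $name in the parent are not seen by the job.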
There are so many answers to this these days:
1. jobs (or threadjobs in PS 6/7 or the module for PS 5)
2. start-process
3. workflows (PS 5 only)
4. powershell api with another runspace
5. invoke-command with multiple computers, which can all be localhost (have to be admin)
6. multiple session (runspace) tabs in the ISE, or remote powershell ISE tabs
Powershell 7 has a foreach-object -parallel as an alternative for #4
Using start-threadjob in powershell 5.1. I wish this worked like I expect, but it doesn't:
# test-netconnection has a miserably long timeout
echo yahoo.com facebook.com |
start-threadjob { test-netconnection $input } | receive-job -wait -auto
WARNING: Name resolution of yahoo.com microsoft.com facebook.com failed
It works this way. Not quite as nice as foreach-object -parallel in powershell 7, but it'll do.
echo yahoo.com facebook.com |
% { $_ | start-threadjob { test-netconnection $input } } |
receive-job -wait -auto | ft -a
ComputerName RemotePort RemoteAddress PingSucceeded PingReplyDetails (RTT) TcpTestSucceeded
------------ ---------- ------------- ------------- ---------------------- ----------------
facebook.com 0          31.13.71.36   True          17 ms                  False
yahoo.com    0          98.137.11.163 True          97 ms                  False
Here's workflows with literally a foreach -parallel:
workflow work {
foreach -parallel ($i in 1..3) {
sleep 5
"$i done"
}
}
work
3 done
1 done
2 done
Or a workflow with a parallel block:
function sleepfor($time) { sleep $time; "sleepfor $time done"}
workflow work {
parallel {
sleepfor 3
sleepfor 2
sleepfor 1
}
'hi'
}
work
sleepfor 1 done
sleepfor 2 done
sleepfor 3 done
hi
Here's an api with runspaces example:
$a = [PowerShell]::Create().AddScript{sleep 5;'a done'}
$b = [PowerShell]::Create().AddScript{sleep 5;'b done'}
$c = [PowerShell]::Create().AddScript{sleep 5;'c done'}
$r1,$r2,$r3 = ($a,$b,$c).begininvoke() # run in background
$a.EndInvoke($r1); $b.EndInvoke($r2); $c.EndInvoke($r3) # wait
($a,$b,$c).streams.error # check for errors
($a,$b,$c).dispose() # clean
a done
b done
c done
In Powershell 7 you can use ForEach-Object -Parallel
$Message = "Output:"
Get-ChildItem $dir | ForEach-Object -Parallel {
"$using:Message $_"
} -ThrottleLimit 4
http://gallery.technet.microsoft.com/scriptcenter/Invoke-Async-Allows-you-to-83b0c9f0
I created Invoke-Async, which allows you to run multiple script blocks, cmdlets, or functions at the same time. This is great for small jobs (a subnet scan or a WMI query against hundreds of machines) because the difference between the overhead of creating a runspace and the startup time of Start-Job is pretty drastic. It can be used like so.
With a scriptblock:
$sb = [scriptblock] {param($system) gwmi win32_operatingsystem -ComputerName $system | select csname,caption}
$servers = Get-Content servers.txt
$rtn = Invoke-Async -Set $servers -SetParam system -ScriptBlock $sb
With just a cmdlet/function:
$servers = Get-Content servers.txt
$rtn = Invoke-Async -Set $servers -SetParam computername -Params @{count=1} -Cmdlet Test-Connection -ThreadCount 50
Background jobs are expensive to set up and are not reusable. PowerShell MVP Oisin Grehan
has a good example of PowerShell multi-threading.
(As of 10/25/2010 the site is down, but it is accessible via the Web Archive.)
I've used an adapted version of Oisin's script in a data loading routine here:
http://rsdd.codeplex.com/SourceControl/changeset/view/a6cd657ea2be#Invoke-RSDDThreaded.ps1
To complete previous answers, you can also use Wait-Job to wait for all jobs to complete:
For ($i=1; $i -le 3; $i++) {
$ScriptBlock = {
Param (
[string] [Parameter(Mandatory=$true)] $increment
)
Write-Host $increment
}
Start-Job $ScriptBlock -ArgumentList $i
}
Get-Job | Wait-Job | Receive-Job
If you're using the latest cross-platform PowerShell (which you should, by the way), https://github.com/powershell/powershell#get-powershell, you can add a single & between commands to run them in parallel. (Use ; to run them sequentially.)
In my case I needed to run 2 npm scripts in parallel: npm run hotReload & npm run dev
You can also set up npm to use PowerShell for its scripts (by default it uses cmd on Windows).
Run from project root folder: npm config set script-shell pwsh --userconfig ./.npmrc
and then use single npm script command: npm run start
"start":"npm run hotReload & npm run dev"
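To see what the trailing & does, here is a minimal sketch, assuming PowerShell 6 or later (the operator is not available in Windows PowerShell 5.1). The two Write-Output commands are stand-ins for the npm scripts above. Each command followed by & is started as a background job, and the job object can be captured and received like any other job.

```powershell
# Requires PowerShell 6+; a trailing & starts the command as a background job.
$first  = Write-Output 'first' &
$second = Write-Output 'second' &

# Both jobs are now running; wait for them and collect their output.
$out = $first, $second | Receive-Job -Wait -AutoRemoveJob
$out
```

In a line such as npm run hotReload & npm run dev, the first command therefore runs as a job while the second runs in the foreground.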
This has been answered thoroughly. I just want to post this method, which I created based on PowerShell jobs, as a reference.
Jobs are passed in as a list of script blocks. They can be parameterized.
The output of the jobs is color-coded and prefixed with a job index (just like in a VS build process, as this will be used in a build).
It can be used to start up multiple servers at a time, to run build steps in parallel, and so on.
function Start-Parallel {
param(
[ScriptBlock[]]
[Parameter(Position = 0)]
$ScriptBlock,
[Object[]]
[Alias("arguments")]
$parameters
)
$jobs = $ScriptBlock | ForEach-Object { Start-Job -ScriptBlock $_ -ArgumentList $parameters }
$colors = "Blue", "Red", "Cyan", "Green", "Magenta"
$colorCount = $colors.Length
try {
while (($jobs | Where-Object { $_.State -ieq "running" } | Measure-Object).Count -gt 0) {
$jobs | ForEach-Object { $i = 1 } {
$fgColor = $colors[($i - 1) % $colorCount]
$out = $_ | Receive-Job
$out = $out -split [System.Environment]::NewLine
$out | ForEach-Object {
Write-Host "$i> " -NoNewline -ForegroundColor $fgColor
Write-Host $_
}
$i++
}
}
} finally {
Write-Host "Stopping Parallel Jobs ..." -NoNewline
$jobs | Stop-Job
$jobs | Remove-Job -Force
Write-Host " done."
}
}
There is a new built-in solution in PowerShell 7.0 Preview 3.
PowerShell ForEach-Object Parallel Feature
So you could do:
Get-ChildItem $dir | ForEach-Object -Parallel {
.. Do Work
$_ # this will be your file
} -ThrottleLimit 4