starting a job with dot source - multithreading

quick story:
for the sake of experimentation with powershell I'm trying to learn how to effectively multithread a script.
now i know how to start jobs and pass variables to my second script, however i have decided to try and figure out how to turn this:
start-job ((Split-Path -parent $PSCommandPath) + "\someScript.ps1") -ArgumentList (,$argList)
into this:
start-job (. ((Split-Path -parent $PSCommandPath) + "\someScript.ps1")) -ArgumentList (,$argList)
reason for this is i have a variable declared in the parent script like this:
New-Variable var -value 0 -Option AllScope
and in the child script: var = "something"
the first start-job passes my argument but the child doesn't set the global 'var' variable
the second doesn't pass my argument but the child script sets the global variable defined in the the parent just fine. $argList variable will be populated right up to this line of code in the second start-job but right after execution of the line, debug reveals the $argList variable to be null and i get "Start-Job : Cannot bind argument to parameter 'ScriptBlock' because it is null."
for the sake of argument assume that right up to the stated lines of code the variables contain the data they should.
can someone help me out with what is wrong with both attempts.
Google has failed to give me any specifics answers to my problem.
thanks in advance for any help i can get.
EDIT:
using Start-Job (. ((Split-Path -parent $PSCommandPath) + "\someScript.ps1") $argList)
accomplishes my goals however i keep getting Start-Job : Cannot bind argument to parameter 'ScriptBlock' because it is null.
even though the arguments are in the script block and the child script is getting and processing the argument.

When you call Start-Job, the script runs in an entirely separate scope (PowerShell Runspace). You can't dot-source a script called directly through Start-Job. You'll have to have the external script process the parameter that's passed in via -ArgumentList, and then return it to the original host Runspace via Receive-Job.
Here's a complete example:
$a = 1;
$Job = Start-Job -FilePath C:\test\script.ps1 -ArgumentList $a;
Write-Host -Object "Before: $a"; # Before
Wait-Job -Job $Job;
$a = Receive-Job -Job $Job -Keep;
Write-Host -Object "After: $a"; # After
c:\test\script.ps1
Here's the contents of the file c:\test\script.ps1:
Write-Output -InputObject (([int]$args[0]) += 5);
Thread and Runspace Exploration
If you want to prove my earlier point about Start-Job creating a new thread and PowerShell Runspace, and a new Thread, then run this script:
# 1. Declare a thread block that retrieves the Runspace ID & ThreadID
$ThreadBlock = {
[runspace]::DefaultRunspace.InstanceId.ToString();
[System.Threading.Thread]::CurrentThread.ManagedThreadId;
};
# 2. Start a job and wait for it to finish
$Job = Start-Job -ScriptBlock $ThreadBlock;
[void](Wait-Job -Job $Job);
Receive-Job -Job $Job -Keep;
# 3. Call the same ScriptBlock locally
& $ThreadBlock;
# 4. Note the differences in the Runspace InstanceIDs and ThreadIDs
Receiving Results Before Job Finish
You can call Receive-Job multiple times before a PowerShell Job has completed, to retrieve results. Here's an example of how that could theoretically work:
$ScriptBlock = {
1..5 | % { Start-Sleep -Seconds 2; Write-Output -InputObject $_; };
};
$Job = Start-Job -ScriptBlock $ScriptBlock;
while ($Job.JobStateInfo.State -notin ([System.Management.Automation.JobState]::Completed,[System.Management.Automation.JobState]::Failed)) {
Start-Sleep -Seconds 3;
$Results = Receive-Job -Job $Job;
Write-Host -Object ('Received {0} items: {1}' -f $Results.Count, ($Results -join ' '));
}

Related

$MyInvocation in a runspace Thread

A script (foo.ps1) creates a thread, and that thread creates more threads. Foo's thread is my control thread, and it creates one or more worker threads. The worker threads run a script block. The script block calls functions from a library-script. The library script has a configuration file.
The script block loads the library-script by dot-sourcing it.
$block = {
Param($library_script)
. $library_script
...stuff...
}
When the script loads, the first thing it does is find its configuration file, which is in the script's directory. The code for that looks like...
## Global variables and enumerations
$script:self_location = $script:MyInvocation.MyCommand.Path
$script:configuration_file_location = "{0}.config" -f $script:self_location
My problem is $MyInvocation doesn't appear to exist. As result, the library script can't find it's configuration file.
I'm running Powershell 5.1 on Windows 10. The control thread was made in a runspace. The worker threads are made in a runspace pool.
Does anyone know the rules around the automatic $MyInvocation variable in runspace threads?
Create a file foo.ps1 and add the following to it:
Write-Output '[1] Executed in-scope'
$MyInvocation.MyCommand.Path
Write-Output '[2] Executed in-thread'
$p1 = [PowerShell]::Create()
$p1.AddScript({ $MyInvocation.MyCommand.Path }) | Out-Null
$p1.Invoke()
$p1.Dispose()
Write-Output '[3] Executed in-thread in-thread'
$t = {
$p = [PowerShell]::Create()
$p.AddScript({ $MyInvocation.MyCommand.Path }) | Out-Null
$p.Invoke()
}
$p2 = [PowerShell]::Create()
$p2.AddScript( $t ) | Out-Null
$p2.Invoke()
$p2.Dispose()
Run it. You should see something like the following...
[1] Executed in-scope
C:\Users\deezNuts\development\comcast\sandbox\thing.ps1
[2] Executed in-thread
[3] Executed in-thread in-thread
And, I think I just answered my own question.
I see variable in thread.
$rsp = [runspacefactory]::CreateRunspacePool(1, 2, $iss, $Host)
$rsp.ApartmentState = "STA"
$rsp.ThreadOptions = "ReuseThread"
$rsp.Open()
$p = [PowerShell]::Create()
$p.RunspacePool = $rsp
$p.AddScript({ write-host $MyInvocation.MyCommand.Path })
$h = $p.BeginInvoke()
$p.EndInvoke($h)
$p.Dispose()
$rsp.Dispose()
What are you doing differently?
Im not sure that $MyInvocation variable is supported in a background job
Start-Job -Name Test -ScriptBlock {Get-Variable}
Receive-Job Test
Can you pass the path as a parameter ?

Powershell Throttle Multi thread jobs via job completion

All the tuts I have found use a pre defined sleep time to throttle jobs.
I need the throttle to wait until a job is completed before starting a new one.
Only 4 jobs can be running at one time.
So The script will run up 4 and currently pauses for 10 seconds then runs up the rest.
What I want is for the script to only allow 4 jobs to be running at one time and as a job is completed a new one is kicked off.
Jobs are initialised via a list of servers names.
Is it possible to archive this?
$servers = Get-Content "C:\temp\flashfilestore\serverlist.txt"
$scriptBlock = { #DO STUFF }
$MaxThreads = 4
foreach($server in $servers) {
Start-Job -ScriptBlock $scriptBlock -argumentlist $server
While($(Get-Job -State 'Running').Count -ge $MaxThreads) {
sleep 10 #Need this to wait until a job is complete and kick off a new one.
}
}
Get-Job | Wait-Job | Receive-Job
You can test the following :
$servers = Get-Content "C:\temp\flashfilestore\serverlist.txt"
$scriptBlock = { #DO STUFF }
invoke-command -computerName $servers -scriptblock $scriptBlock -jobname 'YourJobSpecificName' -throttlelimit 4 -AsJob
This command uses the Invoke-Command cmdlet and its AsJob parameter to start a background job that runs a scriptblock on numerous computers. Because the command must not be run more than 4 times concurrently, the command uses the ThrottleLimit parameter of Invoke-Command to limit the number of concurrent commands to 4.
Be careful that the file contains the computer names in a domain.
In order to avoid inventing a wheel I would recommend to use one of the
existing tools.
One of them is the script
Invoke-Parallel.ps1.
It is written in PowerShell, you can see how it is implemented directly. It is
easy to get and it does not require any installation for using it.
Another one is the module SplitPipeline.
It may work faster because it is written in C#. It also covers some more use
cases, for example slow or infinite input, use of initialization and cleanup scripts.
In the latter case the code with 4 parallel pipelines will be
$servers | Split-Pipeline -Count 4 {process{ <# DO STUFF on $_ #> }}
I wrote a blog article which covers multithreading any given script via actual threads. You can find the full post here:
http://www.get-blog.com/?p=189
The basic setup is:
$ISS = [system.management.automation.runspaces.initialsessionstate]::CreateDefault()
$RunspacePool = [runspacefactory]::CreateRunspacePool(1, $MaxThreads, $ISS, $Host)
$RunspacePool.Open()
$Code = [ScriptBlock]::Create($(Get-Content $FileName))
$PowershellThread = [powershell]::Create().AddScript($Code)
$PowershellThread.RunspacePool = $RunspacePool
$Handle = $PowershellThread.BeginInvoke()
$Job = "" | Select-Object Handle, Thread, object
$Job.Handle = $Handle
$Job.Thread = $PowershellThread
$Job.Object = $Object.ToString()
$Job.Thread.EndInvoke($Job.Handle)
$Job.Thread.Dispose()
Instead of sleep 10 you could also just wait on a job (-any job):
Get-Job | Wait-Job -Any | Out-Null
When there are no more jobs to kick off, start printing the output. You can also do this within the loop immediately after the above command. The script will receive jobs as they finish instead of waiting until the end.
Get-Job -State Completed | % {
Receive-Job $_ -AutoRemoveJob -Wait
}
So your script would look like this:
$servers = Get-Content "C:\temp\flashfilestore\serverlist.txt"
$scriptBlock = { #DO STUFF }
$MaxThreads = 4
foreach ($server in $servers) {
Start-Job -ScriptBlock $scriptBlock -argumentlist $server
While($(Get-Job -State Running).Count -ge $MaxThreads) {
Get-Job | Wait-Job -Any | Out-Null
}
Get-Job -State Completed | % {
Receive-Job $_ -AutoRemoveJob -Wait
}
}
While ($(Get-Job -State Running).Count -gt 0) {
Get-Job | Wait-Job -Any | Out-Null
}
Get-Job -State Completed | % {
Receive-Job $_ -AutoRemoveJob -Wait
}
Having said all that, I prefer runspaces (similar to Ryans post) or even workflows if you can use them. These are far less resource intensive than starting multiple powershell processes.
Your script looks good, try and add something like
Write-Host ("current count:" + ($(Get-Job -State 'Running').Count) + " on server:" + $server)
after your while loop to work out whether the job count is going down where you wouldn't expect it.
I noticed that every Start-Job command resulted in an additional conhost.exe process in the task manager. Knowing this, I was able to throttle using the following logic, where 5 is my desired number of concurrent threads (so I use 4 in my -gt statement since I am looking for a count greater than):
while((Get-Process conhost -ErrorAction SilentlyContinue).Count -gt 4){Start-Sleep -Seconds 1}

Launch PowerShell scriptblock asynchronously without dependency on parent process

Below is a small PowerShell script that runs some PowerShell code asynchronously to show a dialog box (for the purpose of demonstrating my issue).
If I close the parent PowerShell process, the child process will also close and the dialog box disappears. Is there any way to launch a PowerShell scriptblock, complete with functions and arguments, asynchronously and without a dependence on the parent PowerShell process?
$testjob = [PowerShell]::Create().AddScript({ $a = new-object -comobject wscript.shell
$b = $a.popup('This is a test',5,'Test Message Box',1) })
$result = $testJob.BeginInvoke()
Update #2
I am trying to execute a script block, rather than an external script. The script block should use functions and variables from the parent script. The problem is, I can't pass those functions or variables in to the new process unless they are contained directly within the script block. Any idea if this is doable?
Function Show-Prompt {
Param ($title,$message)
$a = new-object -comobject wscript.shell
$b = $a.popup($message,5,$title,1)
}
$scriptContent = {Show-Prompt -Message 'This is a test' -Title 'Test Message Box'}
$scriptBlock = [scriptblock]::Create($scriptContent)
Start-process powershell -argumentlist "-noexit -command &{$scriptBlock}"
You can use an intermediate PowerShell process. There has to be a direct parent to process the return value from the async script. For instance, put your script in a file called popup.ps1 and try execute like so:
Start-Process PowerShell -Arg c:\popup.ps1
You might want to bump the timeout up a bit from say 5 to 10 seconds. You can close the original PowerShell and the popup will stay. When you close the popup (or it times out) the secondary PowerShell window will disappear.
You can do this with WMI. If you use Win32_Process to create the process, it will persist after you close PowerShell.
For instance:
invoke-wmimethod -path win32_process -name create -argumentlist calc
function GeneratePortableFunction {
param ([string] $name, [scriptblock] $sb)
$block = [ScriptBlock]::Create("return `${function:$name};");
$script = $block.Invoke();
$block = [ScriptBlock]::Create($script);
return ("function {0} {{ {1} }}" -f $name, $block.ToString());
}
function RemoteScript {
param ([string] $header, [string[]] $functions, [string] $footer)
$sb = New-Object System.Text.StringBuilder;
[void]$sb.Append("$header`n");
$functions | % {
[void]$sb.Append($_);
[void]$sb.Append("`n");
}
[void]$sb.Append($footer);
return [ScriptBlock]::Create($sb.ToString());
}
$fnc = #();
$fnc += GeneratePortableFunction -name "NameOfYourFunction1";
$fnc += GeneratePortableFunction -name "NameOfYourFunction2(CallsNameOfYourFunction1)";
$script = RemoteScript -header "param([int] `$param1)" -functions #($fnc) -footer "NameOfYourFunction2 -YourParameter `$param1;";
$p1 = 0;
$job = Start-Job -ScriptBlock $script -ArgumentList #($p1);
while($job.State -eq "Running")
{
write-host "Running...";
Start-Sleep 5;
}
$result = $job | Receive-Job;
Remove-Job -Id $job.Id;

Calling multiple URLs at a time using multithreading in powershell

I have one URL for which its query changes. The queries are stored in an array so changing the URLs isn't a problem within a loop (I'm not interested in any particular query).
I'm having a hard time creating jobs for each URL and starting a group of jobs at the same time and monitoring them.
I figure to start to iterate through the array of queries 5 at a time, I'd be calling 5 new URLs so every iteration needs to have an array of jobs whose elements are the URLs for that iteration.
Is my approach right? Any pointers will be appreciated!
This is sample code to demonstrate my approach:
$queries = 1..10
$jobs = #()
foreach ($i in $queries) {
if ($jobs.Count -lt 5) {
$ScriptBlock = {
$query = $queries[$i]
$path = "http://mywebsite.com/$query"
Invoke-WebRequest -Uri $path
}
$jobs += Start-Job -ScriptBlock $ScriptBlock
} else {
$jobs | Wait-Job -Any
}
}
You will run into a couple of issues with the code above. The scriptblock gets transferred to a different PowerShell.exe process to execute so it won't have acess to $queries. You will to pass that it like so:
...
$scriptblock = {param($queries)
...
}
...
$jobs += Start-Job $scriptblock -Arg $queries
The other issue is that you never remove a completed job from $job so once this $jobs.Count -lt 5 expression evals to false because the count has reached 5, you'll never add anymore jobs. Try something like this:
$jobs | Wait-Job -Any
$jobs = $jobs | Where ($_.State -eq 'Running'}
Then you'll wind up with only the running jobs in $jobs which will allow you to start more jobs as previous jobs complete (or fail).

Can Powershell Run Commands in Parallel?

I have a powershell script to do some batch processing on a bunch of images and I'd like to do some parallel processing. Powershell seems to have some background processing options such as start-job, wait-job, etc, but the only good resource I found for doing parallel work was writing the text of a script out and running those (PowerShell Multithreading)
Ideally, I'd like something akin to parallel foreach in .net 4.
Something pretty seemless like:
foreach-parallel -threads 4 ($file in (Get-ChildItem $dir))
{
.. Do Work
}
Maybe I'd be better off just dropping down to c#...
You can execute parallel jobs in Powershell 2 using Background Jobs. Check out Start-Job and the other job cmdlets.
# Loop through the server list
Get-Content "ServerList.txt" | %{
# Define what each job does
$ScriptBlock = {
param($pipelinePassIn)
Test-Path "\\$pipelinePassIn\c`$\Something"
Start-Sleep 60
}
# Execute the jobs in parallel
Start-Job $ScriptBlock -ArgumentList $_
}
Get-Job
# Wait for it all to complete
While (Get-Job -State "Running")
{
Start-Sleep 10
}
# Getting the information back from the jobs
Get-Job | Receive-Job
The answer from Steve Townsend is correct in theory but not in practice as #likwid pointed out. My revised code takes into account the job-context barrier--nothing crosses that barrier by default! The automatic $_ variable can thus be used in the loop but cannot be used directly within the script block because it is inside a separate context created by the job.
To pass variables from the parent context to the child context, use the -ArgumentList parameter on Start-Job to send it and use param inside the script block to receive it.
cls
# Send in two root directory names, one that exists and one that does not.
# Should then get a "True" and a "False" result out the end.
"temp", "foo" | %{
$ScriptBlock = {
# accept the loop variable across the job-context barrier
param($name)
# Show the loop variable has made it through!
Write-Host "[processing '$name' inside the job]"
# Execute a command
Test-Path "\$name"
# Just wait for a bit...
Start-Sleep 5
}
# Show the loop variable here is correct
Write-Host "processing $_..."
# pass the loop variable across the job-context barrier
Start-Job $ScriptBlock -ArgumentList $_
}
# Wait for all to complete
While (Get-Job -State "Running") { Start-Sleep 2 }
# Display output from all jobs
Get-Job | Receive-Job
# Cleanup
Remove-Job *
(I generally like to provide a reference to the PowerShell documentation as supporting evidence but, alas, my search has been fruitless. If you happen to know where context separation is documented, post a comment here to let me know!)
There's so many answers to this these days:
jobs (or threadjobs in PS 6/7 or the module for PS 5)
start-process
workflows (PS 5 only)
powershell api with another runspace
invoke-command with multiple computers, which can all be localhost (have to be admin)
multiple session (runspace) tabs in the ISE, or remote powershell ISE tabs
Powershell 7 has a foreach-object -parallel as an alternative for #4
Using start-threadjob in powershell 5.1. I wish this worked like I expect, but it doesn't:
# test-netconnection has a miserably long timeout
echo yahoo.com facebook.com |
start-threadjob { test-netconnection $input } | receive-job -wait -auto
WARNING: Name resolution of yahoo.com microsoft.com facebook.com failed
It works this way. Not quite as nice and foreach-object -parallel in powershell 7 but it'll do.
echo yahoo.com facebook.com |
% { $_ | start-threadjob { test-netconnection $input } } |
receive-job -wait -auto | ft -a
ComputerName RemotePort RemoteAddress PingSucceeded PingReplyDetails (RTT) TcpTestS
ucceeded
------------ ---------- ------------- ------------- ---------------------- --------
facebook.com 0 31.13.71.36 True 17 ms False
yahoo.com 0 98.137.11.163 True 97 ms False
Here's workflows with literally a foreach -parallel:
workflow work {
foreach -parallel ($i in 1..3) {
sleep 5
"$i done"
}
}
work
3 done
1 done
2 done
Or a workflow with a parallel block:
function sleepfor($time) { sleep $time; "sleepfor $time done"}
workflow work {
parallel {
sleepfor 3
sleepfor 2
sleepfor 1
}
'hi'
}
work
sleepfor 1 done
sleepfor 2 done
sleepfor 3 done
hi
Here's an api with runspaces example:
$a = [PowerShell]::Create().AddScript{sleep 5;'a done'}
$b = [PowerShell]::Create().AddScript{sleep 5;'b done'}
$c = [PowerShell]::Create().AddScript{sleep 5;'c done'}
$r1,$r2,$r3 = ($a,$b,$c).begininvoke() # run in background
$a.EndInvoke($r1); $b.EndInvoke($r2); $c.EndInvoke($r3) # wait
($a,$b,$c).streams.error # check for errors
($a,$b,$c).dispose() # clean
a done
b done
c done
In Powershell 7 you can use ForEach-Object -Parallel
$Message = "Output:"
Get-ChildItem $dir | ForEach-Object -Parallel {
"$using:Message $_"
} -ThrottleLimit 4
http://gallery.technet.microsoft.com/scriptcenter/Invoke-Async-Allows-you-to-83b0c9f0
i created an invoke-async which allows you do run multiple script blocks/cmdlets/functions at the same time. this is great for small jobs (subnet scan or wmi query against 100's of machines) because the overhead for creating a runspace vs the startup time of start-job is pretty drastic. It can be used like so.
with scriptblock,
$sb = [scriptblock] {param($system) gwmi win32_operatingsystem -ComputerName $system | select csname,caption}
$servers = Get-Content servers.txt
$rtn = Invoke-Async -Set $server -SetParam system -ScriptBlock $sb
just cmdlet/function
$servers = Get-Content servers.txt
$rtn = Invoke-Async -Set $servers -SetParam computername -Params #{count=1} -Cmdlet Test-Connection -ThreadCount 50
Backgrounds jobs are expensive to setup and are not reusable. PowerShell MVP Oisin Grehan
has a good example of PowerShell multi-threading.
(10/25/2010 site is down, but accessible via the Web Archive).
I'e used adapted Oisin script for use in a data loading routine here:
http://rsdd.codeplex.com/SourceControl/changeset/view/a6cd657ea2be#Invoke-RSDDThreaded.ps1
To complete previous answers, you can also use Wait-Job to wait for all jobs to complete:
For ($i=1; $i -le 3; $i++) {
$ScriptBlock = {
Param (
[string] [Parameter(Mandatory=$true)] $increment
)
Write-Host $increment
}
Start-Job $ScriptBlock -ArgumentList $i
}
Get-Job | Wait-Job | Receive-Job
If you're using latest cross platform powershell (which you should btw) https://github.com/powershell/powershell#get-powershell, you can add single & to run parallel scripts. (Use ; to run sequentially)
In my case I needed to run 2 npm scripts in parallel: npm run hotReload & npm run dev
You can also setup npm to use powershell for its scripts (by default it uses cmd on windows).
Run from project root folder: npm config set script-shell pwsh --userconfig ./.npmrc
and then use single npm script command: npm run start
"start":"npm run hotReload & npm run dev"
This has been answered thoroughly. Just want to post this method i have created based on Powershell-Jobs as a reference.
Jobs are passed on as a list of script-blocks. They can be parameterized.
Output of the jobs is color-coded and prefixed with a job-index (just like in a vs-build-process, as this will be used in a build)
Can be used to startup multiple servers at a time or running build steps in parallel or so..
function Start-Parallel {
param(
[ScriptBlock[]]
[Parameter(Position = 0)]
$ScriptBlock,
[Object[]]
[Alias("arguments")]
$parameters
)
$jobs = $ScriptBlock | ForEach-Object { Start-Job -ScriptBlock $_ -ArgumentList $parameters }
$colors = "Blue", "Red", "Cyan", "Green", "Magenta"
$colorCount = $colors.Length
try {
while (($jobs | Where-Object { $_.State -ieq "running" } | Measure-Object).Count -gt 0) {
$jobs | ForEach-Object { $i = 1 } {
$fgColor = $colors[($i - 1) % $colorCount]
$out = $_ | Receive-Job
$out = $out -split [System.Environment]::NewLine
$out | ForEach-Object {
Write-Host "$i> "-NoNewline -ForegroundColor $fgColor
Write-Host $_
}
$i++
}
}
} finally {
Write-Host "Stopping Parallel Jobs ..." -NoNewline
$jobs | Stop-Job
$jobs | Remove-Job -Force
Write-Host " done."
}
}
sample output:
There is a new built-in solution in PowerShell 7.0 Preview 3.
PowerShell ForEach-Object Parallel Feature
So you could do:
Get-ChildItem $dir | ForEach-Object -Parallel {
.. Do Work
$_ # this will be your file
}-ThrottleLimit 4

Resources