Powershell: Recursive Functions and Multiple Parameters in Start-Job - multithreading

I have been banging my head against a wall for a couple of days now trying to get Start-Job and background jobs working to no avail.
I have tried ScriptBlock and ScriptFile but neither seems to do what I want, or I can't seem to get the syntax right.
I have a number of recursive functions and need to split up the script to work in parallel across many chunks of a larger data set.
No matter how I arrange the Start-Job call, nothing seems to work, and the recursive functions seem to be making everything twice as hard.
Can anyone give me a working example of Start-Job calling a recursive function and having multiple parameters, or point me somewhere where one exists?
Any help appreciated

This works for me:
$sb = {
    param($path, $currentDepth, $maxDepth)
    # The recursive function is defined inside the scriptblock so the job process can see it
    function EnumFiles($dir, $currentDepth, $maxDepth) {
        if ($currentDepth -gt $maxDepth) { return }
        Get-ChildItem $dir -File
        Get-ChildItem $dir -Directory | ForEach-Object { EnumFiles $_.FullName ($currentDepth + 1) $maxDepth }
    }
    EnumFiles $path $currentDepth $maxDepth
}
$job = Start-Job -ScriptBlock $sb -ArgumentList $pwd,0,2
Wait-Job $job | Out-Null
Receive-Job $job
Keep in mind your functions have to be defined in the scriptblock because the script runs in a completely separate PowerShell process. Same goes for any externally defined variables - they have to be passed into Start-Job via the -ArgumentList parameter. The values will be serialized, passed to the PowerShell process executing the job, where they will then be provided to the scriptblock.
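To illustrate the same pattern applied to several chunks at once, here is a minimal sketch. The function name Get-Power and the chunk values are made up for illustration; they are not from the question. Each job gets its own copy of the scriptblock, including the recursive function, plus two arguments passed through -ArgumentList.

```powershell
# Scriptblock carrying its own recursive function (hypothetical example function)
$sb = {
    param($number, $exponent)
    function Get-Power($number, $exponent) {
        if ($exponent -le 0) { return 1 }
        return $number * (Get-Power $number ($exponent - 1))
    }
    Get-Power $number $exponent
}

# Start one background job per chunk; each job receives two arguments
$jobs = foreach ($exponent in 8, 10) {
    Start-Job -ScriptBlock $sb -ArgumentList 2, $exponent
}

# Wait for all jobs, collect their output, then clean up
$results = $jobs | Wait-Job | Receive-Job
$jobs | Remove-Job
$results
```

The order of the collected results is not something to rely on here, so tag each result with its input if the pairing matters.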

Related

Management Api stops reporting data after almost 4 hours of execution

I was working to generate DLP logs with the help of the script present here
$body = @{grant_type="client_credentials";resource=$APIResource;client_id=$AppClientID;client_secret=$ClientSecretValue}
Write-Host -ForegroundColor Blue -BackgroundColor white "Obtaining authentication token..." -NoNewline
try {
    $oauth = Invoke-RestMethod -Method Post -Uri "$loginURL/$tenantdomain/oauth2/token?api-version=1.0" -Body $body -ErrorAction Stop
    $OfficeToken = @{'Authorization'="$($oauth.token_type) $($oauth.access_token)"}
    Write-Host -ForegroundColor Green "Authentication token obtained"
} catch {
    Write-Host -ForegroundColor Red "FAILED"
    Write-Host -ForegroundColor Red "Invoke-RestMethod failed."
    Write-Host -ForegroundColor Red $error[0]
    exit
}
This part of the code helps to generate the access token.
The script seems to work fine until about 700 MB of data is exported. For me, it takes around 3-4 hours to run and get this 700 MB worth of data.
Now, I believe the only reason this is happening is that the access token is expiring. And since I am new to PowerShell, I haven't been able to write the part of the code that would generate a new token every 3-4 hours.
Any documentation that might help me would be of great help since I have not been able to find anything of substance on the internet regarding my request.
If it's not about the access token, what else could it be about?
Please help!
thanks in advance!
$StopWatch = [System.Diagnostics.Stopwatch]::StartNew()
if ($StopWatch.Elapsed.TotalSeconds -ge 3599) { $OfficeToken = Get-Token; $StopWatch.Restart() }
I tried using this particular piece of code to generate a new token every hour, but that doesn't seem to help either. I was expecting it to work flawlessly after this, but alas, that isn't the case.
The script in question does not produce information for large sources of data, as confirmed by the author of the script.
I ran the script for other days and it worked fine.
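One thing worth checking with the stopwatch approach is where the check runs: if it executes once, right after the stopwatch starts, the elapsed time is always near zero and the token is never renewed. The sketch below shows the check placed inside the work loop. It is only an illustration under assumptions: Get-Token here is a stub standing in for the real token request, the loop over 1..5 stands in for the script's download loop, and the 3000-second threshold assumes a token lifetime of about one hour, as the 3599-second check suggests.

```powershell
# Stub standing in for the real token request (assumption, not the script's code)
function Get-Token {
    @{ Authorization = "Bearer token-issued-$([guid]::NewGuid())" }
}

$refreshAfterSeconds = 3000   # assumed safety margin below a one-hour lifetime
$OfficeToken = Get-Token
$StopWatch = [System.Diagnostics.Stopwatch]::StartNew()

# Placeholder work items; the real script would loop over its content requests
foreach ($item in 1..5) {
    # Check the timer on every iteration, not once before the loop
    if ($StopWatch.Elapsed.TotalSeconds -ge $refreshAfterSeconds) {
        $OfficeToken = Get-Token
        $StopWatch.Restart()
    }
    # The real request would go here and pass -Headers $OfficeToken
}
```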

IsCompleted of AsynchronousObject returned by BeginInvoke() all true

I'm currently learning about runspaces in PowerShell (my end goal is to set up a job scheduling system). To do this, I wrote a basic script in order to learn and use runspaces.
What I Expected To Happen:
I expected that when I run the code up to the commented line, it would queue up the 8 jobs and run them within the RunspacePool, running a maximum of 2 at a time.
Running the single line $JobList.AsynchronousObject a few times, I should then see more and more IsCompleted flags turning from false to true as the jobs complete, since they take 20 seconds each due to the Start-Sleep command.
The BeginInvoke command apparently returns an object implementing the IAsyncResult interface.
https://learn.microsoft.com/en-us/dotnet/api/system.iasyncresult?redirectedfrom=MSDN&view=netframework-4.8#examples
The IAsyncResult remarks mention polling the IsCompleted property to see if an asynchronous operation is completed, which, although not ideal, is what I was trying to do below for learning purposes.
Actual:
All the IsCompleted flags are true a second after running the top portion of code, which is not what I expected.
Question:
Does the IsCompleted flag represent just whether the script has started executing, and maybe that is why they're all true a second after queuing up?
I'm grateful for any assistance or references to further reading anyone is able to provide.
Many Thanks
Nick
#Set up runspace
$RunspacePool = [runspacefactory]::CreateRunspacePool()
$RunspacePool.SetMinRunspaces(1)
$RunspacePool.SetMaxRunspaces(2)
#Create arraylist to hold references to all the instances running jobs
$JobList = New-Object System.Collections.ArrayList
#Queue up 8 jobs that will take 20 seconds each to complete
#Add the job details to the list so I can poll its IsCompleted property
$RunspacePool.Open()
1..8 | ForEach {
    Write-Verbose "Counter: $_" -Verbose
    $PowershellInstance = [powershell]::Create()
    $PowershellInstance.RunspacePool = $RunspacePool
    [void]$PowershellInstance.AddScript({
        Start-Sleep -Seconds 20
        $ThreadID = [appdomain]::GetCurrentThreadId()
        Write-Verbose "$ThreadID thread completed" -Verbose
    })
    $AsynchronousObject = $PowershellInstance.BeginInvoke()
    $JobList.Add(([PSCustomObject]@{
        Id = $_
        PowerShellInstance = $PowershellInstance
        AsynchronousObject = $AsynchronousObject
    }))
}
#----------------------------------------------
#List IsCompleted should show true as jobs become complete
$JobList.AsynchronousObject
#Clean up
$RunspacePool.Close()
$RunspacePool.Dispose()
There is no issue. You are forgetting what asynchronous really means.
When you launch asynchronous jobs, they don't block the current thread (i.e. your current PowerShell prompt). Instead, they create a new thread and run from there. The whole point of asynchronous jobs is that you can run multiple things at once.
So what happens is that the runspace pool is created, everything gets set up, the jobs are queued and start to run in new threads, and then the script keeps going (everything is async and running in separate threads). It then goes right on to execute the last lines:
#List IsComplete should show true as jobs become complete
$JobList.AsynchronousObject
#Clean up
$RunspacePool.Close()
$RunspacePool.Dispose()
Which kills the Runspace and Disposes of it, thereby "completing" the jobs.
If you run everything up to the commented line first, and then start watching $JobList.AsynchronousObject from the PowerShell prompt, you will see it stepping through the jobs as expected.
Once they are complete, you can execute the final two lines to close and dispose of your runspace pool.
You will have to look at the Job Wait functions if you want to have things wait for you.
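As a rough sketch of that ordering (queue, poll, collect, and only then close the pool), the following uses a one-second sleep instead of twenty and four jobs instead of eight to keep it short. The property names Handle and PowerShell on the tracking objects are chosen for this example only.

```powershell
$pool = [runspacefactory]::CreateRunspacePool(1, 2)   # at most 2 runspaces at a time
$pool.Open()

# Queue four short jobs and keep the handle returned by BeginInvoke()
$jobs = 1..4 | ForEach-Object {
    $ps = [powershell]::Create()
    $ps.RunspacePool = $pool
    [void]$ps.AddScript({ param($Id) Start-Sleep -Seconds 1; "job $Id done" }).AddArgument($_)
    [PSCustomObject]@{ Id = $_; PowerShell = $ps; Handle = $ps.BeginInvoke() }
}

# Poll IsCompleted until no job reports false
while ($jobs.Handle.IsCompleted -contains $false) {
    $done = @($jobs | Where-Object { $_.Handle.IsCompleted }).Count
    Write-Host "$done of $($jobs.Count) complete"
    Start-Sleep -Milliseconds 500
}

# Collect the output, then dispose of each instance
$results = foreach ($job in $jobs) {
    $job.PowerShell.EndInvoke($job.Handle)
    $job.PowerShell.Dispose()
}

# Only now is it safe to close the pool
$pool.Close()
$pool.Dispose()
$results
```

Calling EndInvoke() also blocks until the job is finished, so the polling loop is optional if all you need is the results.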

Receive jobs results only when ended

I have a little issue. I'm creating a WPF GUI with PowerShell code. I have a function which performs a task on multiple computers (using parallel workflows). The issue is that while my task is running, my UI freezes until the task is complete.
I would like to work around this with jobs, but I am unable to receive my job results once the task has ended.
Here, my simplified function :
parallelPingComputer -ips $ip_list | Select-Object date, Computer, result | out-gridview
The function :
workflow parallelPingComputer {
    Param($ips)
    foreach -parallel ($ip in $ips)
    {
        PingComputer($ip)
    }
}
And finally, "PingComputer($ip)" is only a ping plus another task on multiple targets.
I tried to add -AsJob after the parallel ping, but I'm not able to retrieve the job result once it has ended (and not before...).
Can you please help me? :)
Thanks a lot

Running parallel threads in PowerShell

I need to execute some operations in a PS script that should run in parallel. Using PS jobs is not a real option, since the tasks that must be parallelized depend on custom functions defined inside a separate module. Although I know that I can use the -InitializationScript parameter to import the module that contains my custom functions, I think that I lose speed, since importing the whole module is a time-consuming operation.
Bearing all that in mind, I'm trying to launch those "tasks" in separate threads that share the runspace. My code looks like:
$ps = [Powershell]::Create().AddScript({ Get-CustomADDomain -dnsdomain $env: })
$threadRes = $ps.beginInvoke()
$ps.EndInvoke($threadRes)
The drawback of this approach is that, since I'm creating a new "powershell process", this runspace does not have my custom modules loaded, and thus I'm in the same situation as with jobs.
If I try to attach the current runspace to the newly created $ps by using the following code:
$ps = [Powershell]::Create()
$ps.runspace = $host.runspace
$ps.AddScript({ Get-CustomADDomain -dnsdomain $env: })
$threadRes = $ps.beginInvoke()
$ps.EndInvoke($threadRes)
I get an error because I'm trying to close the current pipeline (bad thing).
I think my second attempt is on the right track, but I cannot retrieve results from the invocation of the script, or at least I can't see how to do it.
It's obvious that I must be missing something, so any advice you may have will be very much appreciated!
A new job or runspace isn't going to inherit functions from a module that was imported into the current session. That being said, you don't have to import the entire module. If you've got specific functions in the current session you need to have available in the job, you can add just those functions like this:
function test_function {'This is a test'}
function test_function2 {'This is also a test'}
$job_functions = 'test_function','test_function2'
$init = [scriptblock]::Create(
$(foreach ($job_function in $job_functions)
{
@"
function $job_function
{$((get-item function:$job_function).definition)}
"@
}))
$init
function test_function
{'This is a test'}
function test_function2
{'This is also a test'}
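A scriptblock built this way can then be handed to a job. The following is a minimal sketch, assuming Start-Job with -InitializationScript is acceptable for the workload; it rebuilds a small init block so the example stands on its own.

```powershell
function test_function { 'This is a test' }

# Rebuild the function definition as text for each function the job needs
$job_functions = 'test_function'
$definitions = foreach ($name in $job_functions) {
    "function $name {$((Get-Item "function:$name").Definition)}"
}
$init = [scriptblock]::Create($definitions -join "`n")

# The init block runs in the job's process before the main scriptblock
$job = Start-Job -InitializationScript $init -ScriptBlock { test_function }
$result = $job | Wait-Job | Receive-Job
Remove-Job $job
$result
```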

How does threading in powershell work?

I want to parallelize some file-parsing actions with network activity in PowerShell. After a quick Google search, start-thread looked like a solution, but:
The term 'start-thread' is not recognized as the name of a cmdlet, function, script file, or operable program. Check the spelling of the name, or if a path was included, verify that the path is correct and try again.
The same thing happened when I tried start-job.
I also tried fiddling around with System.Threading.Thread
[System.Reflection.Assembly]::LoadWithPartialName("System.Threading")
#The next line errors; something about the arguments that I can't figure out from the .NET documentation
$tstart = new-object System.Threading.ThreadStart({DoSomething})
$thread = new-object System.Threading.Thread($tstart)
$thread.Start()
So, I think the best thing would be to know what I am doing wrong when I use start-thread, because it seems to work for other people. I use v2.0 and I don't need downward compatibility.
PowerShell does not have a built-in command named Start-Thread.
V2.0 does, however, have PowerShell jobs, which can run in the background, and can be considered the equivalent of a thread. You have the following commands at your disposal for working with jobs:
Name         Category  Synopsis
----         --------  --------
Start-Job    Cmdlet    Starts a Windows PowerShell background job.
Get-Job      Cmdlet    Gets Windows PowerShell background jobs that are running in the current ...
Receive-Job  Cmdlet    Gets the results of the Windows PowerShell background jobs in the curren...
Stop-Job     Cmdlet    Stops a Windows PowerShell background job.
Wait-Job     Cmdlet    Suppresses the command prompt until one or all of the Windows PowerShell...
Remove-Job   Cmdlet    Deletes a Windows PowerShell background job.
Here is an example on how to work with it. To start a job, use start-job and pass a script block which contains the code you want run asynchronously:
$job = start-job { get-childitem . -recurse }
This command will start a job, that gets all children under the current directory recursively, and you will be returned to the command line immediately.
You can examine the $job variable to see if the job has finished, etc. If you want to wait for a job to finish, use:
wait-job $job
Finally, to receive the results from a job, use:
receive-job $job
You can't use threads directly like this, but you can't be blamed for trying since once the whole BCL is lying in front of you it's not entirely silly to expect most of it to work :)
PowerShell runs scriptblocks in pipelines which in turn require runspaces to execute them. I blogged about how to roll your own MT scripts some time ago for v2 ctp3, but the technique (and API) is still the same. The main tools are the [runspacefactory] and [powershell] types. Take a look here:
http://www.nivot.org/2009/01/22/CTP3TheRunspaceFactoryAndPowerShellAccelerators.aspx
The above is the most lightweight way to approach MT scripting. There is background job support in v2 by way of start-job, get-job but I figured you already spotted that and saw that they are fairly heavyweight.
The thing that comes closest to threads and is way more performant than jobs is PowerShell runspaces.
Here is a very basic example:
# the number of threads
$count = 10
# the pool will manage the parallel execution
$pool = [RunspaceFactory]::CreateRunspacePool(1, $count)
$pool.Open()
try {
    # create and run the jobs to be run in parallel
    $jobs = New-Object object[] $count
    for ($i = 0; $i -lt $count; $i++) {
        $ps = [PowerShell]::Create()
        $ps.RunspacePool = $pool
        # add the script block to run
        [void]$ps.AddScript({
            param($Index)
            Write-Output "Index: $index"
        })
        # optional: add parameters
        [void]$ps.AddParameter("Index", $i)
        # start async execution
        $jobs[$i] = [PSCustomObject]@{
            PowerShell = $ps
            AsyncResult = $ps.BeginInvoke()
        }
    }
    foreach ($job in $jobs) {
        try {
            # wait for completion
            [void]$job.AsyncResult.AsyncWaitHandle.WaitOne()
            # get results
            $job.PowerShell.EndInvoke($job.AsyncResult)
        }
        finally {
            $job.PowerShell.Dispose()
        }
    }
}
finally {
    $pool.Dispose()
}
It also allows you to do more advanced things, such as throttling the number of parallel runspaces in the pool or importing functions and variables from the current session.
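Importing a function and a variable into the pool can be sketched with an InitialSessionState. The names Get-Square and $offset below are invented for this example, and this is only one way to do it, not necessarily what the answer's author had in mind.

```powershell
# Things defined in the current session that the runspaces should see
function Get-Square { param($n) $n * $n }
$offset = 100

# Copy the function and the variable into an initial session state
$iss = [System.Management.Automation.Runspaces.InitialSessionState]::CreateDefault()
$functionEntry = [System.Management.Automation.Runspaces.SessionStateFunctionEntry]::new(
    'Get-Square', (Get-Item function:Get-Square).Definition)
$variableEntry = [System.Management.Automation.Runspaces.SessionStateVariableEntry]::new(
    'offset', $offset, 'copied from the calling session')
$iss.Commands.Add($functionEntry)
$iss.Variables.Add($variableEntry)

# Every runspace in the pool starts from that session state
$pool = [runspacefactory]::CreateRunspacePool($iss)
[void]$pool.SetMaxRunspaces(2)   # throttle: at most 2 runspaces in parallel
$pool.Open()
try {
    $ps = [powershell]::Create()
    $ps.RunspacePool = $pool
    [void]$ps.AddScript({ param($n) (Get-Square $n) + $offset }).AddArgument(5)
    $result = $ps.EndInvoke($ps.BeginInvoke())
    $ps.Dispose()
}
finally {
    $pool.Dispose()
}
$result
```

The variable is copied when the session state is built, so later changes to $offset in the calling session are not seen by the runspaces.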
The answer, now, is quite simple with the ThreadJob module according to Microsoft Docs.
Install-Module -Name ThreadJob -Confirm:$true
$Job1 = Start-ThreadJob `
    -FilePath $YourThreadJob `
    -ArgumentList @("A", "B")
$Job1 | Get-Job
$Job1 | Receive-Job
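Start-ThreadJob also accepts a scriptblock directly, which avoids needing a separate script file. A small sketch, assuming the ThreadJob module is already available (it ships with recent PowerShell versions); the multiplication is just placeholder work.

```powershell
# Start three thread jobs, running at most two at a time
$jobs = 1..3 | ForEach-Object {
    Start-ThreadJob -ScriptBlock { param($n) $n * 10 } -ArgumentList $_ -ThrottleLimit 2
}

# Thread jobs work with the normal job cmdlets
$results = $jobs | Wait-Job | Receive-Job
$jobs | Remove-Job
$results
```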
