Given a script that implements multi-threaded operations via a runspace pool, how does one get all the threads to output to a single file? I understand there are synchronization and/or locking issues to deal with; I'm just not sure what options are available.
Here is an example of how my threads are created. The example code hangs.
$_ps = [Powershell]::Create()
$_ps.RunspacePool = $_runspace_pool
$null = $_ps.AddScript({
Param (
[Parameter(Mandatory = $true)]
$ComputerName,
[Parameter(Mandatory = $true)]
$LibPath,
[Parameter(Mandatory=$false)]
$Logger = $null
)
$ErrorActionPreference = 'SilentlyContinue'
Import-Module -Name "$LibPath\MyObjectModule" -Force
$_obj = MyObjectModule\New-MyObject
if ( $Logger ) { $_obj.Logger = $Logger }
$_obj.InvokeDiscovery($ComputerName)
}) # end of $_ps.AddScript()
# set script parameters
$null = $_ps.AddParameters(@{ComputerName = $_computer; LibPath = $_lib_path; Logger = $_logger})
I'm thinking I might create a synchronized sorted list as a queue that is added to a logger object in my root thread and also passed to each child thread. Separate logger objects could be instantiated in each child thread that will put messages into the synchronized queue. The root thread logger would periodically flush the queue via a call like $logger.flush().
Maybe something like this...
$_queue = some synchronized queue-like object
$_logger = MyLoggerModule\New-MyLogger
$_logger.queue = $_queue
$_ps = [Powershell]::Create()
$_ps.RunspacePool = $_runspace_pool
$null = $_ps.AddScript({
Param (
[Parameter(Mandatory = $true)]
$Queue,
...
)
$ErrorActionPreference = 'SilentlyContinue'
Import-Module -Name "$LibPath\MyObjectModule" -Force
Import-Module -Name "$LibPath\MyLoggerModule" -Force
$_obj = MyObjectModule\New-MyObject
$_logger = MyLoggerModule\New-MyLogger
$_logger.queue = $Queue
$_obj.InvokeDiscovery($ComputerName)
}) # end of $_ps.AddScript()
# set script parameters
$null = $_ps.AddParameters(@{ComputerName = $_computer; LibPath = $_lib_path; Queue = $_queue})
# wait for threads to complete
do {
$_logger.flush()
Start-Sleep -seconds 5
} while ( threads still running )
Assuming that made sense, is it a workable solution? Are there other options? Am I barking up an impossible tree and should abandon the idea altogether?
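The queue idea described in the question can be sketched roughly as follows. This is a minimal illustration rather than a definitive implementation: it swaps the "synchronized sorted list" for a .NET `ConcurrentQueue[string]`, and the log path, worker body, and `$flush` script block are hypothetical stand-ins for the logger objects mentioned above.

```powershell
# Hypothetical demo values: the log path and the worker message are illustrative only.
$logPath = Join-Path ([System.IO.Path]::GetTempPath()) 'queue-demo.log'
Remove-Item -LiteralPath $logPath -ErrorAction SilentlyContinue

# Thread-safe queue shared by reference with every runspace in this process.
$queue = [System.Collections.Concurrent.ConcurrentQueue[string]]::new()

$pool = [runspacefactory]::CreateRunspacePool(1, 4)
$pool.Open()

$jobs = foreach ($id in 1..6) {
    $ps = [powershell]::Create()
    $ps.RunspacePool = $pool
    $null = $ps.AddScript({
        param($Id, $Queue)
        # Workers only enqueue messages; they never touch the file.
        $Queue.Enqueue("worker $Id finished")
    }).AddParameters(@{ Id = $id; Queue = $queue })
    [pscustomobject]@{ PowerShell = $ps; Handle = $ps.BeginInvoke() }
}

# Only the root thread writes to the file, so the file itself needs no lock.
$flush = {
    $msg = $null
    while ($queue.TryDequeue([ref]$msg)) {
        Add-Content -LiteralPath $logPath -Value $msg
    }
}

# Drain the queue periodically while any worker is still running.
while ($jobs.Handle.IsCompleted -contains $false) {
    & $flush
    Start-Sleep -Milliseconds 100
}
& $flush    # final drain after the last worker has finished

foreach ($job in $jobs) {
    $job.PowerShell.EndInvoke($job.Handle)
    $job.PowerShell.Dispose()
}
$pool.Close()
$pool.Dispose()
```

Because a single thread owns the file, this design avoids file-level locking entirely; the trade-off is that messages reach the file only when the root thread flushes.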
I was able to solve my problem by creating a mutex and passing it to the relevant functions. Worked like a charm.
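The answer above does not show its code, so the following is only a sketch of what a mutex-based approach might look like, under stated assumptions: the log path, the worker body, and the parameter names `Id`, `LogPath`, and `Mutex` are all hypothetical.

```powershell
# Hypothetical demo values: the log path and the worker message are illustrative only.
$logPath = Join-Path ([System.IO.Path]::GetTempPath()) 'mutex-demo.log'
Remove-Item -LiteralPath $logPath -ErrorAction SilentlyContinue

# One unnamed mutex, shared by reference with every runspace in this process.
$mutex = [System.Threading.Mutex]::new($false)

$pool = [runspacefactory]::CreateRunspacePool(1, 4)
$pool.Open()

$worker = {
    param($Id, $LogPath, $Mutex)
    $null = $Mutex.WaitOne()      # block until this thread owns the mutex
    try {
        Add-Content -LiteralPath $LogPath -Value "worker $Id finished"
    }
    finally {
        $Mutex.ReleaseMutex()     # always release, even if the write fails
    }
}

$jobs = foreach ($id in 1..8) {
    $ps = [powershell]::Create()
    $ps.RunspacePool = $pool
    $null = $ps.AddScript($worker).AddParameters(@{ Id = $id; LogPath = $logPath; Mutex = $mutex })
    [pscustomobject]@{ PowerShell = $ps; Handle = $ps.BeginInvoke() }
}

# Wait for every worker, then clean up.
foreach ($job in $jobs) {
    $job.PowerShell.EndInvoke($job.Handle)
    $job.PowerShell.Dispose()
}
$pool.Close()
$pool.Dispose()
$mutex.Dispose()

$lines = @(Get-Content -LiteralPath $logPath)
```

Each worker writes directly to the file, but only while it holds the mutex, so writes are serialized. The order of the lines depends on scheduling and is not guaranteed.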
Related
I have some functions in a module that I would like to call from a runspace, but it's not working. I assume that I somehow have to send the module to the runspace.
The example below works fine.
$hash = [hashtable]::Synchronized(@{})
$hash.OutData
$runspace = [runspacefactory]::CreateRunspace()
$runspace.Open()
$runspace.SessionStateProxy.SetVariable('Hash',$hash)
$powershell = [powershell]::Create()
$powershell.Runspace = $runspace
$powershell.AddScript({
$hash.OutData = Get-Date
}) | Out-Null
$handle = $powershell.BeginInvoke()
While (-Not $handle.IsCompleted) {
Start-Sleep -Milliseconds 100
}
$powershell.EndInvoke($handle)
$runspace.Close()
$powershell.Dispose()
But if I call my own function instead like this, the OutData is blank. The function works fine outside of the runspace.
$powershell.AddScript({
$hash.OutData = Get-customData
}) | Out-Null
What do I have to do to be able to call my function?
If your module isn't in one of the directories listed in $env:PSModulePath (or the latter environment variable isn't defined, which could happen on Unix if you're hosting the PowerShell SDK in an external executable), you must import it explicitly:
$yourFullModulePath = '<your-full-module-path-here>'
# Create a default session state and import a module into it.
$iss = [InitialSessionState]::CreateDefault()
$iss.ImportPSModule($yourFullModulePath)
# Create the runspace with the initial session state and open it.
$runspace = [runspacefactory]::CreateRunspace($iss)
$runspace.Open()
# Create a PowerShell instance and assign the runspace to it.
$powershell = [powershell]::Create($runspace)
# ...
Note that you can simplify your code by taking advantage of the fact that a [powershell] instance automatically creates a runspace:
# Pass the initial session state directly to [powershell]::Create(),
# which automatically provides a runspace.
$powershell = [powershell]::Create($iss)
# Access the [powershell] instance's runspace via the `.Runspace` property.
$powerShell.Runspace.SessionStateProxy.SetVariable('Hash', $hash)
# ...
Context: Running an Azure Automation Account solution where a caller PS script executes another PS script (executed on a VM) with parameter passing via 'Invoke-AzureRmVMRunCommand'.
Story: I had a PowerShell (caller) script running that executed another (called) PowerShell script on a remote Azure Windows VM. That flow ran via an Automation Account schedule every day but suddenly stopped working two days ago because the parameter passing from the caller to the called script no longer works. I currently blame the MSFT Azure people for breaking my PRD solution.
Here is the caller PS script code for the arguments to pass on:
$hshParams = @{
strSAName = $hshParameters.strStagingSA
strSAAccessKey = $strSAAccessKey
strFileShare = '"' + $strFileShare + '"'
strCopyObjects = $hshParameters.strCopyObjects
strSrcDriveLetter = $strSrcDriveLetter
strDstDriveLetter = $strDstDriveLetter
}
Here is the invocation of the VM-run PS script:
Invoke-AzureRmVMRunCommand -ResourceGroupName $objVM.ResourceGroupName -Name $objVM.Name `
-CommandId 'RunPowerShellScript' -ScriptPath $strRemoteScriptFileNameTmp -Parameter $hshParams
Here is the parameter reception code on the VM-run PS script side:
# Parameters
Param (
[string] $strCopyObjects = $null,
[string] $strSAAccessKey = $null,
[string] $strFileShare = $null,
[string] $strSAName = $null,
[string] $strDstDriveLetter = $null,
[string] $strSrcDriveLetter = $null
)
Until two days ago all those six string values were populated properly and according to the argument setup in the hash table '$hshParams':
$strSAAccessKey = 92LO1Q4tuyeiqxxx
$strFileShare = 129xxxa1.file.core.windows.net\solutionfiles
$strSAName = 12xsa1
$strDstDriveLetter = D
$strSrcDriveLetter = Z
$strCopyObjects = AutoTopUp\Application\Live
Problem: Now I see five string values suddenly not being populated anymore with one being garbage, here is what they look like today:
$strSAAccessKey = []
$strFileShare = []
$strSAName = []
$strDstDriveLetter = []
$strSrcDriveLetter = []
$strCopyObjects = AutoTopUp\Application\Live" -strSAAccessKey 92LO1Q4tuyeiqxxx -strFileShare 129xxxa1.file.core.windows.net\solutionfiles -strSAName 12xsa1 -strDstDriveLetter D -strSrcDriveLetter Z
The solution was not touched, it just had been running as per schedule. $Args.Count on the VM-run script returns '2'.
My Question: Does anyone have an explanation for this new behaviour? Frustratingly, I have not managed to arrange the parameter passing in a different way, as it is unclear what the proper way of receiving the hash table values would be. The MSFT help page for 'Invoke-AzureRmVMRunCommand' is (of course) not helping here, nor did I find any other clear guidance on the parameter passing on SO or Google...
A related question is raised in this MSDN thread; just sharing this for the benefit of the broader audience who might face a similar issue.
What I'm trying to do
The below script loops through every item in an Array of data streams and requests a summary value for output to a text file. This external request is by far the most expensive part of the process, and so I am now using a Runspacepool to run multiple (5) requests in parallel, and whichever finishes first outputs its results.
These requests all write to a synchronised hashtable, $hash, which holds a running total ($hash.counter) and tracks which thread ($hash.thread) is updating the total and a .txt output file, to avoid potential write collisions.
What isn't working
Each thread is able to update the counter easily enough $hash.counter+=$r, but when I try and Read the value into an Add-Content statement:
Add-Content C:\Temp\test.txt "$hash.counter|$r|$p|$ThreadID"
it adds an object reference rather than a number:
System.Collections.Hashtable+SyncHashtable.counter|123|MyStreamName|21252
And so I've ended up passing the counter through a temporary variable that can be used in the string:
[int]$t = $hash.counter+0
Add-Content C:\Temp\test.txt "$t|$r|$p|$ThreadID"
Which does output the true total:
14565423|123|MyStreamName|21252
What I'm asking
Is it possible to remove this temporary variable and output directly from the hashtable? Why does the object reference have a '+' in the middle?
I've had to add logic to 'lock' the hashtable to prevent data collisions. Should this be necessary? I'd been told that synchronised hashtables were supposed to be thread-safe for R/W operations, but without this logic my counter doesn't reach the correct total.
Full code for the loop itself below - I've left out setup of the Runspacepool etc
ForEach($i in $Array){
# Save down the data stream name and create parameter list for passing to the new job
$p = $i.Point.Name
$parameters = @{
hash = $hash
conn = $Conn
p = $p
}
# Instantiate new powershell runspace and send a script to it
$PowerShell = [powershell]::Create()
$PowerShell.RunspacePool = $RunspacePool
[void]$Powershell.AddScript({
# Receive parameter list, retrieve threadid
Param (
$hash,
$conn,
$p
)
$ThreadID = [appdomain]::GetCurrentThreadId()
# Send data request to the PI Data Archive using the existing connection
$q = Get-something (actual code removed)
[int]$r = $q.Values.Values[0].Value
# Lock out other threads from writing to the hashtable and output file to prevent collisions
# If the thread isn't locked, attempt to lock it. If it is locked, sleep for 1ms and try again. Tracked by synchronised Hashtable.
Do{
if($hash.thread -eq 0){
$hash.thread = $ThreadID
}
# Sleep for 1ms and check that the lock wasn't overwritten by another parallel thread.
Start-Sleep -Milliseconds 1
}Until($hash.thread -eq $ThreadID)
# Increment the synchronised hash counter. Save the new result to a temporary variable (can't figure out how to get the hash counter itself to output to the file)
$hash.counter+=$r
[int]$t = $hash.counter+0
# Write to output file new counter total, result, pointName and threadID
Add-Content C:\Temp\test.txt "$t|$r|$p|$ThreadID"
# release lock on the hashtable and output file
$hash.thread = 0
})
# Add parameter list to instance (matching param() list from the script. Invoke the new instance and save a handle for closing it
[void]$Powershell.AddParameters($parameters)
$Handle = $PowerShell.BeginInvoke()
# Save down the handle into the $jobs list for closing the instances afterwards
$temp = [PSCustomObject]@{
PowerShell=$Powershell
Handle=$Handle
}
[void]$jobs.Add($Temp)
}
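On the two questions asked above, here is a small sketch, offered as an illustration rather than a definitive answer. Inside a double-quoted string, `"$hash.counter"` expands only `$hash` and appends `.counter` as literal text; wrapping the expression in `$( ... )` evaluates it in full. The `+` in the printed type name is how .NET writes a nested type (`SyncHashtable` is nested inside `Hashtable`). A synchronized hashtable makes individual reads and writes thread-safe, but `+=` is a read followed by a write, so it can still lose updates unless the pair is locked. The values below are made up for the demo.

```powershell
# Demo values standing in for the real request result and stream name.
$hash = [hashtable]::Synchronized(@{ counter = 0 })
$r = 123
$p = 'MyStreamName'

# Hold the hashtable's own lock for the whole read-modify-write and the formatting.
[System.Threading.Monitor]::Enter($hash.SyncRoot)
try {
    $hash.counter += $r                      # read + write, not atomic on its own
    $line = "$($hash.counter)|$r|$p"         # $( ) evaluates the member access in the string
}
finally {
    [System.Threading.Monitor]::Exit($hash.SyncRoot)
}

$line    # 123|123|MyStreamName
```

Locking on `SyncRoot` blocks competing threads instead of polling, which would replace the `Do { ... } Until` loop and the temporary variable in the code above.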
For various reasons, I started to write a PowerShell portscanner, not least to start learning it.
First iteration used Test-Netconnection. This seemed as if it would be too slow; so I went one level down to use sockets, specifically System.Net.Sockets.TcpClient. (Have started looking at System.Net.Sockets.Socket as the MS docs make mention of the Socket.BeginConnect() method which can begin an asynchronous request for a remote connection, but not sure if this will help yet.)
This still seemed too slow, so I looked at jobs. All this did was consume more resources for not much speed increase, so after much googling, I managed to make threading (or what PowerShell calls threading anyway) work through the use of RunSpacePools. I thought it was pretty much done, and performance is OK if you're looking at an input file of 5 IP addresses. However, I tried it out with a CIDR /24 this morning, and it took about 20-30 minutes.
[Edit] I should add that this script will take a 'thread' value, but if none is provided uses a default thread value of number of cores + 1 in order to take advantage of the RunSpacePool multithreading.
So I started looking at how Fyodor increased his speed, and in The Art of Scanning in PHRACK article 11 he states (whilst talking about TCP Connect() scanning):
While making a separate connect() call for every targeted port in a
linear fashion would take ages over a slow connection, you can hasten
the scan by using many sockets in parallel.
That is clearly where some optimisation is available.
So, is anyone able to point me in the direction of how I enable this - as I say, still quite new to PoSH, so am pushing the limits of my comprehension with RunSpacePools.
Specifically, I would like some advice on a) if my instincts are right to increase the scan speed to increase socket parallelism, b) how to do that and c) if System.Net.Sockets.Socket is more appropriate.
function doConnect {
$ipLoopCount = 0
$portLoopCount = 0
# check for randomise switch
if ($randomise) {
$ipArray = makeRange | Sort-Object {Get-Random}
$portArray = makePortRange | Sort-Object {Get-Random}
} else {
# Connects to IPs in order
$ipArray = makeRange
$portArray = makePortRange
}
# initialise runspaces
if ($threads) {
$useThreads = $threads
} else {
$useThreads = ([int]$env:NUMBER_OF_PROCESSORS + 1)
}
$pool = [RunspaceFactory]::CreateRunspacePool(1, $useThreads)
$pool.ApartmentState = "MTA"
$pool.Open()
$runspaces = @()
# set higher priority for powershell process
if ($priority) {
$proc = Get-Process -Id $pid;
$proc.PriorityClass = 'High'
} else{
$proc = Get-Process -Id $pid;
$proc.PriorityClass = 'Normal'
}
# info object
$infoDisplay = @{
InputFile = $inFile
Target_IPs = $ipArray
Target_Ports = $portArray
Process_Priority = $proc.PriorityClass
Threads = $useThreads
}
[PSCustomObject]$infoDisplay
# set up scriptblock to pass to runspaces
$scriptblock = {
Param (
[IPAddress]$sb_ip,
[int]$sb_port
)
# This progress bar doesn't work yet
Write-Progress -Activity "Scan range $StartIPaddress - $EndIPAddress" -Status "% Complete:" -PercentComplete((($portLoopCount)/($ipArray.Length*$portArray.Length))*100)
if ($delay) {
$delay = Get-Random -Maximum 1000 -Minimum 1;
Start-Sleep -m $delay
}
$socket = New-Object System.Net.Sockets.TcpClient
$socket.Connect($sb_ip, $sb_Port)
if ($socket.Connected) {
Write-Output "Connected to $sb_port on $sb_ip"
#} else {
# Write-Output "Failed to connect to port $sb_port on $sb_ip"
}
$socket.Close()
}
foreach ($nIP in $ipArray) {
$ipLoopCount++
foreach ($nPort in $portArray) {
$portLoopCount++
$runspace = [PowerShell]::Create()
$null = $runspace.AddScript($scriptblock)
$null = $runspace.AddArgument($nIP)
$null = $runspace.AddArgument($nPort)
$runspace.RunspacePool = $pool
$runspaces += [PSCustomObject]@{
Pipe = $runspace;
Status = $runspace.BeginInvoke()
}
}
}
while ($runspaces.Status -ne $null) {
$completed = $runspaces | Where-Object { $_.Status.IsCompleted -eq $true }
foreach ($runspace in $completed) {
$runspace.Pipe.EndInvoke($runspace.Status)
$runspace.Status = $null
}
}
$pool.Close()
$pool.Dispose()
}
It may be that PowerShell is entirely the wrong thing to attempt this in, but is a useful exercise as the environment is quite locked down, and installing a 'proper' portscanner - i.e. nmap - is impossible.
[Edit 2] I don't think reducing the timeout and plumbing that into the logic is the solution that I'm after.
[Edit 3] The parallel switch didn't help.
[Edit 4] Have been thinking about Asynchronous socket connections, as this may help the overall connections speed - but then you have to have another thread/process looking after the incoming traffic. Unsure as to the efficacy.
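As a rough illustration of the asynchronous idea in Edit 4, the sketch below starts several `TcpClient.ConnectAsync()` attempts from a single thread and then waits on all of them against one shared deadline. It is only a sketch under stated assumptions: the two loopback listeners exist purely to give the demo one open and one (very likely) closed port, and `$timeoutMs` is an arbitrary value.

```powershell
# Demo target: a listener on an OS-assigned loopback port stands in for an open port.
$loopback = [System.Net.IPAddress]::Loopback
$listener = [System.Net.Sockets.TcpListener]::new($loopback, 0)
$listener.Start()
$openPort = $listener.LocalEndpoint.Port

# A second listener that is stopped immediately yields a port that is very likely closed.
$probe = [System.Net.Sockets.TcpListener]::new($loopback, 0)
$probe.Start()
$closedPort = $probe.LocalEndpoint.Port
$probe.Stop()

$timeoutMs = 2000    # arbitrary shared deadline for the whole batch

# Start every connection attempt first, without waiting on any of them.
$attempts = foreach ($port in $openPort, $closedPort) {
    $client = [System.Net.Sockets.TcpClient]::new()
    [pscustomobject]@{
        Port   = $port
        Client = $client
        Task   = $client.ConnectAsync($loopback, $port)
    }
}

# Wait once for the whole batch; WaitAll throws if any attempt failed, which is expected here.
$tasks = [System.Threading.Tasks.Task[]]@($attempts.Task)
try { $null = [System.Threading.Tasks.Task]::WaitAll($tasks, $timeoutMs) } catch { }

$results = foreach ($a in $attempts) {
    $succeeded = $a.Task.Status -eq [System.Threading.Tasks.TaskStatus]::RanToCompletion
    $isOpen = $succeeded -and $a.Client.Connected
    $a.Client.Close()
    [pscustomobject]@{ Port = $a.Port; Open = $isOpen }
}
$listener.Stop()
$results
```

Because every attempt is already in flight before the wait begins, the batch takes roughly as long as its slowest attempt (capped by the deadline) rather than the sum of all attempts. No separate thread is needed to watch for replies; the runtime completes each task when its connection succeeds or fails.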
I have a document set to request that the user open a read-only version (option "Read-only Recommended"). I would like to open the Excel document in PowerShell without read-only mode (declining the prompt asking to open "Read Only"). Here is my current code.
$dir = "\\file_path\*"
$latest = Get-ChildItem -Path $dir | Sort-Object LastAccessTime -Descending | Select-Object -First 1
$latest.name
$excelObj = New-Object -ComObject Excel.Application
$excelObj.Visible = $True
$excelObj.DisplayAlerts = $False
$workBook = $excelObj.Workbooks.Open($latest)
How do I ignore the read only prompt and open the full version?
There should be an IgnoreReadOnlyRecommended argument that you can supply in the workbook open method:
$workBook = $excelObj.Workbooks.Open($latest, $null, $null, $null, $null, $null, $True)
Workbooks.Open Method (MSDN)
Edit
Based on comments below, it appears that there is a bug preventing this method from working when the $null parameters are supplied. Thanks to this answer on another question it appears there may be a way around this:
1st, this function is required:
Function Invoke-NamedParameter {
[CmdletBinding(DefaultParameterSetName = "Named")]
param(
[Parameter(ParameterSetName = "Named", Position = 0, Mandatory = $true)]
[Parameter(ParameterSetName = "Positional", Position = 0, Mandatory = $true)]
[ValidateNotNull()]
[System.Object]$Object
,
[Parameter(ParameterSetName = "Named", Position = 1, Mandatory = $true)]
[Parameter(ParameterSetName = "Positional", Position = 1, Mandatory = $true)]
[ValidateNotNullOrEmpty()]
[String]$Method
,
[Parameter(ParameterSetName = "Named", Position = 2, Mandatory = $true)]
[ValidateNotNull()]
[Hashtable]$Parameter
,
[Parameter(ParameterSetName = "Positional")]
[Object[]]$Argument
)
end { ## Just being explicit that this does not support pipelines
if ($PSCmdlet.ParameterSetName -eq "Named") {
## Invoke method with parameter names
## Note: It is ok to use a hashtable here because the keys (parameter names) and values (args)
## will be output in the same order. We don't need to worry about the order so long as
## all parameters have names
$Object.GetType().InvokeMember($Method, [System.Reflection.BindingFlags]::InvokeMethod,
$null, ## Binder
$Object, ## Target
([Object[]]($Parameter.Values)), ## Args
$null, ## Modifiers
$null, ## Culture
([String[]]($Parameter.Keys)) ## NamedParameters
)
} else {
## Invoke method without parameter names
$Object.GetType().InvokeMember($Method, [System.Reflection.BindingFlags]::InvokeMethod,
$null, ## Binder
$Object, ## Target
$Argument, ## Args
$null, ## Modifiers
$null, ## Culture
$null ## NamedParameters
)
}
}
}
Which would suggest the Workbooks.Open() method could be called like so:
$workBook = Invoke-NamedParameter $excelObj.Workbooks "Open" @{"FileName"=$latest;"IgnoreReadOnlyRecommended"=$True}
If you're looking to just open the file for reading and ignore the prompt, then this works:
$workBook = $excelObj.Workbooks.Open($latest,$null,$true)
The 3rd argument denotes true to open read-only.
This approach does not appear to be subject to the aforementioned bug!