Multi-thread PowerShell script with different arguments

I'm trying to run a PowerShell script that pulls data from an APM device across a date range. However, it can take up to 9 hours for a one-week range. When I run it in a for loop, day by day, it takes 35 minutes:
for ($i = 0; $i -lt $dateList.Length - 1; $i++) {
    & "C:\Scripts\Grabber.ps1" -date $dateList[$i] -date2 $dateList[$i + 1]
}
I need to optimize that further. I've looked at PoshRSJob and Invoke-Parallel, but I can't seem to get my head around this! I'd appreciate any help, thanks.

Well, just include your script in the scriptblock?
for ($i = 0; $i -lt $dateList.Length - 1; $i++) {
    Start-RSJob -Name "Grabber$i" -ScriptBlock {
        & "C:\Scripts\Grabber.ps1" -date ($using:dateList)[$using:i] -date2 ($using:dateList)[$using:i + 1]
    }
}
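If PoshRSJob still feels awkward, PowerShell 7's built-in ForEach-Object -Parallel does the same fan-out with less ceremony. A minimal sketch, assuming the same Grabber.ps1 path as above (the -ThrottleLimit value is an arbitrary choice, not from the original question):

```powershell
# Build the (date, date2) pairs first, then fan them out across parallel runspaces
$pairs = for ($i = 0; $i -lt $dateList.Length - 1; $i++) {
    [pscustomobject]@{ Date = $dateList[$i]; Date2 = $dateList[$i + 1] }
}

$pairs | ForEach-Object -Parallel {
    # Each pair runs in its own runspace; tune -ThrottleLimit to what the APM device tolerates
    & "C:\Scripts\Grabber.ps1" -date $_.Date -date2 $_.Date2
} -ThrottleLimit 7
```

Building the pairs up front sidesteps the $using:i closure issue entirely, since each runspace gets its own pipeline object.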

Related

Powershell | How can I use Multi Threading for my File Deleter Powershell script?

So I've written a Script to delete files in a specific folder after 5 days. I'm currently implementing this in a directory with hundreds of thousands of files and this is taking a lot of time.
This is currently my code:
#Variables
$path = "G:\AdeptiaSuite\AdeptiaSuite-6.9\AdeptiaServer\ServerKernel\web\repository\equalit\PFRepository"
$age = (Get-Date).AddDays(-5) # Defines the 'x days old' (today's date minus x days)
# Get all the files in the folder and subfolders | foreach file
Get-ChildItem $path -Recurse -File | ForEach-Object {
    # if CreationTime is -le (less than or equal to) $age
    if ($_.CreationTime -le $age) {
        Write-Output "Older than $age days - $($_.name)"
        Remove-Item $_.FullName -Force -Verbose # remove the item
    }
    else {
        Write-Output "Less than $age days old - $($_.name)"
    }
}
I've searched around the internet for some time now to find out how to use
Runspaces, however I find it very confusing and I'm not sure how to implement it with this script. Could anyone please give me an example of how to use Runspaces for this code?
Thank you very much!
EDIT:
I've found this post: https://adamtheautomator.com/powershell-multithreading/
And ended up changing my script to this:
$Scriptblock = {
    # Variables
    $path = "G:\AdeptiaSuite\AdeptiaSuite-6.9\AdeptiaServer\ServerKernel\web\repository\equalit\PFRepository"
    $age = (Get-Date).AddDays(-5) # Defines the 'x days old' (today's date minus x days)
    # Get all the files in the folder and subfolders, then process each file
    Get-ChildItem $path -Recurse -File | ForEach-Object {
        # if CreationTime is -le (less than or equal to) $age
        if ($_.CreationTime -le $age) {
            Write-Output "Older than $age days - $($_.name)"
            Remove-Item $_.FullName -Force -Verbose # remove the item
        }
        else {
            Write-Output "Less than $age days old - $($_.name)"
        }
    }
}
$MaxThreads = 5
$RunspacePool = [runspacefactory]::CreateRunspacePool(1, $MaxThreads)
$RunspacePool.Open()
$Jobs = @()
1..10 | Foreach-Object {
    $PowerShell = [powershell]::Create()
    $PowerShell.RunspacePool = $RunspacePool
    $PowerShell.AddScript($ScriptBlock).AddArgument($_)
    $Jobs += $PowerShell.BeginInvoke()
}
while ($Jobs.IsCompleted -contains $false) {
    Start-Sleep 1
}
However, I'm not sure if this works correctly now. I don't get any errors, but the terminal doesn't output anything, so I'm not sure whether it works or just does nothing.
I'd love any feedback on this!
The easiest answer is: get PowerShell 7.2.5 (look in the release assets for PowerShell-7.2.5-win-x64.zip), download and extract it. It's a no-install PowerShell 7 which has easy multithreading and lets you change foreach { to ForEach-Object -Parallel {. The executable is pwsh.exe.
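Applied to the deletion script above, that change is roughly the following sketch (the throttle limit is an arbitrary pick; note that variables from the outer scope need the $using: prefix inside the parallel block):

```powershell
$path = "G:\AdeptiaSuite\AdeptiaSuite-6.9\AdeptiaServer\ServerKernel\web\repository\equalit\PFRepository"
$age = (Get-Date).AddDays(-5)

# Each file is handed to one of up to 5 runspaces; $age must be passed via $using:
Get-ChildItem $path -Recurse -File | ForEach-Object -Parallel {
    if ($_.CreationTime -le $using:age) {
        Remove-Item $_.FullName -Force
    }
} -ThrottleLimit 5
```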
But, if it's severely overloading the server, running it several times will only make things worse, right? And I think the Get-ChildItem will be the slowest part, putting the most load on the server, and so doing the delete in parallel probably won't help.
I would first try changing the script to this shape:
$path = "G:\AdeptiaSuite\AdeptiaSuite-6.9\AdeptiaServer\ServerKernel\web\repository\equalit\PFRepository"
$age = (Get-Date).AddDays(-5)
$logOldFiles = [System.IO.StreamWriter]::new('c:\temp\log-oldfiles.txt')
$logNewFiles = [System.IO.StreamWriter]::new('c:\temp\log-newfiles.txt')
Get-ChildItem $path -Recurse -File | ForEach-Object {
    if ($_.CreationTime -le $age) {
        $logOldFiles.WriteLine("Older than $age days - $($_.name)")
        $_ # send file down pipeline to Remove-Item
    }
    else {
        $logNewFiles.WriteLine("Less than $age days old - $($_.name)")
    }
} | Remove-Item -Force
$logOldFiles.Close()
$logNewFiles.Close()
So it pipelines into remove-item and doesn't send hundreds of thousands of text lines to the console (also a slow thing to do).
If that doesn't help, I would switch to robocopy /L and maybe look at robocopy /L /MINAGE... to do the file listing, then process that to do the removal.
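For example, something along these lines; the exact flag set is a sketch, so verify against robocopy /? and run the listing on its own before piping it into Remove-Item (/L lists without copying, /MINAGE:5 selects files older than 5 days, and the remaining switches strip headers, sizes, and directory lines so only file paths are emitted):

```powershell
# List-only pass: robocopy copies nothing; the destination is a throwaway placeholder
robocopy $path C:\temp\robocopy-dummy /L /S /MINAGE:5 /FP /NS /NC /NDL /NJH /NJS |
    ForEach-Object { $_.Trim() } |
    Where-Object { $_ } |          # drop blank lines
    Remove-Item -Force
```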
(I also removed the comments which just repeat the lines of code. The code already tells you what the code does. Comments should tell you why the code does things: who wrote the script and what business case it was solving, what the PFRepository is, why there is a 5-day cutoff, or whatever.)

How to change the background color of an Excel report

Basically, I am using this script to run through all the .csv files in a specific folder and merge them all together. But after the merge, I still want to change the background color of the rows coming from each .csv file.
The script that I have so far does not do that, and I can't figure out how to do it as I am really new to PowerShell.
# Get all the information from .csv files that are in the $IN_FILE_PATH, skipping the first line:
$getFirstLine = $true
Get-ChildItem "$IN_FILE_PATH\*.csv" | ForEach-Object {
    $filePath = $_
    $lines = Get-Content $filePath
    $linesToWrite = switch ($getFirstLine) {
        $true { $lines }
        $false { $lines | Select-Object -Skip 1 }
    }
    # Import all the information... and transfer to the new workbook.
    $Report_name = (Get-Date).ToString("yyyy.MM.dd-hh.mm")
    $getFirstLine = $false
    Add-Content "$OUT_FILE_PATH\Report $Report_Name.csv" $linesToWrite
}
e.g. The .csv file has this pattern:
Name Age
Richard 18
Carlos 20
Jonathan 43
Mathew 25
To be clear: Richard (18 years old) and Carlos (20 years old) are from filenumber1.csv, while Jonathan (43 years old) and Mathew (25 years old) are from filenumber2.csv.
I want Carlos' and Richard's rows to have a white background, whereas Jonathan's and Mathew's rows should be grey. That repeats white-grey-white-grey, dividing the output by source file.
I am trying to make the final report friendlier to read, so that the separation from file to file is clearer.
Any ideas?
As Vivek Kumar Singh mentioned in the comments, .csv doesn't contain any formatting options. It's recommended to work with an Excel file instead, and for that purpose the best module I know and use is ImportExcel.
The code to set formatting is as below (inspired by this thread):
$IN_FILE_PATH = "C:\SO\56870016"
# mkdir $IN_FILE_PATH
# cd $IN_FILE_PATH
# rm out.xlsx
# Define colors
$colors = "White", "Gray"
# Initialization
$colorsTable = @()
$data = @()
$n = 0
Get-ChildItem "$IN_FILE_PATH\*.csv" | ForEach-Object {
    $part = Import-Csv $_
    $data += $part
    for ($i = 0; $i -lt ($part).Count; $i++) {
        $colorsTable += $colors[$n % 2]
    }
    $n++
}
$j = 0
$data | Export-Excel .\out.xlsx -WorksheetName "Output" -Append -CellStyleSB {
    param(
        $workSheet,
        $totalRows,
        $lastColumn
    )
    foreach ($row in (2..$totalRows)) {
        # The magic happens here:
        # we set colors based on the list which was built while importing
        Set-CellStyle $workSheet $row $LastColumn Solid $colorsTable[$j]
        $j++
    }
}
Hopefully the comments in the code help you better understand what's going on.

Using PowerShell to apply formulas to Excel groups based on other columns

I am very new to PowerShell and am trying to automate an invoicing process I do.
I have been able to group my Excel data into "records", where each repair # in column A has multiple rows of parts and diagnostics.
Where I am stuck is having PowerShell loop through each record, reading the information in columns J-M and providing a corresponding payment in column H.
(A picture of the Excel file that the invoice is processed in was attached here.)
This is the code that groups my rows to their corresponding repair #
$intRowCount = $ws2.UsedRange.Rows.Count
for ($i = 6; $i -le $intRowCount; $i++) {
    if ($ws2.Cells.Item($i, 1).Text -like "") {
        $startRow = $i
        for ($j = $i + 1; $j -le $intRowCount; $j++) {
            if ($ws2.Cells.Item($j, 1).Text -ne "" -or $j -eq $intRowCount) {
                $endRow = $j - 1
                if ($j -eq $intRowCount) {
                    $endRow = $j
                }
                break
            }
        }
        $str = "A" + $startRow + ":A" + $endRow
        $ws2.Range($str).Rows.Group()
        $i = $j
    }
}
Any help would be great, as I have been doing this manually for months going through thousands of lines of data.
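The missing piece, looping over the rows to read columns J-M and write a payment into column H, could be sketched like this against the same $ws2 worksheet object. The payment rule here is purely a placeholder (summing the numeric cells), since the actual business logic isn't given in the question:

```powershell
for ($row = 6; $row -le $intRowCount; $row++) {
    # Read the four input cells for this row (columns J=10 through M=13)
    $values = 10..13 | ForEach-Object { $ws2.Cells.Item($row, $_).Value2 }
    # Placeholder rule: sum whatever numeric values are present
    $payment = ($values | Where-Object { $_ -is [double] } | Measure-Object -Sum).Sum
    if ($payment) {
        $ws2.Cells.Item($row, 8).Value2 = $payment   # column H = 8
    }
}
```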

Powershell multithreading not working on cmd.exe

I need to run some code in PowerShell using multiple threads. I have tested a simple snippet and it runs fine in a PowerShell console; however, when I try to run it from cmd.exe, the code doesn't execute and no error is thrown. I'm wondering what is going on? I'd appreciate help on this.
Sample code as follows:
$throttleLimit = 10
$iss = [System.Management.Automation.Runspaces.InitialSessionState]::CreateDefault()
$Pool = [runspacefactory]::CreateRunspacePool(1, $throttleLimit, $iss, $Host)
$Pool.Open()
$ScriptBlock = {
    param($id)
    Start-Sleep -Seconds 2
    Write-Host "Done processing ID $id"
    [System.Console]::WriteLine("Done processing ID $id")
}
for ($x = 1; $x -le 40; $x++) {
    $powershell = [powershell]::Create().AddScript($ScriptBlock).AddArgument($x)
    $powershell.RunspacePool = $Pool
    $handle = $powershell.BeginInvoke()
}
My batch file code is as follows:
powershell -Command .\multiT.ps1 2>&1
In the ISE, the script finishes before the output from the threads starts to show up. I added
start-sleep -sec 10
to the end of the code and I get output from cmd now. The output is doubled, though (as in, I get 2 lines for each thread), because the scriptblock writes each line twice: once with Write-Host and once with [System.Console]::WriteLine.
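The underlying issue is that the script reaches its last line, powershell.exe exits, and the runspace pool is torn down before any of the 2-second sleeps complete. Rather than a fixed Start-Sleep, a more reliable pattern is to keep each async handle and block on EndInvoke; a sketch based on the snippet above:

```powershell
# Collect each pipeline together with its async handle
$jobs = foreach ($x in 1..40) {
    $ps = [powershell]::Create().AddScript($ScriptBlock).AddArgument($x)
    $ps.RunspacePool = $Pool
    [pscustomobject]@{ PowerShell = $ps; Handle = $ps.BeginInvoke() }
}

foreach ($job in $jobs) {
    # Blocks until this runspace completes, then releases its resources
    $job.PowerShell.EndInvoke($job.Handle)
    $job.PowerShell.Dispose()
}
$Pool.Close()
```

With the waits in place, the script no longer exits early, so the batch file needs no artificial sleep.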

Powershell to wake up multiple media drives simultaneously

I have a server with lots of media drives ~43TB. An areca 1882ix-16 is set to spin the drives down after 30 minutes of inactivity since most days an individual drive is not even used. This works nicely to prevent unnecessary power and heat. In this case the drives still show up in windows explorer but when you click to access them it takes about 10 seconds for the folder list to show up since it has to wait for the drive to spin up.
For administrative work I have a need to spin up all the drives to be able to search among them. Clicking on each drive in windows explorer and then waiting for it to spin up before clicking the next drive is very tedious. Obviously multiple explorer windows makes it faster but it is still tedious. I thought a powershell script may ease the pain.
So I started with the following:
$mediaDrives = @('E:', 'F:', 'G:', 'H:', 'I:', 'J:', 'K:', 'L:',
                 'M:', 'N:', 'O:', 'P:', 'Q:', 'R:', 'S:')
Get-ChildItem $mediaDrives | ForEach-Object -Process { $_.Name }
This is just requesting that each drive in the array have its root folder name listed. That works to wake the drive but it is again a linear function. The script pauses for each drive before printing. Looking for a solution as to how to wake each drive simultaneously. Is there a way to multi-thread or something else?
Here's a script that will do what you want, but it must be run under powershell using the MTA threading mode (which is the default for powershell.exe 2.0, but powershell.exe 3.0 must be launched with the -MTA switch.)
#requires -version 2.0
# if running in the ISE or in an STA console, abort
if (($host.runspace.apartmentstate -eq "STA") -or $psise) {
    write-warning "This script must be run under powershell -MTA"
    exit
}
$mediaDrives = @('E:', 'F:', 'G:', 'H:', 'I:', 'J:', 'K:', 'L:',
                 'M:', 'N:', 'O:', 'P:', 'Q:', 'R:', 'S:')
# create a pool of 8 runspaces
$pool = [runspacefactory]::CreateRunspacePool(1, 8)
$pool.Open()
$jobs = @()
$ps = @()
$wait = @()
$count = $mediaDrives.Length
for ($i = 0; $i -lt $count; $i++) {
    # create a "powershell pipeline runner"
    $ps += [powershell]::Create()
    # assign our pool of 8 runspaces to use
    $ps[$i].RunspacePool = $pool
    # add wake drive command
    [void]$ps[$i].AddScript("dir $($mediaDrives[$i]) > `$null")
    # start script asynchronously
    $jobs += $ps[$i].BeginInvoke()
    # store wait handles for WaitForAll call
    $wait += $jobs[$i].AsyncWaitHandle
}
# wait 5 minutes for all jobs to finish (configurable)
$success = [System.Threading.WaitHandle]::WaitAll($wait,
    (New-TimeSpan -Minutes 5))
Write-Host "All completed? $success"
# end async calls
for ($i = 0; $i -lt $count; $i++) {
    Write-Host "Completing async pipeline job $i"
    try {
        # complete async job
        $ps[$i].EndInvoke($jobs[$i])
    } catch {
        # oops-ee!
        Write-Warning "error: $_"
    }
    # dump info about completed pipelines
    $info = $ps[$i].InvocationStateInfo
    Write-Host "State: $($info.State) ; Reason: $($info.Reason)"
}
So, for example, save as warmup.ps1 and run like: powershell -mta c:\scripts\warmup.ps1
To read more about runspace pools and the general technique above, take a look at my blog entry about runspacepools:
http://nivot.org/blog/post/2009/01/22/CTP3TheRunspaceFactoryAndPowerShellAccelerators
I chose 8 pretty much arbitrarily for the parallelism factor - experiment yourself with lower or higher numbers.
Spin up a separate powershell instance for each drive or use workflows in PowerShell 3.0.
Anyhow, you can pass the drives directly to the Path parameter and skip Foreach-Object altogether:
Get-ChildItem $mediaDrives
Have you considered approaching this with the Start-Job cmdlet:
$mediaDrives = @('E:', 'F:', 'G:', 'H:', 'I:', 'J:', 'K:')
$mediaDrives | ForEach-Object {
    Start-Job -ArgumentList $_ -ScriptBlock {
        param($drive)
        Get-ChildItem $drive
    }
}
The only clever part is that you need to use the -ArgumentList parameter on the Start-Job cmdlet to pass the correct value through for each iteration. This will create a background task that runs in parallel with the execution of the script.
If you don't want to wait, well, don't wait: start those wake-up calls in the background.
In bash one would write
foreach drive ($mediadrives) {tickle_and_wake $drive &}
(note the ampersand, which means: start the command in the background, don't wait for it to complete)
In PowerShell that would translate to something like
foreach ($drive in $mediadrives) {
    Start-Job { param($d) tickle_and_wake $d } -Arg $drive
}
If you want confirmation that all background jobs have completed, use wait in bash or Wait-Job in Powershell
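For instance, a sketch of the whole wake-up pass with confirmation (using the real Get-ChildItem wake-up in place of the hypothetical tickle_and_wake):

```powershell
# Launch one background job per drive; each Get-ChildItem forces a spin-up
$jobs = $mediaDrives | ForEach-Object {
    Start-Job -ArgumentList $_ -ScriptBlock { param($drive) Get-ChildItem $drive }
}
# Block until every job finishes, collect any output, then clean up
$jobs | Wait-Job | Receive-Job
$jobs | Remove-Job
```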
