Powershell script using Excel running slow - excel

So i have this script that i coded on my laptop that works just fine, the job is to combine two .csv-files into one .xls-file.
And running the script with two .csv-files containing a couple of thousand rows takes a few seconds max.
But when i try to run it on the server where it should be located, it takes... hours. I haven't done a full run, but writing one line in the .xls-file takes maybe 2-3 seconds.
So what im wondering is what is causing the huge increase in runtime. I'm monitoring the CPU-load while the script is running, and it's at 50-60% load.
The server has loads of Ram, and two CPU-core.
How can i speed this up?
The script looks like this:
$path = "C:\test\*"
$path2 = "C:\test"
$date = Get-Date -Format d
$csvs = Get-ChildItem $path -Include *.csv | Sort-Object LastAccessTime -Descending | Select-Object -First 2
$y = $csvs.Count
Write-Host "Detected the following CSV files: ($y)"
foreach ($csv in $csvs) {
Write-Host " "$csv.Name
}
$outputfilename = "regSCI " + $date
Write-Host Creating: $outputfilename
$excelapp = New-Object -ComObject Excel.Application
$excelapp.sheetsInNewWorkbook = $csvs.Count
$xlsx = $excelapp.Workbooks.Add()
$sheet = 1
$xlleft = -4131
foreach ($csv in $csvs) {
$row = 1
$column = 1
$worksheet = $xlsx.Worksheets.Item($sheet)
$worksheet.Name = $csv.Name
$worksheet.Rows.HorizontalAlignment = $xlleft
$file = (Get-Content $csv)
Write-Host Worksheet created: $worksheet.Name
foreach($line in $file) {
Write-Host Writing Line
$linecontents = $line -split ',(?!\s*\w+")'
foreach($cell in $linecontents) {
Write-Host Writing Cell
$cell1 = $cell.Trim('"')
$worksheet.Cells.Item($row, $column) = $cell1
$column++
}
$column = 1
$row++
$WorkSheet.UsedRange.Columns.Autofit() | Out-Null
}
$sheet++
$headerRange = $worksheet.Range("a1", "q1")
$headerRange.AutoFilter() | Out-Null
}
$output = $path2 + "\" + $outputfilename
Write-Host $output
$xlsx.SaveAs($output)
$excelapp.Quit()

To speed up your existing code, add these just after creating Excel object:
$excelapp.ScreenUpdating = $false
$excelapp.DisplayStatusBar = $false
$excelapp.EnableEvents = $false
$excelapp.Visible = $false
And these just before SaveAs:
$excelapp.ScreenUpdating = $true
$excelapp.DisplayStatusBar = $true
$excelapp.EnableEvents = $true
This causes excel not to render the worksheet in realtime and fire events every time you change the contets. Most probably DisplayStatusBar and ScreenUpdating doesn't matter if you make an application invisible, but I included it just in case.
Also, you're running Autofit() after every line. This certainly doesn't help with performance.

Related

Powershell - Creating Excel Workbook - Getting "Insufficient memory to continue the execution of the program"

I'm trying to create an Excel workbook, then populate the cells with data found from searching many txt files.
I read a file and extract all comments AFTER I find "IDENTIFICATION DIVISION" and BEFORE I find "ENVIRONMENT DIVISION"
I then populate two cells in my excel workbook. cell one if the file and cell two is the comments extracted.
I have 256GB of memory on the work server. less than %5 is being used before Powershell throws the memory error.
Can anyone see where I'm going wrong?
Thanks,
-Ron
$excel = New-Object -ComObject excel.application
$excel.visible = $False
$workbook = $excel.Workbooks.Add()
$diskSpacewksht= $workbook.Worksheets.Item(1)
$diskSpacewksht.Name = "XXXXX_Desc"
$col1=1
$diskSpacewksht.Cells.Item(1,1) = 'Program'
$diskSpacewksht.Cells.Item(1,2) = 'Description'
$CBLFileList = Get-ChildItem -Path 'C:\XXXXX\XXXXX' -Filter '*.cbl' -File -Recurse
$Flowerbox = #()
ForEach($CBLFile in $CBLFileList) {
$treat = $false
Write-Host "Processing ... $CBLFile" -foregroundcolor green
Get-content -Path $CBLFile.FullName |
ForEach-Object {
if ($_ -match 'IDENTIFICATION DIVISION') {
# Write-Host "Match IDENTIFICATION DIVISION" -foregroundcolor green
$treat = $true
}
if ($_ -match 'ENVIRONMENT DIVISION') {
# Write-Host "Match ENVIRONMENT DIVISION" -foregroundcolor green
$col1++
$diskSpacewksht.Cells.Item($col1,1) = $CBLFile.Name
$diskSpacewksht.Cells.Item($col1,2) = [String]$Flowerbox
$Flowerbox = #()
continue
}
if ($treat) {
if ($_ -match '\*(.{62})') {
Foreach-Object {$Flowerbox += $matches[1] + "`r`n"}
$treat = $false
}
}
}
}
$excel.DisplayAlerts = 'False'
$ext=".xlsx"
$path="C:\Desc.txt"
$workbook.SaveAs($path)
$workbook.Close
$excel.DisplayAlerts = 'False'
$excel.Quit()
Not knowing what the contents of the .CBL files could be, I would suggest not to try and do all of this using an Excel COM object, but create a CSV file instead to make things a lot easier.
When finished, you can simply open that csv file in Excel.
# create a List object to collect the 'flowerbox' strings in
$Flowerbox = [System.Collections.Generic.List[string]]::new()
$treat = $false
# get a list of the .cbl files and loop through. Collect all output in variable $result
$CBLFileList = Get-ChildItem -Path 'C:\XXXXX\XXXXX' -Filter '*.cbl' -File -Recurse
$result = foreach ($CBLFile in $CBLFileList) {
Write-Host "Processing ... $($CBLFile.FullName)" -ForegroundColor Green
# using switch -File is an extremely fast way of testing a file line by line.
# instead of '-Regex' you can also do '-WildCard', but then add asterikses around the strings
switch -Regex -File $CBLFile.FullName {
'IDENTIFICATION DIVISION' {
# start collecting Flowerbox lines from here
$treat = $true
}
'ENVIRONMENT DIVISION' {
# stop colecting Flowerbox lines and output what we already have
# output an object with the two properties you need
[PsCustomObject]#{
Program = $CBLFile.Name # or $CBLFile.FullName
Description = $Flowerbox -join [environment]::NewLine
}
$Flowerbox.Clear() # empty the list for the next run
$treat = $false
}
default {
# as I have no idea what these lines may look like, I have to
# assume your regex '\*(.{62})' is correct..
if ($treat -and ($_ -match '\*(.{62})')) {
$Flowerbox.Add($Matches[1])
}
}
}
}
# now you have everything in an array of PSObjects so you can save that as Csv
$result | Export-Csv -Path 'C:\Desc.csv' -UseCulture -NoTypeInformation
Parameter -UseCulture ensures you can double-click the file so it will open correctly in your Excel
You can also create an Excel file from this csv programmatically like:
$excel = New-Object -ComObject Excel.Application
$excel.Visible = $false
$workbook = $excel.Workbooks.Open('C:\Desc.csv')
$worksheet = $workbook.Worksheets.Item(1)
$worksheet.Name = "XXXXX_Desc"
# save as .xlsx
# 51 ==> [Microsoft.Office.Interop.Excel.XlFileFormat]::xlWorkbookDefault
# see: https://learn.microsoft.com/en-us/office/vba/api/excel.xlfileformat
$workbook.SaveAs('C:\Desc.xlsx', 51)
# quit Excel and remove all used COM objects from memory
$excel.Quit()
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($worksheet)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($workbook)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($excel)
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()

How to use powershell to select range and dump that to csv file

Actually, this is a version of question here:
How to use powershell to select and copy columns and rows in which data is present in new workbook.
The goal is to grab certain columns from multiple Excel workbooks and dump everything to one csv file. Columns are always the same.
I'm doing that manually:
$xl = New-Object -ComObject Excel.Application
$xl.Visible = $false
$xl.DisplayAlerts = $false
$counter = 0
$input_folder = "C:\Users\user\Documents\excelfiles"
$output_folder = "C:\Users\user\Documents\csvdump"
Get-ChildItem $input_folder -File |
Foreach-Object {
$counter++
$wb = $xl.Workbooks.Open($_.FullName, 0, 1, 5, "")
try {
$ws = $wb.Worksheets.item('Calls') # => This specific worksheet
$rowMax = ($ws.UsedRange.Rows).count
for ($i=1; $i -le $rowMax-1; $i++) {
$newRow = New-Object -Type PSObject -Property #{
'Type' = $ws.Cells.Item(1+$i,1).text
'Direction' = $ws.Cells.Item(1+$i,2).text
'From' = $ws.Cells.Item(1+$i,3).text
'To' = $ws.Cells.Item(1+$i,4).text
}
$newRow | Export-Csv -Path $("$output_folder\$ESO_Output") -Append -noType -Force
}
}
} catch {
Write-host "No such workbook" -ForegroundColor Red
# Return
}
}
Question:
This works, but is extremely slow because Excel has to select every cell, copy that, then Powershell has to create array and save row by row in output csv file.
Is there a method to select a range in Excel (number of columns times ($ws.UsedRange.Rows).count), cut header line and just append this range (array?) to csv file to make everything much faster?
So that's the final solution
Script is 22 times faster!!! than original solution.
Hope somebody will find that useful :)
PasteSpecial is to filter out empty rows. There is no need to save them into csv
$xl = New-Object -ComObject Excel.Application
$xl.Visible = $false
$xl.DisplayAlerts = $false
$counter = 0
$input_folder = "C:\Users\user\Documents\excelfiles"
$output_folder = "C:\Users\user\Documents\csvdump"
Get-ChildItem $input_folder -File |
Foreach-Object {
$counter++
try {
$new_ws1 = $wb.Worksheets.add()
$ws = $wb.Worksheets.item('Calls')
$rowMax = ($ws.UsedRange.Rows).count
$range = $ws.Range("A1:O$rowMax")
$x = $range.copy()
$y = $new_ws1.Range("A1:O$rowMax").PasteSpecial([System.Type]::Missing,[System.Type]::Missing,$true,$false)
$wb.SaveAs("$($output_folder)\$($_.Basename)",[Microsoft.Office.Interop.Excel.XlFileFormat]::xlCSVWindows)
} catch {
Write-host "No such workbook" -ForegroundColor Red
# Return
}
}
$xl.Quit()
Part above will generate a bunch of csv files.
Part below will read these files in separate loop and combine them together into one.
-exclude is an array of something I want to omit
Remove-Item to remove temporary files
Answer below is based on this post: https://stackoverflow.com/a/27893253/6190661
$getFirstLine = $true
Get-ChildItem "$output_folder\*.csv" -exclude $excluded | foreach {
$filePath = $_
$lines = Get-Content $filePath
$linesToWrite = switch($getFirstLine) {
$true {$lines}
$false {$lines | Select -Skip 1}
}
$getFirstLine = $false
Add-Content "$($output_folder)\MERGED_CSV_FILE.csv" $linesToWrite
Remove-Item $_.FullName
}

Powershell script stops working when ran through task scheduler

I am running the script below as a scheduled task with the user logged on the server. It converts an xls file to csv using the Excel.Application COM object. The conversion works, but eventually breaks and I don't know why.
I have the task run the following command which should in theory allow it to run constantly:
powershell.exe -noexit -file "filename.ps1"
Any thoughts on what to try?
$server = "\\server"
$xls = "\path\XLS\"
$csv = "\path\CSV\"
$folder = $server + $xls
$destination = $server + $csv
$filter = "*.xls" # <-- set this according to your requirements
$fsw = New-Object IO.FileSystemWatcher $folder, $filter -Property #{
IncludeSubdirectories = $true # <-- set this according to your requirements
NotifyFilter = [IO.NotifyFilters]"FileName, LastWrite"
}
Register-ObjectEvent $fsw Created -SourceIdentifier FileCreated -Action {
$path = $Event.SourceEventArgs.FullPath
$name = $Event.SourceEventArgs.Name
$changeType = $Event.SourceEventArgs.ChangeType
$timeStamp = $Event.TimeGenerated
$excelFile = $folder + $name
$E = New-Object -ComObject Excel.Application
$E.Visible = $false
$E.DisplayAlerts = $false
$wb = $E.Workbooks.Open($excelFile)
foreach ($ws in $wb.Worksheets) {
$n = "output_" + $name -replace ".XLS"
$ws.SaveAs($destination + $n + ".csv", 6)
}
$E.Quit()
}
I was doing something similar with word. I couldnt use Quit alone. I think using Quit hides Excel. Try releasing the com object by using:
$E.Quit()
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($E)
Remove-Variable E
i dont know if you are opening the Excel application but if you are you can maybe use
$wb.close($false)
Let me know if it works...
Ref: https://technet.microsoft.com/en-us/library/ff730962.aspx?f=255&MSPPError=-2147217396

Merging CSV Files into a XLSX with Tabs

I currently have 5 Registry CSV files which are created during a PowerShell script:
HKCC
HKCR
HKCU
HKLM
HKU
I need these CSV files to open at the end of the script however would like if all of them were contained within one XLSX file with 5 different headings
Is there a way to combine the files through PowerShell?
I understand how to get the data of the CSV files but don't understand how to merge them or convert. Some of the variables I believe which may be helpful.
$Date = Get-Date -Format "d.MMM.yyyy"
$DIR = $WPFlistview.Selecteditem.Ransomware
$path = "F:\Registry_Export\Results\$DIR\$Date\*"
$csvs = Get-ChildItem $path -Include *.csv
$output = "F:\Registry_Export\Results\$DIR\$Date\Results.Xlsx"
Paths to the CSV files if needed:
F:\Registry_Export\Results\$DIR\$Date\HKCR.CSV
F:\Registry_Export\Results\$DIR\$Date\HKCU.CSV
F:\Registry_Export\Results\$DIR\$Date\HKLM.CSV
F:\Registry_Export\Results\$DIR\$Date\HKU.CSV
F:\Registry_Export\Results\$DIR\$Date\HKCC.CSV
This is what I have tried prior. However, it completly scrambles my data into the wrong lines and cells:
function MergeCSV {
$Date = Get-Date -Format "d.MMM.yyyy"
$DIR = $WPFlistview.Selecteditem.Ransomware
$path = "F:\Registry_Export\Results\$DIR\$Date\*"
$csvs = Get-ChildItem $path -Include *.csv
$y = $csvs.Count
Write-Host "Detected the following CSV files: ($y)"
foreach ($csv in $csvs) {
Write-Host " "$csv.Name
}
$outputfilename = "Final Registry Results"
Write-Host Creating: $outputfilename
$excelapp = New-Object -ComObject Excel.Application
$excelapp.SheetsInNewWorkbook = $csvs.Count
$xlsx = $excelapp.Workbooks.Add()
$sheet = 1
foreach ($csv in $csvs) {
$row = 1
$column = 1
$worksheet = $xlsx.Worksheets.Item($sheet)
$worksheet.Name = $csv.Name
$file = (Get-Content $csv)
foreach ($line in $file) {
$linecontents = $line -split ',(?!\s*\w+")'
foreach ($cell in $linecontents) {
$worksheet.Cells.Item($row,$column) = $cell
$column++
}
$column = 1
$row++
}
$sheet++
}
$output = "F:\Registry_Export\Results\$DIR\$Date\Results.Xlsx"
$xlsx.SaveAs($output)
$excelapp.Quit()
}
How the CSV looks
https://gyazo.com/177c7c3bb21ddf06d0ebacbb7f4d537b
How the XLSX looks
https://gyazo.com/cd5fb48d61f93aac5ec3034d81811094
So, using the Excel.Application ComObject still, what I would suggest is loading each CSV as a CSV, not using Get-Content like you are. Then use the ConvertTo-CSV cmdlet, specifying to use tab as the delimiter, and copy that to the clipboard. Then just paste into Excel, and it will paste in fairly nicely. You may want to adjust column size, but the data will show up just as you would expect it to. I would also use a For loop instead of a ForEach loop, since Excel plays nice with numbers for the tabs (though it is 1 based instead of PowerShell's 0 base). Here's what I would end up with after making those modifications:
function MergeCSV {
$Date = Get-Date -Format "d.MMM.yyyy"
$DIR = $WPFlistview.Selecteditem.Ransomware
$path = "F:\Registry_Export\Results\$DIR\$Date\*"
$csvs = Get-ChildItem $path -Include *.csv
$y = $csvs.Count
Write-Host "Detected the following CSV files: ($y)"
Write-Host " "$csvs.Name"`n"
$outputfilename = "Final Registry Results"
Write-Host Creating: $outputfilename
$excelapp = New-Object -ComObject Excel.Application
$excelapp.SheetsInNewWorkbook = $csvs.Count
$xlsx = $excelapp.Workbooks.Add()
for($i=1;$i -le $y;$i++) {
$worksheet = $xlsx.Worksheets.Item($i)
$worksheet.Name = $csvs[$i-1].Name
$file = (Import-Csv $csvs[$i-1].FullName)
$file | ConvertTo-Csv -Delimiter "`t" -NoTypeInformation | Clip
$worksheet.Cells.Item(1).PasteSpecial()|out-null
}
$output = "F:\Registry_Export\Results\$DIR\$Date\Results.Xlsx"
$xlsx.SaveAs($output)
$excelapp.Quit()
}
You could use ImportExcel by Doug Finke And then replace your Export-CSV in the original script with Export-Excel -WorksheetName
Install-Module ImportExcel
Export-Excel "F:\Registry_Export\Results\$DIR\$Date\Results.xlsx" -worksheetname "HKCR"
Export-Excel "F:\Registry_Export\Results\$DIR\$Date\Results.xlsx" -worksheetname "HKCU"

PowerShell saving excel sheet in unreadable format

I have the below piece of code that checks for Files to Tapes jobs for a database and gives the output in an excel sheet.
$date = Get-Date
$day = $date.Day
$hour = $date.Hour
$Excel = New-Object -ComObject Excel.Application
$Excel.visible = $true
$Excel.DisplayAlerts = $false
$Workbook = $Excel.Workbooks.Add()
$Sheet = $Excel.Worksheets.Item(1)
#Counter variable for rows and columns
$intRow = 1
$intCol = 1
$Sheet.Cells.Item($intRow,1) = "Tasks/Servers"
$Sheet.Cells.Item($intRow,2) = "DateLastRun"
$Sheet.Cells.Item($intRow,3) = "PRX1CSDB01"
$Sheet.Cells.Item($intRow,4) = "PRX1CSDB02"
$Sheet.Cells.Item($intRow,5) = "PRX1CSDB03"
$Sheet.Cells.Item($intRow,6) = "PRX1CSDB11"
$Sheet.Cells.Item($intRow,7) = "PRX1CSDB12"
$Sheet.Cells.Item($intRow,8) = "PRX1CSDB13"
$Sheet.Cells.Item($intRow+1,1) = "File To Tape weekly Full Backup"
$Sheet.UsedRange.Rows.Item(1).Borders.LineStyle = 1
#FTT.txt contains the path for a list of servers
$path = Get-Content D:\Raghav\DB_Integrated\FTT.txt
foreach ($server in $path)
{
If (Test-Path $server)
{
$BckpWeek = gci -path $server | select-object | where {$_.Name -like "*logw*"} | sort LastWriteTime | select -last 1
$Sheet.Cells.Item($intRow+1,$intCol+1) = $BckpWeek.LastWriteTime.ToString('MMddyyyy')
$Sheet.UsedRange.Rows.Item($intRow).Borders.LineStyle = 1
$x = (get-date) - ([datetime]$BckpWeek.LastWriteTime)
if( $x.days -gt 7){$status_week = "Failed"}
else{$status_week = "Successful"}
$Sheet.Cells.Item($intRow+1,$intCol+2) = $status_week
$intCol++
}
else
{
$Sheet.Cells.Item($intRow+1,$intCol+2) = "Path Not Found"
$intCol++
}
}
$Sheet.UsedRange.EntireColumn.AutoFit()
$workBook.SaveAs("C:\Users\Output.xlsx",51)
$excel.Quit()
However, when I try to import the contents of Output.xlsx into a variable say $cc, I get data in an unreadable format.
$cc = Import-Csv "C:\Users\Output.xlsx"
Attached is the image for what I get on exporting output.xlsx into $cc. I tried to put the output in csv format too. But that also doesnt seem to help.Anybody having any idea on this or having faced any similar situation before?
#ZevSpitz - Looking for the OleDbConnection class, I landed up at https://blogs.technet.microsoft.com/pstips/2014/06/02/get-excel-data-without-excel/ . This is what I was looking for. Thank you for pointing me out in the right direction.
#MikeGaruccio - Unfortunately, I didn't find Import-Excel command in Get-Help menu. I am using Powershell 4.0. Anyways, thank you for the suggestion.

Resources