I have a PowerShell script that reads a CSV file and appends its rows to an Excel worksheet.
It runs painfully slowly. From what I have found, this is a limitation of writing to Excel through COM. One suggestion for speeding it up is to write entire ranges instead of going cell by cell. However, I need to format the cells, and that doesn't seem to be possible when writing ranges. Any suggestions on how to optimize the code below would be welcome.
I do not have the option to use a DB.
$csvPath = "Z:\script_test\"
$outputFile = "Z:\script_test\exceltest.xlsx"
foreach($csvFile in Get-ChildItem $csvPath -Filter "STATS*.txt" ){
$csvFilePath = [io.path]::combine($csvPath, $csvFile)
$rawcsvData = Import-Csv -Delimiter ";" -Path $csvFilePath
$Excel = New-Object -ComObject excel.application
$Excel.visible = $false
$workbook = $Excel.workbooks.Open($outputFile)
$ExcelWorkSheet = $Excel.WorkSheets.item("2016")
$ExcelWorkSheet.activate()
$excel.cells.item(1,1) = "PEX"
$excel.cells.item(1,2) = "RUN DATE"
$excel.cells.item(1,3) = "EXECS"
$excel.cells.item(1,4) = "CPU AV."
$excel.cells.item(1,5) = "CPU HI."
$excel.cells.item(1,6) = "CPU TOT."
$excel.cells.item(1,7) = "#VALUE!"
$excel.cells.item(1,8) = "ELAPS AV."
$excel.cells.item(1,9) = "ELAPSE HI."
$excel.cells.item(1,10) = "ELAPSE TOT"
$i = $ExcelWorkSheet.UsedRange.rows.count + 1
foreach($rawcsv in $rawcsvData)
{
$RUNDATE = $rawcsv."RUN DATE ".replace("--1","")
$EXECS = $rawcsv."EXECS ".replace("?","")
$CPUAV = $rawcsv."CPU AV. ".replace("-",":")
$CPUHI = $rawcsv."CPU HI. ".replace("-",":")
$CPUTOT = $rawcsv."CPU TOT. ".replace("-",":")
$ELAPSEAV = $rawcsv."ELAPSE AV.".replace("-",":")
$ELAPSEHI = $rawcsv."ELAPSE HI.".replace("-",":")
$ELPASETOT = $rawcsv."ELPASE TOT".replace("-",":")
Write-Output("working" + $i)
$excel.cells.item($i,1) = $rawcsv."PEX "
$excel.cells.item($i,2) = $RUNDATE
$excel.cells.item($i,2).NumberFormat = "yyyy/mm/dd"
$excel.cells.item($i,3) = $EXECS
$excel.cells.item($i,4) = $CPUAV
$excel.cells.item($i,4).NumberFormat = "hh:mm:ss.00"
$excel.cells.item($i,5) = $CPUHI
$excel.cells.item($i,5).NumberFormat = "hh:mm:ss.00"
$excel.cells.item($i,6) = $CPUTOT
$excel.cells.item($i,6).NumberFormat = "hh:mm:ss.00"
$excel.cells.item($i,7) = "=((HOUR(F"+$i+")*3600)+(MINUTE(F"+$i+")*60)+SECOND(F"+$i+"))*21"
$excel.cells.item($i,8) = $ELAPSEAV
$excel.cells.item($i,8).NumberFormat = "hh:mm:ss.00"
$excel.cells.item($i,9) = $ELAPSEHI
$excel.cells.item($i,9).NumberFormat = "hh:mm:ss.00"
$excel.cells.item($i,10) = $ELPASETOT
$excel.cells.item($i,10).NumberFormat = "hh:mm:ss.00"
$i++
}
$ExcelWorkSheet.UsedRange.RemoveDuplicates()
#$workbook.saveas($outputFile)
$workbook.save()
$Excel.Quit()
Remove-Variable -Name excel
[gc]::collect()
[gc]::WaitForPendingFinalizers()
Move-Item -Path $csvFilePath -Destination "Z:\script_test\used files"
}
The slow part is entirely down to COM performance. Sadly, you won't be able to speed this up enough as long as you keep working through the COM object.
Some time ago I had an Excel-related project and found a great module that uses an external DLL instead. Take a look at it: PSExcel
The best part is that you don't need Excel installed, as you do with the COM object.
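If the script has to stay on COM, the range-based approach mentioned in the question can still be combined with formatting, because NumberFormat can be set on a multi-cell range just as on a single cell. The following is a minimal sketch under that assumption, not a tested drop-in replacement: ConvertTo-CellArray and Write-CellBlock are hypothetical helper names, and the column letters are assumed from the layout in the question.

```powershell
# Hypothetical helper: turn Import-Csv rows into a 2-D array that a COM
# range can take in a single assignment.
function ConvertTo-CellArray {
    param(
        [Parameter(Mandatory)] [object[]] $Rows,
        [Parameter(Mandatory)] [string[]] $Columns
    )
    $cells = [object[,]]::new($Rows.Count, $Columns.Count)
    for ($r = 0; $r -lt $Rows.Count; $r++) {
        for ($c = 0; $c -lt $Columns.Count; $c++) {
            $cells[$r, $c] = $Rows[$r].($Columns[$c])
        }
    }
    # The leading comma keeps PowerShell from unrolling the array on output.
    return ,$cells
}

# Hypothetical helper: one COM call for the values and one per format,
# instead of two calls per cell. $Worksheet is an Excel worksheet COM object.
function Write-CellBlock {
    param($Worksheet, [int] $StartRow, [object[,]] $Cells)
    $lastRow = $StartRow + $Cells.GetLength(0) - 1
    $topLeft = $Worksheet.Cells.Item($StartRow, 1)
    $bottomRight = $Worksheet.Cells.Item($lastRow, $Cells.GetLength(1))
    $Worksheet.Range($topLeft, $bottomRight).Value2 = $Cells
    # Format whole column blocks at once (column letters assumed from the question).
    $Worksheet.Range(('B{0}:B{1}' -f $StartRow, $lastRow)).NumberFormat = 'yyyy/mm/dd'
    $Worksheet.Range(('D{0}:F{1}' -f $StartRow, $lastRow)).NumberFormat = 'hh:mm:ss.00'
}
```

In the question's loop this would mean collecting the cleaned values first, then calling Write-CellBlock -Worksheet $ExcelWorkSheet -StartRow $i -Cells $cells once per file. Creating the Excel.Application object once, outside the foreach over the files, should also save time.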
There is a PowerShell cmdlet you can install called Export-XLSX that works very similarly to the native Export-Csv: https://gallery.technet.microsoft.com/office/Export-XLSX-PowerShell-f2f0c035
The documentation there is pretty good, but here's an example of how you would use it:
# 1. Define path of Export-XLSX.ps1 script:
$ExportXLSX = "C:\YourFilePath\Export-XLSX\Export-XLSX.ps1"
# 2. Call script same as any other function by preceding filename with a period (.$ExportXLSX) and following it with parameters:
Get-ChildItem $env:windir | Select-Object Mode,LastWriteTime,Length,Name | .$ExportXLSX -Path 'c:\temp\PSExcel.xlsx' -WorkSheetName 'Files'
UPDATE: Having compared this option with the full PSExcel module presented in the other answer, I actually prefer the PSExcel module. Speed was pretty much the same in my testing, but the PSExcel module appears to create much smaller files.
For example, exporting the above listing of the Windows directory produces a 53 KB file with Export-XLSX.ps1 on my machine, whereas the PSExcel module produces a 7 KB file. Given its ease of use, I would go with it.
I'm thinking about writing a script that converts my existing CSV file to an XLSX file, so I have been following this post:
https://code.adonline.id.au/csv-to-xlsx-powershell/
It works fine, but I'm wondering how I can format the data as a table and apply a style while converting to XLSX.
I'd really appreciate any help or suggestions.
### Set input and output path
$inputCSV = "C:\AuditLogSearch\Modified Audit-Log-Records.csv"
$outputXLSX = "C:\AuditLogSearch\output1.xlsx"
### Create a new Excel Workbook with one empty sheet
$excel = New-Object -ComObject excel.application
$workbook = $excel.Workbooks.Add(1)
$worksheet = $workbook.worksheets.Item(1)
### Build the QueryTables.Add command
### QueryTables does the same as when clicking "Data » From Text" in Excel
$TxtConnector = ("TEXT;" + $inputCSV)
$Connector = $worksheet.QueryTables.add($TxtConnector,$worksheet.Range("A1"))
$query = $worksheet.QueryTables.item($Connector.name)
### Set the delimiter (, or ;) according to your regional settings
$query.TextFileOtherDelimiter = $Excel.Application.International(5)
### Set the format to delimited and text for every column
### A trick to create an array of 2s is used with the preceding comma
$query.TextFileParseType = 1
$query.TextFileColumnDataTypes = ,2 * $worksheet.Cells.Columns.Count
$query.AdjustColumnWidth = 1
### Execute & delete the import query
$query.Refresh()
$query.Delete()
$Workbook.SaveAs($outputXLSX,51)
$excel.Quit()
Assuming you want to try out the ImportExcel Module.
Install it first: Install-Module ImportExcel -Scope CurrentUser
Then the code would look like this:
$params = @{
AutoSize = $true
TableName = 'exampleTable'
TableStyle = 'Medium11' # => Here you can choose the style you like the most
BoldTopRow = $true
WorksheetName = 'YourWorkSheetName'
PassThru = $true
Path = 'path/to/excel.xlsx' # => Define where to save it here!
}
$xlsx = Import-Csv path/to/csv.csv | Export-Excel @params
$ws = $xlsx.Workbook.Worksheets[$params.Worksheetname]
$ws.View.ShowGridLines = $false # => This will hide the GridLines on your file
Close-ExcelPackage $xlsx
The author has a YouTube channel where he used to upload tutorials, and there is also online documentation if you want to learn more.
I'm trying to set the data label font size on a bar chart generated from PowerShell, but it does not work.
I have read the whole API for Chart.SeriesCollection for VBA and .NET, but it did not help. Is it a bug, or am I missing something? Can anyone help?
https://learn.microsoft.com/de-de/office/vba/api/excel.chart.seriescollection
My attempt (I tried several variations of this):
$chart.SeriesCollection(1).DataLabels.Format.TextFrame2.TextRange2.Font.Size = 18
PowerShell error message: The property 'Size' cannot be found on this object. Verify that the property exists and can be set.
The whole short PowerShell script:
$excel = New-Object -comobject Excel.Application
$excel.Visible = $True
$workbook = $excel.Workbooks.Add()
$sheet = $excel.Worksheets.Item(1)
$sheet.Activate() | Out-NULL
$sheet.Cells.Item(1,1).Value2 = "City"
$sheet.Cells.Item(1,2).Value2 = "Citizens"
$sheet.Cells.Item(2,1) = "Offenbach"
$sheet.Cells.Item(2,2) = 111020
$sheet.Cells.Item(3,1) = "Heusenstamm"
$sheet.Cells.Item(3,2) = 18200
$sheet.Cells.Item(4,1) = "Rembruecken"
$sheet.Cells.Item(4,2) = 1850
$range = "A1:B4"
$chartSelect = $sheet.range($range)
$ch = $sheet.shapes.addChart().chart
$ch.chartType = 51
$ch.ApplyDataLabels(2)
$ch.SeriesCollection(1).DataLabels.Format.TextFrame2.TextRange2.Font.Size = 18
$ch.setSourceData($chartSelect)
Try this:
1..3| %{$ch.SeriesCollection(1).DataLabels($_).Font.Size = 18}
There are a couple of syntaxes that ought to work.
The old one, which you can only see in the VBIDE's Object Browser after selecting 'Show hidden members' (as if it had been deprecated, but it works fine):
ActiveChart.SeriesCollection(2).DataLabels.Font.Size = 18
The new and improved and way more complex one:
ActiveChart.SeriesCollection(2).DataLabels.Format.TextFrame2.TextRange.Font.Size = 18
I'll let you convert from VBA to PowerShell.
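A PowerShell rendering of the first (older) syntax might look like the sketch below. It rests on two assumptions: that DataLabels has to be called with parentheses from PowerShell (without them the expression likely yields the method itself, which would explain the "property 'Size' cannot be found" error), and that the series must already exist, so SetSourceData has to run first. Set-DataLabelFontSize is a hypothetical helper name.

```powershell
# Hypothetical helper: set the font size of every data label in one series.
# $Chart is an Excel Chart COM object, e.g. $sheet.Shapes.AddChart().Chart.
function Set-DataLabelFontSize {
    param(
        [Parameter(Mandatory)] $Chart,
        [int] $SeriesIndex = 1,
        [int] $Size = 18
    )
    $series = $Chart.SeriesCollection($SeriesIndex)
    # DataLabels is a method in the COM object model, so PowerShell needs
    # the parentheses; called without an index it returns the whole collection.
    $series.DataLabels().Font.Size = $Size
}
```

In the script from the question, the call order would then be $ch.SetSourceData($chartSelect), $ch.ApplyDataLabels(2), and finally Set-DataLabelFontSize -Chart $ch.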
I created a PowerShell script that allows me to merge multiple .CSV into one .XLSX file.
It works well on my computer:
$path = "C:\Users\Francesco\Desktop\CSV\Results\*"
$csvs = Get-ChildItem $path -Include *.csv
$y = $csvs.Count
Write-Host "Detected the following CSV files: ($y)"
Write-Host " "$csvs.Name"`n"
$outputfilename = "Final Registry Results"
Write-Host Creating: $outputfilename
$excelapp = New-Object -ComObject Excel.Application
$excelapp.SheetsInNewWorkbook = $csvs.Count
$xlsx = $excelapp.Workbooks.Add()
for ($i=1;$i -le $y;$i++) {
$worksheet = $xlsx.Worksheets.Item($i)
$worksheet.Name = $csvs[$i-1].Name
$file = (Import-Csv $csvs[$i-1].FullName)
$file | ConvertTo-Csv -Delimiter "`t" -NoTypeInformation | clip
$worksheet.Cells.Item(1).PasteSpecial() | Out-Null
}
$output = "C:\Users\Francesco\Desktop\CSV\Results\Results.xlsx"
$xlsx.SaveAs($output)
$excelapp.Quit()
The problem is that I need to run this on several servers and servers are well known for not having Office installed so I cannot use Excel.Application.
Is there a way to merge multiple CSV into one CSV or XLSX without using Excel.Application and saving each CSV into a different sheet?
@AnsgarWiechers is right, ImportExcel is powerful and not difficult to use. However, for your specific case you can use a more limited approach, using OleDb (or ODBC or ADO) to write to an Excel file like a database. Here is some sample code showing how to write to an Excel file using OleDb.
$provider = 'Microsoft.ACE.OLEDB.12.0'
$dataSource = 'C:\users\user\OleDb.xlsb'
$connStr = "Provider=$provider;Data Source=$dataSource;Extended Properties='Excel 12.0;HDR=YES'"
$objConn = [Data.OleDb.OleDbConnection]::new($connStr)
$objConn.Open()
$cmd = $objConn.CreateCommand()
$sheetName = 'Demo'
$cmd.CommandText = "CREATE TABLE $sheetName (Name TEXT,Age NUMBER)"
$cmd.ExecuteNonQuery()
$cmd.CommandText = "INSERT INTO demo (Name,Age) VALUES ('Adam', 20)"
$cmd.ExecuteNonQuery()
$cmd.CommandText = "INSERT INTO demo (Name,Age) VALUES ('Bob',30)"
$cmd.ExecuteNonQuery()
$cmd.Dispose()
$objConn.Close()
$objConn.Dispose()
You didn't say much about the CSV files you'll be processing. If column data varies, to create the table you'll have to get the attribute (column) names from the CSV header (either by reading the first line of the CSV file, or by enumerating the properties of the first item returned by Import-CSV).
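As a rough illustration of the second option (enumerating the properties of the first imported row), the CREATE TABLE statement could be generated as sketched below. New-CreateTableSql is a hypothetical helper name; declaring every column as TEXT and bracket-quoting the names are simplifying assumptions, and the statement has not been run against the ACE provider here.

```powershell
# Hypothetical helper: build a CREATE TABLE statement whose columns match
# the properties of one Import-Csv row. Every column is declared as TEXT.
function New-CreateTableSql {
    param(
        [Parameter(Mandatory)] [string] $SheetName,
        [Parameter(Mandatory)] [psobject] $FirstRow
    )
    $columns = $FirstRow.PSObject.Properties | ForEach-Object {
        '[{0}] TEXT' -f $_.Name
    }
    'CREATE TABLE [{0}] ({1})' -f $SheetName, ($columns -join ', ')
}

# Inline data standing in for the output of Import-Csv.
$firstRow = 'Name,Age', 'Adam,20' | ConvertFrom-Csv | Select-Object -First 1
$sql = New-CreateTableSql -SheetName 'Demo' -FirstRow $firstRow
$sql
```

The resulting string would be assigned to $cmd.CommandText in place of the hard-coded statement above.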
If your CSV files have a large number of lines, writing one line at a time may be slow. In that case using a DataSet and OleDbDataAdapter might improve performance (but I haven't tested). But at that point you might as well use OleDb to read the .csv directly into a DataSet, create an OleDbDataAdapter, set the adapter's InsertCommand property, and finally call the adapter's Update method. I don't have time to write and test all that.
This is not intended as a full solution, just a demo of how to use OleDb to write to an Excel file.
Note: I tested this on a server that didn't have Office or Excel installed. The Office data providers pre-installed on that machine were 32-bit, but I was using 64-bit PowerShell. To get 64-bit drivers I installed the Microsoft Access Database Engine 2016 Redistributable and that's what I used for testing.
Time has passed and I have found a new solution: Install-Module -Name ImportExcel
This way the module takes care of the job like in this script.
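A minimal sketch of what such a script might look like with ImportExcel follows. Get-SafeSheetName and Merge-CsvToWorkbook are hypothetical helper names introduced here; the only module command used is Export-Excel with its -Path and -WorksheetName parameters, and the sheet-name cleanup reflects Excel's limit of 31 characters and its list of forbidden characters.

```powershell
# Hypothetical helper: derive a worksheet name Excel will accept
# (at most 31 characters, none of \ / ? * [ ] :).
function Get-SafeSheetName {
    param([Parameter(Mandatory)] [string] $FileName)
    $name = [IO.Path]::GetFileNameWithoutExtension($FileName)
    $name = $name -replace '[\\/?*\[\]:]', '_'
    if ($name.Length -gt 31) { $name = $name.Substring(0, 31) }
    $name
}

# Hypothetical helper: write each CSV in $Folder to its own sheet of $OutputPath.
# Requires the ImportExcel module (Install-Module -Name ImportExcel), not Excel.
function Merge-CsvToWorkbook {
    param(
        [Parameter(Mandatory)] [string] $Folder,
        [Parameter(Mandatory)] [string] $OutputPath
    )
    foreach ($csv in Get-ChildItem -Path $Folder -Filter *.csv) {
        Import-Csv -Path $csv.FullName |
            Export-Excel -Path $OutputPath -WorksheetName (Get-SafeSheetName $csv.Name)
    }
}
```

With the paths from the question, the call would be Merge-CsvToWorkbook -Folder 'C:\Users\Francesco\Desktop\CSV\Results' -OutputPath 'C:\Users\Francesco\Desktop\CSV\Results\Results.xlsx'.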
I have tried lots of options to paste data copied from another Excel workbook into my new workbook, without success (the range is large, more than 3000 rows).
Here is a sample of my script:
$objExcel = New-Object -ComObject Excel.Application
$objExcel.Visible = $false
$objExcel.displayAlerts = $false
$Src = [Environment]::GetFolderPath('Desktop')+'\New.xlsx'
$Files = [Environment]::GetFolderPath('Desktop')+'\Org.xlsx'
$wb1 = $objExcel.workbooks.open($Files)
$Worksheetwb1 = $wb1.WorkSheets.item('Org')
$Worksheetwb1.activate()
$Range = $Worksheetwb1.Range('A1:I1').EntireColumn
$Range.Copy() | Out-Null
$wb3 = $objExcel.workbooks.open($Src)
$Worksheetwb3 = $wb3.WorkSheets.item('Dest')
$Worksheetwb3.activate()
$Worksheetwb3.Columns.item('A:I').clear()
$Range3 = $Worksheetwb3.Range('A1:I1').EntireColumn
$Worksheetwb3.Paste($Range.Value2)
$wb3.close($true)
$wb1.close($true)
$objExcel.Quit()
You're pasting into the wrong range. Worksheet.Paste() takes two parameters, destination and link; your code passes only a destination, which should be a Range belonging to that worksheet. Therefore, the proper line is:
$Worksheetwb3.Paste($Range3)
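As an alternative to Vesper's solution:
# Paste is a method of the worksheet, so the target range is passed as its destination
$Worksheetwb3.Paste($Worksheetwb3.Range("A1")) | Out-Null
# Paste special (as values)
$Worksheetwb3.Range("A1").PasteSpecial(-4163) | Out-Null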
I found the answer: changing the order of the commands so that the paste comes immediately after the copy solved it for me.
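The ordering problem goes away entirely if the copy and the paste happen in a single call, since Range.Copy accepts an optional destination range. The sketch below wraps that call; Copy-RangeTo is a hypothetical helper name, and it assumes both workbooks are open in the same Excel instance, as in the question.

```powershell
# Hypothetical helper: copy one range straight onto another without a
# separate Paste call. Both arguments are Excel Range COM objects.
function Copy-RangeTo {
    param(
        [Parameter(Mandatory)] $Source,
        [Parameter(Mandatory)] $Destination
    )
    # Range.Copy takes an optional destination range; passing it makes Excel
    # perform the copy and the paste in one step.
    [void] $Source.Copy($Destination)
}
```

In the question's script this would be called as Copy-RangeTo -Source $Range -Destination $Range3, after the destination workbook has been opened and its columns cleared.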
I need to be able to read an existing (password protected) Excel spreadsheet (an .xlsx file) from Powershell - but I don't want to install Excel. Every approach I've found assumes that Excel is installed on the workstation where the script is running.
I've tried the Excel viewer, but it doesn't seem to work; it won't invoke properly. I've looked at other solutions on stackoverflow, but all of them seem to want to update the excel spreadsheet, and I'm hoping I don't have to go that far.
Am I missing something obvious?
See the detailed article from the Scripting Guy here. Instead of automating Excel, you query the workbook through an OLE DB connection (ADO) from your PowerShell script.
Hey, Scripting Guy! How Can I Read from Excel Without Using Excel?
Relevant PowerShell snippet:
$strFileName = "C:\Data\scriptingGuys\Servers.xls"
$strSheetName = 'ServerList$'
$strProvider = "Provider=Microsoft.Jet.OLEDB.4.0"
$strDataSource = "Data Source = $strFileName"
$strExtend = "Extended Properties=Excel 8.0"
$strQuery = "Select * from [$strSheetName]"
$objConn = New-Object System.Data.OleDb.OleDbConnection("$strProvider;$strDataSource;$strExtend")
$sqlCommand = New-Object System.Data.OleDb.OleDbCommand($strQuery)
$sqlCommand.Connection = $objConn
$objConn.open()
$DataReader = $sqlCommand.ExecuteReader()
While($DataReader.read())
{
$ComputerName = $DataReader[0].Tostring()
"Querying $computerName ..."
Get-WmiObject -Class Win32_Bios -computername $ComputerName
}
$dataReader.close()
$objConn.close()
That said, you have stated that your Excel file is password protected.
According to this Microsoft Support article, you cannot open password protected Excel files using OLEDB Connections.
From the Article:
On the Connection tab, browse to your workbook file. Ignore the "User
ID" and "Password" entries, because these do not apply to an Excel
connection. (You cannot open a password-protected Excel file as a data
source. There is more information on this topic later in this
article.)
If you don't have Excel installed, EPPlus is the best solution I know of to access Excel files from PowerShell. Refer to my answer here to set up EPPlus for PowerShell.
The following code creates a password-protected Excel file containing the output of Get-Process and then reads the process information back from the password-protected file:
# Load EPPlus
$DLLPath = "C:\Windows\System32\WindowsPowerShell\v1.0\Modules\EPPlus\EPPlus.dll"
[Reflection.Assembly]::LoadFile($DLLPath) | Out-Null
$FileName = "$HOME\Downloads\Processes.xlsx"
$Passwort = "Excel"
# Create Excel file with password
$ExcelPackage = New-Object OfficeOpenXml.ExcelPackage
$Worksheet = $ExcelPackage.Workbook.Worksheets.Add("FromCSV")
$ProcessesString = Get-Process | ConvertTo-Csv -NoTypeInformation | Out-String
$Format = New-object -TypeName OfficeOpenXml.ExcelTextFormat -Property @{TextQualifier = '"'}
$null=$Worksheet.Cells.LoadFromText($ProcessesString,$Format)
$ExcelPackage.SaveAs($FileName,$Passwort)
# Open Excel file with password
$ExcelPackage = New-Object OfficeOpenXml.ExcelPackage -ArgumentList $FileName,$Passwort
# Select First Worksheet
$Worksheet = $ExcelPackage.Workbook.Worksheets[1]
# Get process data from cells, one iteration per used row (header row included)
$Processes = 0..($Worksheet.Dimension.Rows - 1) | % {
# Get all Cells in a row
$Row = $Worksheet.Cells[($Worksheet.Dimension.Start.Row+$_),$Worksheet.Dimension.Start.Column,($Worksheet.Dimension.Start.Row+$_),$Worksheet.Dimension.End.Column]
# Join values of all Cells in a row to a comma separated string
($Row | select -ExpandProperty Value) -join ','
} | ConvertFrom-Csv
Refer to my answer here for more options to protect Excel files.