Is it possible to speed up an Excel query within PowerShell?

I am currently using the following query to add a specific row (with 10 columns) from an Excel spreadsheet (~1500 rows and hosted on SharePoint) to an array in PowerShell.
$connection = New-Object System.Data.OleDb.OleDbConnection
$connectstring = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=$Source;Extended Properties='Excel 12.0 Xml;HDR=YES'"
$connection.ConnectionString = $connectstring
$connection.open()
$cmdObject = New-Object System.Data.OleDb.OleDbCommand
$query = "Select * from [Sheet1$]
WHERE [Sheet1$].[Test] = '$Test'"
$cmdObject.CommandText = $query
$cmdObject.CommandType = "Text"
$cmdObject.Connection = $connection
$oReader = $cmdObject.ExecuteReader()
[void]$oReader.Read()
$Global:oData = New-Object PSObject
$oData | Add-Member NoteProperty List1 $oReader[0]
...
$oData | Add-Member NoteProperty List10 $oReader[9]
$oReader.Close()
$cmdObject.Dispose()
$connection.Close()
$connection.Dispose()
This works absolutely fine, but it is often quite slow. Is there any way to speed up the query? I can't load the entire Excel sheet into an array, because the data changes throughout the day and is queried regularly.
I've found other questions, such as "Speed up reading an Excel File in Powershell", but they don't seem relevant to this particular issue.
Appreciate any help.
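One low-risk tweak, offered as a sketch rather than a measured fix: the ten piped Add-Member calls can be replaced by building the object in one step from the reader's own column names. This only trims per-row overhead; if the delay comes from fetching the workbook from SharePoint, it will not help much. ConvertTo-RowObject is a hypothetical helper name, the property names become the sheet's column headers instead of List1..List10, and the in-memory DataTable below is a stand-in for the real sheet so the example runs without the ACE provider.

```powershell
# Hypothetical helper: turn the reader's current row into one object.
function ConvertTo-RowObject {
    param(
        [Parameter(Mandatory)] [System.Data.IDataRecord] $Record
    )
    $props = [ordered]@{}
    for ($i = 0; $i -lt $Record.FieldCount; $i++) {
        # Use the column header as the property name.
        $props[$Record.GetName($i)] = $Record.GetValue($i)
    }
    [pscustomobject]$props
}

# Stand-in data (assumption): a two-column table instead of the Excel sheet.
$table = New-Object System.Data.DataTable
[void]$table.Columns.Add('Test', [string])
[void]$table.Columns.Add('Owner', [string])
[void]$table.Rows.Add('T-001', 'alice')

$reader = $table.CreateDataReader()
$row = if ($reader.Read()) { ConvertTo-RowObject -Record $reader }
$reader.Close()

$row
```

With the real query, the $oReader returned by ExecuteReader() can be passed as -Record once Read() has returned $true.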

Related

Powershell - Create a Pivot Table in Excel

I have managed to open and adjust my file so far like this:
$Fichier = "XXXX\McAfeeWin10.csv"
$objExcel = New-Object -ComObject Excel.Application
$WorkBook = $objExcel.Workbooks.Open($Fichier)
$WorkSheet = $WorkBook.worksheets.item(1)
$objExcel.Visible = $true
$Range = $worksheet.UsedRange.Cells
$range.NumberFormat = "#"
$WorkSheet.Columns("A:C").AutoFit()
That file has 3 columns and looks like this:
Now all I want to do is create a pivot table with settings like this:
Column = VerNoyau
Row = DATVer
Values = Poste
How can I do it?
I have found pivot table examples but they are very complex and involve creating the whole file. I'm working on an already existing file so I assume it would be simpler.

Faster way of writing out to excel in powershell

I have a PowerShell script that reads in a CSV and then appends to an Excel worksheet.
It runs painfully slowly. I have searched, and this seems to be a limitation of using COM to write to Excel. One suggestion I found is to write out entire ranges instead of going cell by cell. However, I need to format the cells, and that doesn't seem to be possible when writing out ranges. Any suggestions on how to optimize the code below would be welcome.
I do not have the option to use a DB.
$csvPath = "Z:\script_test\"
$outputFile = "Z:\script_test\exceltest.xlsx"
foreach($csvFile in Get-ChildItem $csvPath -Filter "STATS*.txt" ){
$csvFilePath = [io.path]::combine($csvPath, $csvFile)
$rawcsvData = Import-Csv -Delimiter ";" -Path $csvFilePath
$Excel = New-Object -ComObject excel.application
$Excel.visible = $false
$workbook = $Excel.workbooks.Open($outputFile)
$ExcelWorkSheet = $Excel.WorkSheets.item("2016")
$ExcelWorkSheet.activate()
$excel.cells.item(1,1) = "PEX"
$excel.cells.item(1,2) = "RUN DATE"
$excel.cells.item(1,3) = "EXECS"
$excel.cells.item(1,4) = "CPU AV."
$excel.cells.item(1,5) = "CPU HI."
$excel.cells.item(1,6) = "CPU TOT."
$excel.cells.item(1,7) = "#VALUE!"
$excel.cells.item(1,8) = "ELAPS AV."
$excel.cells.item(1,9) = "ELAPSE HI."
$excel.cells.item(1,10) = "ELAPSE TOT"
$i = $ExcelWorkSheet.UsedRange.rows.count + 1
foreach($rawcsv in $rawcsvData)
{
$RUNDATE = $rawcsv."RUN DATE ".replace("--1","")
$EXECS = $rawcsv."EXECS ".replace("?","")
$CPUAV = $rawcsv."CPU AV. ".replace("-",":")
$CPUHI = $rawcsv."CPU HI. ".replace("-",":")
$CPUTOT = $rawcsv."CPU TOT. ".replace("-",":")
$ELAPSEAV = $rawcsv."ELAPSE AV.".replace("-",":")
$ELAPSEHI = $rawcsv."ELAPSE HI.".replace("-",":")
$ELPASETOT = $rawcsv."ELPASE TOT".replace("-",":")
Write-Output("working" + $i)
$excel.cells.item($i,1) = $rawcsv."PEX "
$excel.cells.item($i,2) = $RUNDATE
$excel.cells.item($i,2).NumberFormat = "yyyy/mm/dd"
$excel.cells.item($i,3) = $EXECS
$excel.cells.item($i,4) = $CPUAV
$excel.cells.item($i,4).NumberFormat = "hh:mm:ss.00"
$excel.cells.item($i,5) = $CPUHI
$excel.cells.item($i,5).NumberFormat = "hh:mm:ss.00"
$excel.cells.item($i,6) = $CPUTOT
$excel.cells.item($i,6).NumberFormat = "hh:mm:ss.00"
$excel.cells.item($i,7) = "=((HOUR(F"+$i+")*3600)+(MINUTE(F"+$i+")*60)+SECOND(F"+$i+"))*21"
$excel.cells.item($i,8) = $ELAPSEAV
$excel.cells.item($i,8).NumberFormat = "hh:mm:ss.00"
$excel.cells.item($i,9) = $ELAPSEHI
$excel.cells.item($i,9).NumberFormat = "hh:mm:ss.00"
$excel.cells.item($i,10) = $ELPASETOT
$excel.cells.item($i,10).NumberFormat = "hh:mm:ss.00"
$i++
}
$ExcelWorkSheet.UsedRange.RemoveDuplicates()
#$workbook.saveas($outputFile)
$workbook.save()
$Excel.Quit()
Remove-Variable -Name excel
[gc]::collect()
[gc]::WaitForPendingFinalizers()
Move-Item -Path $csvFilePath -Destination "Z:\script_test\used files"
}
The slow part is COM object performance. Sadly, you won't be able to speed this up enough as long as you keep working with the COM object.
Some time ago I worked on an Excel-related project and found a great module that uses an external DLL; take a look at it: PSExcel
The best part is that you don't need Excel installed, as you do with the COM object.
There is a PowerShell cmdlet you can install called Export-XLSX that works very similarly to the native Export-CSV: https://gallery.technet.microsoft.com/office/Export-XLSX-PowerShell-f2f0c035
The documentation there is pretty good, but here's an example of how you would use it:
# 1. Define path of Export-XLSX.ps1 script:
$ExportXLSX = "C:\YourFilePath\Export-XLSX\Export-XLSX.ps1"
# 2. Call script same as any other function by preceding filename with a period (.$ExportXLSX) and following it with parameters:
Get-ChildItem $env:windir | Select-Object Mode,LastWriteTime,Length,Name | .$ExportXLSX -Path 'c:\temp\PSExcel.xlsx' -WorkSheetName 'Files'
UPDATE: Having compared this option with the full PSExcel module presented in the other answer, I actually prefer the PSExcel module. Speed-wise, the two performed about the same in my testing, but the PSExcel module appears to create much smaller files.
For example, exporting the above listing of the Windows directory produces a 53KB file with Export-XLSX.ps1 on my machine, whereas the PSExcel module produces a 7KB file. Given its ease of use, I would go with it.
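For cases where staying with COM is unavoidable, the range-writing idea mentioned in the question can be combined with formatting, because NumberFormat can be set on a whole range as well as on a single cell. The sketch below is an outline under assumptions, not a tested replacement for the script in the question: ConvertTo-CellGrid and Write-CellGrid are hypothetical helper names, the sample rows are made up, and Write-CellGrid needs a live Excel worksheet object, so it is only defined here and never called.

```powershell
# Hypothetical helper: copy objects into a 2-D array, one row per object.
function ConvertTo-CellGrid {
    param(
        [Parameter(Mandatory)] [object[]] $Rows,
        [Parameter(Mandatory)] [string[]] $Columns
    )
    $rowCount = $Rows.Count
    $colCount = $Columns.Count
    $grid = New-Object 'object[,]' $rowCount, $colCount
    for ($r = 0; $r -lt $rowCount; $r++) {
        for ($c = 0; $c -lt $colCount; $c++) {
            $grid[$r, $c] = $Rows[$r].($Columns[$c])
        }
    }
    # The leading comma stops PowerShell from flattening the array on output.
    , $grid
}

# Hypothetical helper: needs an Excel worksheet COM object (assumption).
function Write-CellGrid {
    param($Worksheet, $Grid, [int] $StartRow, [int] $TimeColumn)
    $lastRow = $StartRow + $Grid.GetLength(0) - 1
    $lastCol = $Grid.GetLength(1)
    $topLeft = $Worksheet.Cells.Item($StartRow, 1)
    $bottomRight = $Worksheet.Cells.Item($lastRow, $lastCol)
    # One COM call writes every value in the block.
    $Worksheet.Range($topLeft, $bottomRight).Value2 = $Grid
    # One COM call formats the whole column slice instead of each cell.
    $colTop = $Worksheet.Cells.Item($StartRow, $TimeColumn)
    $colBottom = $Worksheet.Cells.Item($lastRow, $TimeColumn)
    $Worksheet.Range($colTop, $colBottom).NumberFormat = 'hh:mm:ss.00'
}

# Made-up sample rows standing in for the imported CSV data.
$sample = @(
    [pscustomobject]@{ PEX = 'JOB1'; 'CPU AV.' = '00:00:01.50' }
    [pscustomobject]@{ PEX = 'JOB2'; 'CPU AV.' = '00:00:02.25' }
)
$grid = ConvertTo-CellGrid -Rows $sample -Columns 'PEX', 'CPU AV.'
```

The point of the design is to cut the number of COM round trips from several per cell to a handful per file; whether Excel interprets the written strings as times the same way as the cell-by-cell version would need checking against real data.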

How to convert .xls to .csv using Powershell without Excel installed

Is there a way to convert .xls to .csv without Excel being installed using Powershell?
I don't have access to Excel on a particular machine so I get an error when I try:
New-Object -ComObject excel.application
New-Object : Retrieving the COM class factory for component with CLSID
{00000000-0000-0000-0000-000000000000} failed due to the following
error: 80040154 Class not registered (Exception from HRESULT:
0x80040154 (REGDB_E_CLASSNOTREG)).
Foreword
Depending on what you already have installed on your system you might need the Microsoft Access Database Engine 2010 Redistributable for this solution to work. That will give you access to the provider: "Microsoft.ACE.OLEDB.12.0"
Disclaimer: I'm not super impressed with the result, and someone with more background could make this answer better, but here it goes.
Code
$strFileName = "C:\temp\Book1.xls"
$strSheetName = 'Sheet1$'
$strProvider = "Provider=Microsoft.ACE.OLEDB.12.0"
$strDataSource = "Data Source = $strFileName"
$strExtend = "Extended Properties='Excel 8.0;HDR=Yes;IMEX=1';"
$strQuery = "Select * from [$strSheetName]"
$objConn = New-Object System.Data.OleDb.OleDbConnection("$strProvider;$strDataSource;$strExtend")
$sqlCommand = New-Object System.Data.OleDb.OleDbCommand($strQuery)
$sqlCommand.Connection = $objConn
$objConn.open()
$da = New-Object system.Data.OleDb.OleDbDataAdapter($sqlCommand)
$dt = New-Object system.Data.datatable
[void]$da.fill($dt)
$objConn.close()
$dt
Create an OLE DB connection to the Excel file $strFileName. You need to know your sheet name and populate $strSheetName, which helps build $strQuery. We then use several objects to create a connection and extract the data from the sheet as a System.Data.DataTable. In my test file, with one populated sheet, I had two columns of data. After running the code the output of $dt is:
letter number
------ ------
a 2
d 34
b 0
e 4
You could then take that table and pipe it to Export-Csv:
$dt | Export-Csv c:\temp\data.csv -NoTypeInformation
This was built based on information gathered from:
Scripting Guy
PowerShell Code Repository
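A caveat on the Export-Csv step, offered as a hedged sketch: as far as I know, piping DataRow objects straight into Export-Csv can add bookkeeping columns such as RowError and RowState next to the sheet's own columns. Selecting the column names explicitly avoids that. The table below is a hand-built stand-in for the $dt produced by the answer's code, using its sample values, and ConvertTo-Csv is used so the result stays in memory.

```powershell
# Stand-in for the DataTable filled from the spreadsheet (assumption).
$dt = New-Object System.Data.DataTable
[void]$dt.Columns.Add('letter', [string])
[void]$dt.Columns.Add('number', [int])
[void]$dt.Rows.Add('a', 2)
[void]$dt.Rows.Add('d', 34)
[void]$dt.Rows.Add('b', 0)
[void]$dt.Rows.Add('e', 4)

# Collect the sheet's own column names from the table schema.
$columnNames = $dt.Columns | ForEach-Object { $_.ColumnName }

# Keep only those columns, then convert the rows to CSV text.
$csvLines = $dt |
    Select-Object -Property $columnNames |
    ConvertTo-Csv -NoTypeInformation

$csvLines
```

Swapping ConvertTo-Csv for Export-Csv c:\temp\data.csv -NoTypeInformation writes the same rows to disk.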

powershell excel access without installing Excel

I need to be able to read an existing (password-protected) Excel spreadsheet (an .xlsx file) from PowerShell, but I don't want to install Excel. Every approach I've found assumes that Excel is installed on the workstation where the script is running.
I've tried the Excel viewer, but it doesn't seem to work; it won't invoke properly. I've looked at other solutions on Stack Overflow, but all of them seem to want to update the Excel spreadsheet, and I'm hoping I don't have to go that far.
Am I missing something obvious?
See the detailed article from Scripting Guy here. You have to use ADO in your PowerShell script; the snippet below does this through the .NET System.Data.OleDb classes.
Hey, Scripting Guy! How Can I Read from Excel Without Using Excel?
Relevant Powershell Snippet:
$strFileName = "C:\Data\scriptingGuys\Servers.xls"
$strSheetName = 'ServerList$'
$strProvider = "Provider=Microsoft.Jet.OLEDB.4.0"
$strDataSource = "Data Source = $strFileName"
$strExtend = "Extended Properties=Excel 8.0"
$strQuery = "Select * from [$strSheetName]"
$objConn = New-Object System.Data.OleDb.OleDbConnection("$strProvider;$strDataSource;$strExtend")
$sqlCommand = New-Object System.Data.OleDb.OleDbCommand($strQuery)
$sqlCommand.Connection = $objConn
$objConn.open()
$DataReader = $sqlCommand.ExecuteReader()
While($DataReader.read())
{
$ComputerName = $DataReader[0].Tostring()
"Querying $computerName ..."
Get-WmiObject -Class Win32_Bios -computername $ComputerName
}
$dataReader.close()
$objConn.close()
That said, you have stated that your Excel file is password protected.
According to this Microsoft Support article, you cannot open password protected Excel files using OLEDB Connections.
From the Article:
On the Connection tab, browse to your workbook file. Ignore the "User
ID" and "Password" entries, because these do not apply to an Excel
connection. (You cannot open a password-protected Excel file as a data
source. There is more information on this topic later in this
article.)
If you don't have Excel installed, EPPlus is the best solution I know of to access Excel files from PowerShell. Refer to my answer here to set up EPPlus for PowerShell.
The following code creates a password-protected Excel file containing the output of Get-Process and then reads the process information back from the password-protected file:
# Load EPPlus
$DLLPath = "C:\Windows\System32\WindowsPowerShell\v1.0\Modules\EPPlus\EPPlus.dll"
[Reflection.Assembly]::LoadFile($DLLPath) | Out-Null
$FileName = "$HOME\Downloads\Processes.xlsx"
$Passwort = "Excel"
# Create Excel file with password
$ExcelPackage = New-Object OfficeOpenXml.ExcelPackage
$Worksheet = $ExcelPackage.Workbook.Worksheets.Add("FromCSV")
$ProcessesString = Get-Process | ConvertTo-Csv -NoTypeInformation | Out-String
$Format = New-object -TypeName OfficeOpenXml.ExcelTextFormat -Property @{TextQualifier = '"'}
$null=$Worksheet.Cells.LoadFromText($ProcessesString,$Format)
$ExcelPackage.SaveAs($FileName,$Passwort)
# Open Excel file with password
$ExcelPackage = New-Object OfficeOpenXml.ExcelPackage -ArgumentList $FileName,$Passwort
# Select First Worksheet
$Worksheet = $ExcelPackage.Workbook.Worksheets[1]
# Get Process data from Cells
# Walk the row offsets (0-based), one per populated row
$Processes = 0..($Worksheet.Dimension.Rows - 1) | % {
# Get all Cells in a row
$Row = $Worksheet.Cells[($Worksheet.Dimension.Start.Row+$_),$Worksheet.Dimension.Start.Column,($Worksheet.Dimension.Start.Row+$_),$Worksheet.Dimension.End.Column]
# Join values of all Cells in a row to a comma separated string
($Row | select -ExpandProperty Value) -join ','
} | ConvertFrom-Csv
Refer to my answer here for more options to protect Excel files.

Updating a Sharepoint List from Powershell

I can modify a new datarow in PowerShell, but it does not update the SharePoint list on the site itself.
Here is a bit of my code.
Here I fill my dataset with table info:
$connString = 'Provider=Microsoft.ACE.OLEDB.12.0;WSS;IMEX=2;RetrieveIds=Yes;DATABASE=https://sharepoint/;LIST={6d552622-3333-4444-9999-234d32d32d3};'
$spConn = new-object System.Data.OleDb.OleDbConnection($connString)
$spConn.open()
$qry="select * from myList"
$cmd = new-object System.Data.OleDb.OleDbCommand($qry,$spConn)
$da = new-object System.Data.OleDb.OleDbDataAdapter($cmd)
$dataSet = new-object System.Data.DataSet
$sp = $dataSet.Tables.Add("Table")
$da.fill($sp)
Here I add a new datarow:
$row = $sp.NewRow()
$sp.Rows.Add($row)
$row["Title"] = "Foo"
And here I try to update the SharePoint list:
$da.Update($sp)
It does not let me update; any help or guidance would be great.
Thanks
You are adding a new row which is an insert. Inserts are not supported by the OleDB provider against a SharePoint list. You can select or update the value of an existing row, but not create a new row.

Resources