PowerShell saving excel sheet in unreadable format - excel

I have the below piece of code that checks for Files to Tapes jobs for a database and gives the output in an excel sheet.
$date = Get-Date
$day = $date.Day
$hour = $date.Hour
$Excel = New-Object -ComObject Excel.Application
$Excel.visible = $true
$Excel.DisplayAlerts = $false
$Workbook = $Excel.Workbooks.Add()
$Sheet = $Excel.Worksheets.Item(1)
#Counter variable for rows and columns
$intRow = 1
$intCol = 1
$Sheet.Cells.Item($intRow,1) = "Tasks/Servers"
$Sheet.Cells.Item($intRow,2) = "DateLastRun"
$Sheet.Cells.Item($intRow,3) = "PRX1CSDB01"
$Sheet.Cells.Item($intRow,4) = "PRX1CSDB02"
$Sheet.Cells.Item($intRow,5) = "PRX1CSDB03"
$Sheet.Cells.Item($intRow,6) = "PRX1CSDB11"
$Sheet.Cells.Item($intRow,7) = "PRX1CSDB12"
$Sheet.Cells.Item($intRow,8) = "PRX1CSDB13"
$Sheet.Cells.Item($intRow+1,1) = "File To Tape weekly Full Backup"
$Sheet.UsedRange.Rows.Item(1).Borders.LineStyle = 1
#FTT.txt contains the path for a list of servers
$path = Get-Content D:\Raghav\DB_Integrated\FTT.txt
foreach ($server in $path)
{
If (Test-Path $server)
{
$BckpWeek = gci -path $server | select-object | where {$_.Name -like "*logw*"} | sort LastWriteTime | select -last 1
$Sheet.Cells.Item($intRow+1,$intCol+1) = $BckpWeek.LastWriteTime.ToString('MMddyyyy')
$Sheet.UsedRange.Rows.Item($intRow).Borders.LineStyle = 1
$x = (get-date) - ([datetime]$BckpWeek.LastWriteTime)
if( $x.days -gt 7){$status_week = "Failed"}
else{$status_week = "Successful"}
$Sheet.Cells.Item($intRow+1,$intCol+2) = $status_week
$intCol++
}
else
{
$Sheet.Cells.Item($intRow+1,$intCol+2) = "Path Not Found"
$intCol++
}
}
$Sheet.UsedRange.EntireColumn.AutoFit()
$workBook.SaveAs("C:\Users\Output.xlsx",51)
$excel.Quit()
However, when I try to import the contents of Output.xlsx into a variable say $cc, I get data in an unreadable format.
$cc = Import-Csv "C:\Users\Output.xlsx"
The attached image shows what I get after importing Output.xlsx into $cc. I also tried saving the output in CSV format, but that doesn't seem to help either. Does anybody have an idea about this, or has anyone faced a similar situation before?

@ZevSpitz - Looking for the OleDbConnection class, I landed at https://blogs.technet.microsoft.com/pstips/2014/06/02/get-excel-data-without-excel/ . This is what I was looking for. Thank you for pointing me in the right direction.
@MikeGaruccio - Unfortunately, I didn't find the Import-Excel command in the Get-Help menu. I am using PowerShell 4.0. Anyway, thank you for the suggestion.
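
For context on the symptom in the question: a .xlsx file is a ZIP archive of XML parts rather than delimited text, so Import-Csv ends up parsing compressed bytes. A minimal sketch of a sanity check, assuming a hypothetical helper name Test-LooksLikeXlsx (not from the original post), reads the first two bytes and looks for the ZIP signature "PK":

```powershell
# Hypothetical helper: returns $true when the file starts with the ZIP
# signature "PK" (0x50 0x4B), which every .xlsx file does.
function Test-LooksLikeXlsx {
    param([Parameter(Mandatory = $true)][string]$Path)

    # Resolve to a full path because .NET does not follow the PowerShell location.
    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath
    $stream = [System.IO.File]::OpenRead($fullPath)
    try {
        $buffer = New-Object byte[] 2
        $read = $stream.Read($buffer, 0, 2)
        return ($read -eq 2 -and $buffer[0] -eq 0x50 -and $buffer[1] -eq 0x4B)
    }
    finally {
        $stream.Dispose()
    }
}
```

If the check returns $true, the file needs an Excel-aware reader (COM, OleDb, or a module) instead of Import-Csv.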

Related

How to use powershell to select range and dump that to csv file

Actually, this is a version of question here:
How to use powershell to select and copy columns and rows in which data is present in new workbook.
The goal is to grab certain columns from multiple Excel workbooks and dump everything to one csv file. Columns are always the same.
I'm doing that manually:
$xl = New-Object -ComObject Excel.Application
$xl.Visible = $false
$xl.DisplayAlerts = $false
$counter = 0
$input_folder = "C:\Users\user\Documents\excelfiles"
$output_folder = "C:\Users\user\Documents\csvdump"
Get-ChildItem $input_folder -File |
Foreach-Object {
$counter++
$wb = $xl.Workbooks.Open($_.FullName, 0, 1, 5, "")
try {
$ws = $wb.Worksheets.item('Calls') # => This specific worksheet
$rowMax = ($ws.UsedRange.Rows).count
for ($i=1; $i -le $rowMax-1; $i++) {
$newRow = New-Object -Type PSObject -Property @{
'Type' = $ws.Cells.Item(1+$i,1).text
'Direction' = $ws.Cells.Item(1+$i,2).text
'From' = $ws.Cells.Item(1+$i,3).text
'To' = $ws.Cells.Item(1+$i,4).text
}
$newRow | Export-Csv -Path $("$output_folder\$ESO_Output") -Append -noType -Force
}
} catch {
Write-host "No such workbook" -ForegroundColor Red
# Return
}
}
Question:
This works, but it is extremely slow because Excel has to select every cell and copy it, and then PowerShell has to create an array and save it row by row to the output csv file.
Is there a way to select a range in Excel (number of columns times ($ws.UsedRange.Rows).count), cut the header line, and just append this range (array?) to the csv file to make everything much faster?
So here is the final solution.
The script is 22 times faster than the original solution!
Hope somebody will find it useful :)
PasteSpecial is used to filter out empty rows; there is no need to save them into the csv.
$xl = New-Object -ComObject Excel.Application
$xl.Visible = $false
$xl.DisplayAlerts = $false
$counter = 0
$input_folder = "C:\Users\user\Documents\excelfiles"
$output_folder = "C:\Users\user\Documents\csvdump"
Get-ChildItem $input_folder -File |
Foreach-Object {
$counter++
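# Open the source workbook read-only, as in the question's script
$wb = $xl.Workbooks.Open($_.FullName, 0, 1, 5, "")
try {
$new_ws1 = $wb.Worksheets.add()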
$ws = $wb.Worksheets.item('Calls')
$rowMax = ($ws.UsedRange.Rows).count
$range = $ws.Range("A1:O$rowMax")
$x = $range.copy()
$y = $new_ws1.Range("A1:O$rowMax").PasteSpecial([System.Type]::Missing,[System.Type]::Missing,$true,$false)
$wb.SaveAs("$($output_folder)\$($_.Basename)",[Microsoft.Office.Interop.Excel.XlFileFormat]::xlCSVWindows)
} catch {
Write-host "No such workbook" -ForegroundColor Red
# Return
}
}
$xl.Quit()
Part above will generate a bunch of csv files.
Part below will read these files in separate loop and combine them together into one.
-exclude is an array of something I want to omit
Remove-Item to remove temporary files
Answer below is based on this post: https://stackoverflow.com/a/27893253/6190661
$getFirstLine = $true
Get-ChildItem "$output_folder\*.csv" -exclude $excluded | foreach {
$filePath = $_
$lines = Get-Content $filePath
$linesToWrite = switch($getFirstLine) {
$true {$lines}
$false {$lines | Select -Skip 1}
}
$getFirstLine = $false
Add-Content "$($output_folder)\MERGED_CSV_FILE.csv" $linesToWrite
Remove-Item $_.FullName
}
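
As an alternative to the line-based merge above, the header handling can be left to Import-Csv and Export-Csv. This is only a sketch under assumptions: Merge-CsvFolder is a hypothetical name, all files are assumed to share the same columns, and the output file is assumed to live outside the input folder so it is not picked up as input. Unlike the script above, it does not delete the temporary files.

```powershell
# Hypothetical helper: merges every *.csv in a folder into one file,
# writing the header once. Assumes identical columns in all files.
function Merge-CsvFolder {
    param(
        [Parameter(Mandatory = $true)][string]$InputFolder,
        [Parameter(Mandatory = $true)][string]$OutputFile
    )

    # Collect the parsed rows of all files, in file-name order.
    $rows = @(Get-ChildItem -Path $InputFolder -Filter '*.csv' |
        Sort-Object Name |
        ForEach-Object { Import-Csv -Path $_.FullName })

    # Only write when there is something to write.
    if ($rows.Count -gt 0) {
        $rows | Export-Csv -Path $OutputFile -NoTypeInformation
    }
}
```

Because the rows are re-parsed, quoting in the merged file may differ from the source files even though the values are the same.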

Merge content of multiple Excel files into one using PowerShell

I have multiple Excel files with different names in path.
e.g. C:\Users\XXXX\Downloads\report
Each file has a fixed number of columns.
e.g. Date | Downtime | Response
I want to create a new Excel file that merges all of the Excel data. A new Client name column should be added, in which I want to enter the file name. The data from each Excel file should then be appended below, one file after another.
e.g. Client name | Date | Downtime | Response
The code below can append all the Excel data, but I still need to add the Client name column.
$path = "C:\Users\XXXX\Downloads\report"
#Launch Excel, and make it do as its told (supress confirmations)
$Excel = New-Object -ComObject Excel.Application
$Excel.Visible = $True
$Excel.DisplayAlerts = $False
$Files = Get-ChildItem -Path $path
#Open up a new workbook
$Dest = $Excel.Workbooks.Add()
#Loop through files, opening each, selecting the Used range, and only grabbing the first 5 columns of it. Then find next available row on the destination worksheet and paste the data
ForEach($File in $Files)
{
$Source = $Excel.Workbooks.Open($File.FullName,$true,$true)
If(($Dest.ActiveSheet.UsedRange.Count -eq 1) -and ([String]::IsNullOrEmpty($Dest.ActiveSheet.Range("A1").Value2)))
{
#If there is only 1 used cell and it is blank select A1
[void]$source.ActiveSheet.Range("A1","E$(($Source.ActiveSheet.UsedRange.Rows|Select -Last 1).Row)").Copy()
[void]$Dest.Activate()
[void]$Dest.ActiveSheet.Range("A1").Select()
}
Else
{
#If there is data go to the next empty row and select Column A
[void]$source.ActiveSheet.Range("A2","E$(($Source.ActiveSheet.UsedRange.Rows|Select -Last 1).Row)").Copy()
[void]$Dest.Activate()
[void]$Dest.ActiveSheet.Range("A$(($Dest.ActiveSheet.UsedRange.Rows|Select -last 1).row+1)").Select()
}
[void]$Dest.ActiveSheet.Paste()
$Source.Close()
}
$Dest.SaveAs("$path\Merge.xls")
$Dest.close()
$Excel.Quit()
Suggest any effective way to do this. Please provide links if available.
Convert XLS to XLSX :
$xlFixedFormat = [Microsoft.Office.Interop.Excel.XlFileFormat]::xlWorkbookDefault
$excel = New-Object -ComObject excel.application
$excel.visible = $true
$folderpath = "C:\Users\xxxx\Downloads\report\*"
$filetype ="*xls"
Get-ChildItem -Path $folderpath -Include $filetype |
ForEach-Object `
{
$path = ($_.fullname).substring(0,($_.FullName).lastindexOf("."))
"Converting $path to $filetype..."
$workbook = $excel.workbooks.open($_.fullname)
$workbook.saveas($path, $xlFixedFormat)
$workbook.close()
}
$excel.Quit()
$excel = $null
[gc]::collect()
[gc]::WaitForPendingFinalizers()
If you are willing to use the external module Import-Excel, you could simply loop through the files like so:
$report_directory = ".\reports"
$merged_reports = @()
# Loop through each XLSX-file in $report_directory
foreach ($report in (Get-ChildItem "$report_directory\*.xlsx")) {
# Loop through each row of the "current" XLSX-file
$report_content = foreach ($row in Import-Excel $report) {
# Create "custom" row
[PSCustomObject]@{
"Client name" = $report.Name
"Date" = $row."Date"
"Downtime" = $row."Downtime"
"Response" = $row."Response"
}
}
# Add the "custom" data to the results-array
$merged_reports += @($report_content)
}
# Create final report
$merged_reports | Export-Excel ".\merged_report.xlsx"
Please note that this code is not optimized in terms of performance, but it should allow you to get started.
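
On the performance remark: one common way to avoid the repeated array copies caused by += is to capture the output of the loop directly. The sketch below illustrates that together with a calculated property for the extra column. Add-ClientName is a hypothetical helper, and the rows are assumed to be objects with Date, Downtime and Response properties, such as those Import-Excel or Import-Csv would return.

```powershell
# Hypothetical helper: prepends a "Client name" column to each row.
function Add-ClientName {
    param(
        [Parameter(Mandatory = $true)][object[]]$Rows,
        [Parameter(Mandatory = $true)][string]$ClientName
    )

    $Rows | Select-Object @{ Name = 'Client name'; Expression = { $ClientName } },
        Date, Downtime, Response
}

# Usage idea: assign the loop output once instead of using += per file.
# $reports is assumed to be a hashtable of client name -> rows.
function Merge-Reports {
    param([Parameter(Mandatory = $true)][hashtable]$Reports)

    $merged = foreach ($client in ($Reports.Keys | Sort-Object)) {
        Add-ClientName -Rows $Reports[$client] -ClientName $client
    }
    $merged
}
```

With real files, the hashtable would be replaced by a loop over Get-ChildItem that reads each file and passes its name as the client name.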

Powershell script using Excel running slow

So I have this script that I wrote on my laptop, and it works just fine. Its job is to combine two .csv files into one .xls file.
Running the script with two .csv files containing a couple of thousand rows takes a few seconds at most.
But when I try to run it on the server where it should live, it takes... hours. I haven't done a full run, but writing one line to the .xls file takes maybe 2-3 seconds.
So what I'm wondering is what is causing the huge increase in runtime. I'm monitoring the CPU load while the script is running, and it's at 50-60%.
The server has loads of RAM and two CPU cores.
How can I speed this up?
The script looks like this:
$path = "C:\test\*"
$path2 = "C:\test"
$date = Get-Date -Format d
$csvs = Get-ChildItem $path -Include *.csv | Sort-Object LastAccessTime -Descending | Select-Object -First 2
$y = $csvs.Count
Write-Host "Detected the following CSV files: ($y)"
foreach ($csv in $csvs) {
Write-Host " "$csv.Name
}
$outputfilename = "regSCI " + $date
Write-Host Creating: $outputfilename
$excelapp = New-Object -ComObject Excel.Application
$excelapp.sheetsInNewWorkbook = $csvs.Count
$xlsx = $excelapp.Workbooks.Add()
$sheet = 1
$xlleft = -4131
foreach ($csv in $csvs) {
$row = 1
$column = 1
$worksheet = $xlsx.Worksheets.Item($sheet)
$worksheet.Name = $csv.Name
$worksheet.Rows.HorizontalAlignment = $xlleft
$file = (Get-Content $csv)
Write-Host Worksheet created: $worksheet.Name
foreach($line in $file) {
Write-Host Writing Line
$linecontents = $line -split ',(?!\s*\w+")'
foreach($cell in $linecontents) {
Write-Host Writing Cell
$cell1 = $cell.Trim('"')
$worksheet.Cells.Item($row, $column) = $cell1
$column++
}
$column = 1
$row++
$WorkSheet.UsedRange.Columns.Autofit() | Out-Null
}
$sheet++
$headerRange = $worksheet.Range("a1", "q1")
$headerRange.AutoFilter() | Out-Null
}
$output = $path2 + "\" + $outputfilename
Write-Host $output
$xlsx.SaveAs($output)
$excelapp.Quit()
To speed up your existing code, add these just after creating Excel object:
$excelapp.ScreenUpdating = $false
$excelapp.DisplayStatusBar = $false
$excelapp.EnableEvents = $false
$excelapp.Visible = $false
And these just before SaveAs:
$excelapp.ScreenUpdating = $true
$excelapp.DisplayStatusBar = $true
$excelapp.EnableEvents = $true
This stops Excel from rendering the worksheet in real time and from firing events every time you change the contents. Most probably DisplayStatusBar and ScreenUpdating don't matter if you make the application invisible, but I included them just in case.
Also, you're running Autofit() after every line. This certainly doesn't help with performance.
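
Beyond these switches, a widely used way to cut the number of COM calls is to build the whole sheet in memory and hand it to Excel in one assignment. The sketch below covers only the in-memory part and runs without Excel. ConvertTo-Array2D is a hypothetical helper, and the final assignment shown in the comment is an untested assumption about how it would be wired to the worksheet.

```powershell
# Hypothetical helper: turns Import-Csv rows into a 2-D object array
# (header row first) that could be assigned to a range in one call.
function ConvertTo-Array2D {
    param([Parameter(Mandatory = $true)][object[]]$Rows)

    $headers = @($Rows[0].PSObject.Properties.Name)
    $rowCount = $Rows.Count + 1
    $colCount = $headers.Count
    $grid = New-Object 'object[,]' $rowCount, $colCount

    # First row holds the column names.
    for ($c = 0; $c -lt $colCount; $c++) { $grid[0, $c] = $headers[$c] }

    # Remaining rows hold the data.
    for ($r = 0; $r -lt $Rows.Count; $r++) {
        for ($c = 0; $c -lt $colCount; $c++) {
            $grid[($r + 1), $c] = $Rows[$r].($headers[$c])
        }
    }

    # The leading comma keeps PowerShell from unrolling the array.
    , $grid
}

# Assumed usage with the script above (not run here):
# $grid  = ConvertTo-Array2D -Rows (Import-Csv $csv.FullName)
# $range = $worksheet.Range($worksheet.Cells.Item(1, 1),
#              $worksheet.Cells.Item($grid.GetLength(0), $grid.GetLength(1)))
# $range.Value2 = $grid
```

Using Import-Csv here also replaces the hand-written split regex, at the cost of treating the first line as a header.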

Import csv into excel and specify cell format

I am trying to import multiple csv files into their own tabs in 1 excel workbook. I am having an issue with long number fields being displayed as exponential data and changing the last digit to 0. For example I have a 16 digit account number (1234567890123456) it is being displayed in excel as an exponential number (1.23457E+15). When I look at the actual number in the cell it is (1234567890123450). I assume if I make the column text before I bring it in, it will work, but I'm not sure how to do that. Here is my code.
$excel = New-Object -ComObject excel.application
$excel.visible = $False
$excel.displayalerts=$False
$workbook = $excel.workbooks.add()
$sheets = $workbook.sheets
$sheetCount = $Sheets.Count
$mySheet = 1
$mySheetName = "Sheet" + $mySheet
$s1 = $sheets | where {$_.name -eq $mySheetName }
$s1.Activate()
If($sheetCount -gt 1)
{
#Delete other Sheets
$Sheets | ForEach-Object {
$tmpSheetName = $_.Name
$tmpSheet = $_
If($tmpSheetName -ne "Sheet1"){$tmpSheet.Delete()}
}
}
#import csv files
$files = dir -Path $csvDir*.csv
ForEach($file in $files){
If($mySheet -gt 1){$s1 = $workbook.sheets.add()}
$s1.Name = $file.BaseName
$s1.Activate()
$s1Data = Import-Csv $file.FullName
$s1data | ConvertTo-Csv -Delimiter "`t" -NoTypeInformation | Clip
$s1.cells.item(1,1).Select()
$s1.Paste()
$mySheet ++
if (test-path $file ) { rm $file }
}
$workbook.SaveAs($excelTMGPath)
$workbook.Close()
$workbook = $null
#$excel.quit()
while ([System.Runtime.InteropServices.Marshal]::FinalReleaseComObject($excel)) {}
$excel = $null
Try setting the cell format to text. If $s1 points to the correct sheet:
$s1.cells.item(1,1).NumberFormat="@"
If that does not work, apply NumberFormat wherever it is needed, using the format you prefer.
Change the file extension from .csv to .txt and adjust the filename in the code:
$files = dir -Path $csvDir*.txt
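
Excel keeps at most 15 significant digits for numbers, which is why the 16th digit of the account number turns into 0. To decide which columns need the text format before pasting, a small check over the imported rows can help. This is a sketch only: Get-LongNumberColumn is a hypothetical helper, and the 16-digit threshold is taken from the example in the question.

```powershell
# Hypothetical helper: returns the names of columns that contain at least
# one all-digit value with 16 or more digits.
function Get-LongNumberColumn {
    param([Parameter(Mandatory = $true)][object[]]$Rows)

    $names = @($Rows[0].PSObject.Properties.Name)
    foreach ($name in $names) {
        $hit = $Rows |
            Where-Object { "$($_.$name)" -match '^\d{16,}$' } |
            Select-Object -First 1
        if ($hit) { $name }
    }
}
```

The returned names could then be mapped to column positions and formatted as text before the paste.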

Read Excel data with Powershell and write to a variable

Using PowerShell I would like to capture user input, compare the input to data in an Excel spreadsheet and write the data in corresponding cells to a variable. I am fairly new to PowerShell and can't seem to figure this out. Example would be: A user is prompted for a Store Number, they enter "123". The input is then compared to the data in Column A. The data in the corresponding cells is captured and written to a variable, say $GoLiveDate.
Any help would be greatly appreciated.
User input can be read like this:
$num = Read-Host "Store number"
Excel can be handled like this:
$xl = New-Object -COM "Excel.Application"
$xl.Visible = $true
$wb = $xl.Workbooks.Open("C:\path\to\your.xlsx")
$ws = $wb.Sheets.Item(1)
Looking up a value in one column and assigning the corresponding value from another column to a variable could be done like this:
for ($i = 1; $i -le 3; $i++) {
if ( $ws.Cells.Item($i, 1).Value -eq $num ) {
$GoLiveDate = $ws.Cells.Item($i, 2).Value
break
}
}
Don't forget to clean up after you're done:
$wb.Close()
$xl.Quit()
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($xl)
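
One detail worth watching in the lookup: Read-Host returns a string, while a numeric cell may come back as a number, so comparing both as trimmed strings is the safer choice. The sketch below shows that comparison on plain objects so it can be tried without Excel. Get-GoLiveDate and the column names Store and GoLiveDate are hypothetical.

```powershell
# Hypothetical helper: finds the first row whose Store value matches the
# user's input (compared as text) and returns its GoLiveDate.
function Get-GoLiveDate {
    param(
        [Parameter(Mandatory = $true)][object[]]$Rows,
        [Parameter(Mandatory = $true)][string]$StoreNumber
    )

    $wanted = $StoreNumber.Trim()
    $match = $Rows |
        Where-Object { "$($_.Store)".Trim() -eq $wanted } |
        Select-Object -First 1

    if ($match) { $match.GoLiveDate }
}

# Example with in-memory rows standing in for the spreadsheet:
# $rows = @(
#     [PSCustomObject]@{ Store = 123; GoLiveDate = '2015-06-01' }
# )
# $GoLiveDate = Get-GoLiveDate -Rows $rows -StoreNumber (Read-Host "Store number")
```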
I find it preferable to use an OleDB connection to interact with Excel. It's faster than COM interop and less error prone than import-csv. You can prepare a collection of psobjects (one psobject is one row, each property corresponding to a column) to match your desired target grid and insert it into the Excel file. Similarly, you can insert a DataTable instead of a PSObject collection, but unless you start by retrieving data from some data source, PSObject collection way is usually easier.
Here's a function i use for writing a psobject collection to Excel:
function insert-OLEDBData ($file,$sheet,$ocol) {
$cs = Switch -regex ($file)
{
"xlsb$"
{"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=`"$File`";Extended Properties=`"Excel 12.0;HDR=YES;IMEX=1`";"}
"xlsx$"
{"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=`"$File`";Extended Properties=`"Excel 12.0 Xml;HDR=YES;IMEX=1`";"}
}
$OLEDBCon = New-Object System.Data.OleDb.OleDbConnection($cs)
$hdr = $oCol|gm -MemberType NoteProperty|%{$_.name}
$names = '[' + ($hdr-join"],[") + ']'
$vals = (@("?")*([array]$hdr).length)-join','
$sql = "insert into [$sheet`$] ($names) values ($vals)"
$sqlCmd = New-Object system.Data.OleDb.OleDbCommand($sql)
$sqlCmd.connection = $oledbcon
$cpary = @($null)*([array]$hdr).length
$i=0
[array]$hdr|%{([array]$cpary)[$i] = $sqlCmd.parameters.add($_,"VarChar",255);$i++}
$oledbcon.open()
for ($i=0;$i-lt([array]$ocol).length;$i++)
{
for ($k=0;$k-lt([array]$hdr).length;$k++)
{
([array]$cpary)[$k].value = ([array]$oCol)[$i].(([array]$hdr)[$k])
}
$res = $sqlCmd.ExecuteNonQuery()
}
$OLEDBCon.close()
}
This does not seem to work anymore. I swear it used to, but maybe an update to O365 killed it? Or maybe it is because I last used it on Win 7 and have long since moved to Win 10:
$GoLiveDate = $ws.Cells.Item($i, 2).Value
I can still use .Value for writing to a cell, but not for reading it into a variable. Instead of the contents of the cell, it returns: "Variant Value (Variant) {get} {set}"
But after some digging, I found this does work to read a cell into a variable:
$GoLiveDate = $ws.Cells.Item($i, 2).Text
Regarding squishy79's follow-up comment about slowness and the subsequent OleDB solutions: I can't seem to get those to work on modern OSes either, but my own performance trick is to have all my Excel PowerShell scripts write to a tab-delimited .txt file like so:
Add-Content -Path "C:\FileName.txt" -Value $Header1`t$Header2`t$Header3...
Add-Content -Path "C:\FileName.txt" -Value $Data1`t$Data2`t$Data3...
Add-Content -Path "C:\FileName.txt" -Value $Data4`t$Data5`t$Data6...
Then, when done writing all the data, open the .txt file using the very slow COM "Excel.Application" just to do the formatting, then SaveAs .xlsx (see the comment by SaveAs):
Function OpenInExcelFormatSaveAsXlsx
{
Param ($FilePath)
If (Test-Path $FilePath)
{
$Excel = New-Object -ComObject Excel.Application
$Excel.Visible = $true
$Workbook = $Excel.Workbooks.Open($FilePath)
$Sheet = $Workbook.ActiveSheet
$UsedRange = $Sheet.UsedRange
$RowMax = ($Sheet.UsedRange.Rows).count
$ColMax = ($Sheet.UsedRange.Columns).count
# This code gets the Alpha character for Columns, even for AA AB, etc.
For ($Col = 1; $Col -le $ColMax; $Col++)
{
$Asc = ""
$Asc1 = ""
$Asc2 = ""
If ($Col -lt 27)
{
$Asc = ([char]($Col + 64))
Write-Host "Asc: $Asc"
}
Else
{
$First = [math]::truncate($Col / 26)
$Second = $Col - ($First * 26)
If ($Second -eq 0)
{
$First = ($First - 1)
$Second = 26
}
$Asc1 = ([char][int]($First + 64))
$Asc2 = ([char][int]($Second + 64))
$Asc = "$Asc1$Asc2"
}
}
Write-Host "Col: $Col"
Write-Host "Asc + 1: $Asc" + "1"
$Range = $Sheet.Range("a1", "$Asc" + "1")
$Range.Select() | Out-Null
$Range.Font.Bold = $true
$Range.Borders.Item(9).LineStyle = 1
$Range.Borders.Item(9).Weight = 2
$UsedRange = $Sheet.UsedRange
$UsedRange.EntireColumn.AutoFit() | Out-Null
$SavePath = $FilePath.Replace(".txt", ".xlsx")
# I found scant documentation, but you need a file format 51 to save a .txt file as .xlsx
$Workbook.SaveAs($SavePath, 51)
$Workbook.Close()
$Excel.Quit()
}
Else
{
Write-Host "File Not Found: $FilePath"
}
}
$TextFilePath = "C:\ITUtilities\MyTabDelimitedTextFile.txt"
OpenInExcelFormatSaveAsXlsx -FilePath $TextFilePath
If you don't care about formatting, you can just open the tab delimited .txt files as-is in Excel.
Of course, this is not very good for inserting data into an existing Excel spreadsheet unless you are OK with having the script rewrite the whole sheet each time an insert is made. It will still run much faster than using COM in most cases.
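
The column-letter logic in the function above handles one- and two-letter columns. A more general variant, sketched here as a hypothetical helper named ConvertTo-ColumnLetter, converts any 1-based column index by repeatedly taking the remainder modulo 26:

```powershell
# Hypothetical helper: converts a 1-based column index to Excel letters,
# for example 1 -> A, 27 -> AA, 703 -> AAA.
function ConvertTo-ColumnLetter {
    param([Parameter(Mandatory = $true)][int]$Index)

    $letters = ''
    while ($Index -gt 0) {
        # Shift to 0-based so that 26 maps to Z instead of rolling over.
        $remainder = ($Index - 1) % 26
        $letters = "$([char](65 + $remainder))$letters"
        $Index = [int][math]::Floor(($Index - 1) / 26)
    }
    $letters
}
```

The header range could then be built as "A1" through "$(ConvertTo-ColumnLetter $ColMax)1" without looping over every column.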
I found this and Yevgeniy's answer. I had to make a few minor changes to the above function to get it to work, most notably the handling of NULL or empty values in the input array. Here is Yevgeniy's code with a few minor changes:
function insert-OLEDBData {
PARAM (
[Parameter(Mandatory=$True,Position=1)]
[string]$file,
[Parameter(Mandatory=$True,Position=2)]
[string]$sheet,
[Parameter(Mandatory=$True,Position=3)]
[array]$ocol
)
$cs = Switch -regex ($file)
{
"xlsb$"
{"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=`"$File`";Extended Properties=`"Excel 12.0;HDR=YES`";"}
"xlsx$"
{"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=`"$File`";Extended Properties=`"Excel 12.0 Xml;HDR=YES`";"}
}
$OLEDBCon = New-Object System.Data.OleDb.OleDbConnection($cs)
$hdr = $oCol | Get-Member -MemberType NoteProperty,Property | ForEach-Object {$_.name}
$names = '[' + ($hdr -join "],[") + ']'
$vals = (@("?")*([array]$hdr).length) -join ','
$sql = "insert into [$sheet`$] ($names) values ($vals)"
$sqlCmd = New-Object system.Data.OleDb.OleDbCommand($sql)
$sqlCmd.connection = $oledbcon
$cpary = @($null)*([array]$hdr).length
$i=0
[array]$hdr|%{([array]$cpary)[$i] = $sqlCmd.parameters.add($_,"VarChar",255);$i++}
$oledbcon.open()
for ($i=0;$i -lt ([array]$ocol).length;$i++)
{
for ($k=0;$k -lt ([array]$hdr).length;$k++)
{
IF (([array]$oCol)[$i].(([array]$hdr)[$k]) -notlike "") {
([array]$cpary)[$k].value = ([array]$oCol)[$i].(([array]$hdr)[$k])
} ELSE {
([array]$cpary)[$k].value = ""
}
}
$res = $sqlCmd.ExecuteNonQuery()
}
$OLEDBCon.close()
}
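
To check what the function sends to the provider without opening a connection, the statement-building step can be pulled out on its own. This is a sketch: New-InsertStatement is a hypothetical name, and it reads property names from PSObject.Properties, which keeps them in declaration order, whereas the Get-Member call used above returns them sorted by name.

```powershell
# Hypothetical helper: builds the parameterized INSERT text for one row
# object, using one "?" placeholder per property.
function New-InsertStatement {
    param(
        [Parameter(Mandatory = $true)][string]$Sheet,
        [Parameter(Mandatory = $true)][object]$Row
    )

    $hdr = @($Row.PSObject.Properties.Name)
    $names = '[' + ($hdr -join '],[') + ']'
    $vals = (@('?') * $hdr.Count) -join ','
    "insert into [$Sheet`$] ($names) values ($vals)"
}
```

Printing the result for a sample row makes it easy to spot a mismatch between the object's property names and the sheet's header row.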

Resources