I have a problem I cannot understand when I read data from an Excel sheet.
I use this function to read data (1 row) from Excel, and it does so correctly:
function ExtractExcelRows {
[cmdletbinding()]
Param($ExcelFile)
# Excel.exe not autokill fix
$before = Get-Process | % { $_.Id }
$excel = New-Object -ComObject Excel.Application
$excelId = Get-Process excel | % { $_.Id } | ? { $before -notcontains $_ }
$workbook = $excel.Workbooks.Open($ExcelFile.FullName)
$sheet = $workbook.Worksheets.Item(1)
$excel.Visible = $false
$rowMax = ($sheet.UsedRange.Rows).Count
# Declare the starting positions
$rowEmail, $colEmail = 1, 11
$Rows = @()
for ($i=1; $i -le $rowMax-1; $i++) {
if ($sheet.Cells.Item($rowEmail+$i, $colEmail).Text) {
$Rows += @{
Email = $sheet.Cells.Item($rowEmail+$i, $colEmail).Text
}
}
}
$workbook.Close($false)
$excel.Quit()
Stop-Process -Id $excelId -Force
Write-Host $Rows.Count # count 1 row ! right!
return $Rows
}
When I try to save my object in a global variable the result of the count is different and I do not understand why.
$global:ExcelData = ExtractExcelRows $ExcelFile
write-host $ExcelData.Count # count 4 row!!!! not right!
Can anyone tell me where the error is and how to fix it?
To put my comments as answer:
function ExtractExcelRows {
[cmdletbinding()]
Param($ExcelFile)
$excel = New-Object -ComObject Excel.Application
$excel.Visible = $false
$workbook = $excel.Workbooks.Open($ExcelFile.FullName)
$sheet = $workbook.Worksheets.Item(1)
$rowMax = ($sheet.UsedRange.Rows).Count
# Declare the starting positions
$rowEmail, $colEmail = 1, 11
$Rows = for ($i = 1; $i -lt $rowMax; $i++) {
if ($sheet.Cells.Item($rowEmail + $i, $colEmail).Text) {
[PSCustomObject]@{ 'Email' = $sheet.Cells.Item($rowEmail+$i, $colEmail).Text }
}
}
$workbook.Close($false)
$excel.Quit()
# clean up used COM objects
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($sheet) | Out-Null
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($workbook) | Out-Null
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($excel) | Out-Null
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
Write-Host $Rows.Count # count 1 row ! right!
# The comma used as unary operator wraps the array in another single element array.
# PowerShell unrolls that outer array on output, so the caller receives $Rows as an array, even if it is empty.
return ,$Rows
}
Have you checked the contents of that variable? I'm betting the first three items are True/False, or something similar. Excel COM object methods tend to return a value indicating whether the call succeeded, and a function outputs everything that is not explicitly captured or redirected, not only the items you pass to return (for that matter, return is not needed at all). Pipe calls such as $workbook.close($false) to Out-Null:
$workbook.close($false) | Out-Null
$excel.quit() | Out-Null
That should account for 2 of your 4 items, I'm not sure what the other one is.
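The effect can be reproduced without Excel. The sketch below is only an illustration: Invoke-NoisyCall is a hypothetical stand-in for a COM method that returns a status value, and the function names are made up for the demo.

```powershell
# Hypothetical stand-in for a COM call that returns a status value.
function Invoke-NoisyCall { $true }

function Get-RowsLeaky {
    Invoke-NoisyCall                               # uncaptured: becomes function output
    Invoke-NoisyCall                               # uncaptured: becomes function output
    [PSCustomObject]@{ Email = 'a@example.com' }   # the only item we wanted
}

function Get-RowsClean {
    $null = Invoke-NoisyCall                       # discarded by assignment to $null
    Invoke-NoisyCall | Out-Null                    # discarded by Out-Null
    $rows = @([PSCustomObject]@{ Email = 'a@example.com' })
    , $rows                                        # unary comma keeps the one-element array intact
}

$leaky = Get-RowsLeaky
$clean = Get-RowsClean

$leakyCount = @($leaky).Count   # two status values plus one row
$cleanCount = $clean.Count      # just the row
"Leaky: $leakyCount, clean: $cleanCount"
```

Inspecting the returned items one by one (for example with ForEach-Object { $_.GetType().Name }) is a quick way to see which statements leaked into the output.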
I'm trying to create an Excel workbook, then populate the cells with data found from searching many txt files.
I read a file and extract all comments AFTER I find "IDENTIFICATION DIVISION" and BEFORE I find "ENVIRONMENT DIVISION"
I then populate two cells in my Excel workbook: cell one is the file name and cell two is the extracted comments.
I have 256GB of memory on the work server. Less than 5% is being used before PowerShell throws the memory error.
Can anyone see where I'm going wrong?
Thanks,
-Ron
$excel = New-Object -ComObject excel.application
$excel.visible = $False
$workbook = $excel.Workbooks.Add()
$diskSpacewksht= $workbook.Worksheets.Item(1)
$diskSpacewksht.Name = "XXXXX_Desc"
$col1=1
$diskSpacewksht.Cells.Item(1,1) = 'Program'
$diskSpacewksht.Cells.Item(1,2) = 'Description'
$CBLFileList = Get-ChildItem -Path 'C:\XXXXX\XXXXX' -Filter '*.cbl' -File -Recurse
$Flowerbox = @()
ForEach($CBLFile in $CBLFileList) {
$treat = $false
Write-Host "Processing ... $CBLFile" -foregroundcolor green
Get-content -Path $CBLFile.FullName |
ForEach-Object {
if ($_ -match 'IDENTIFICATION DIVISION') {
# Write-Host "Match IDENTIFICATION DIVISION" -foregroundcolor green
$treat = $true
}
if ($_ -match 'ENVIRONMENT DIVISION') {
# Write-Host "Match ENVIRONMENT DIVISION" -foregroundcolor green
$col1++
$diskSpacewksht.Cells.Item($col1,1) = $CBLFile.Name
$diskSpacewksht.Cells.Item($col1,2) = [String]$Flowerbox
$Flowerbox = @()
continue
}
if ($treat) {
if ($_ -match '\*(.{62})') {
Foreach-Object {$Flowerbox += $matches[1] + "`r`n"}
$treat = $false
}
}
}
}
$excel.DisplayAlerts = 'False'
$ext=".xlsx"
$path="C:\Desc.txt"
$workbook.SaveAs($path)
$workbook.Close
$excel.DisplayAlerts = 'False'
$excel.Quit()
Not knowing what the contents of the .CBL files could be, I would suggest not to try and do all of this using an Excel COM object, but create a CSV file instead to make things a lot easier.
When finished, you can simply open that csv file in Excel.
# create a List object to collect the 'flowerbox' strings in
$Flowerbox = [System.Collections.Generic.List[string]]::new()
$treat = $false
# get a list of the .cbl files and loop through. Collect all output in variable $result
$CBLFileList = Get-ChildItem -Path 'C:\XXXXX\XXXXX' -Filter '*.cbl' -File -Recurse
$result = foreach ($CBLFile in $CBLFileList) {
Write-Host "Processing ... $($CBLFile.FullName)" -ForegroundColor Green
# using switch -File is an extremely fast way of testing a file line by line.
# instead of '-Regex' you can also do '-WildCard', but then add asterisks around the strings
switch -Regex -File $CBLFile.FullName {
'IDENTIFICATION DIVISION' {
# start collecting Flowerbox lines from here
$treat = $true
}
'ENVIRONMENT DIVISION' {
# stop collecting Flowerbox lines and output what we already have
# output an object with the two properties you need
[PsCustomObject]@{
Program = $CBLFile.Name # or $CBLFile.FullName
Description = $Flowerbox -join [environment]::NewLine
}
$Flowerbox.Clear() # empty the list for the next run
$treat = $false
}
default {
# as I have no idea what these lines may look like, I have to
# assume your regex '\*(.{62})' is correct..
if ($treat -and ($_ -match '\*(.{62})')) {
$Flowerbox.Add($Matches[1])
}
}
}
}
# now you have everything in an array of PSObjects so you can save that as Csv
$result | Export-Csv -Path 'C:\Desc.csv' -UseCulture -NoTypeInformation
The -UseCulture parameter makes Export-Csv use your system's list separator, so you can double-click the file and it will open correctly in your Excel.
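To see which delimiter -UseCulture picks on a given machine, a small check like the following may help. It is only a sketch: the sample row is invented, and ConvertTo-Csv is used instead of Export-Csv so nothing is written to disk.

```powershell
# Invented sample row, shaped like the objects collected in $result.
$rows = @(
    [PSCustomObject]@{ Program = 'DEMO1.cbl'; Description = 'First line' }
)

# The list separator of the current culture (',' or ';' on most systems).
$separator = (Get-Culture).TextInfo.ListSeparator

# ConvertTo-Csv -UseCulture applies the same delimiter rule as Export-Csv -UseCulture.
$csvLines = @($rows | ConvertTo-Csv -UseCulture -NoTypeInformation)
$header   = $csvLines[0]

"Separator: '$separator'"
"Header:    $header"
```

If the file has to be opened on a machine with different regional settings, passing an explicit -Delimiter instead may be the safer choice.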
You can also create an Excel file from this csv programmatically like:
$excel = New-Object -ComObject Excel.Application
$excel.Visible = $false
$workbook = $excel.Workbooks.Open('C:\Desc.csv')
$worksheet = $workbook.Worksheets.Item(1)
$worksheet.Name = "XXXXX_Desc"
# save as .xlsx
# 51 ==> [Microsoft.Office.Interop.Excel.XlFileFormat]::xlWorkbookDefault
# see: https://learn.microsoft.com/en-us/office/vba/api/excel.xlfileformat
$workbook.SaveAs('C:\Desc.xlsx', 51)
# quit Excel and remove all used COM objects from memory
$excel.Quit()
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($worksheet)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($workbook)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($excel)
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
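The switch -Regex -File pattern used above can be tried without real .cbl files. The sketch below writes a small temporary file and collects the comment lines between the two division headers. The file contents and the simplified regex '\*\s*(.+)' are assumptions made for the demo, not the format of the actual source files.

```powershell
# Create a throwaway file with invented COBOL-like content.
$demoFile = Join-Path ([System.IO.Path]::GetTempPath()) ("demo_" + [guid]::NewGuid() + ".cbl")
Set-Content -Path $demoFile -Value @(
    '       IDENTIFICATION DIVISION.'
    '      * First comment line'
    '      * Second comment line'
    '       ENVIRONMENT DIVISION.'
    '      * Not collected'
)

$collected  = [System.Collections.Generic.List[string]]::new()
$collecting = $false

# switch reads the file line by line; whatever a clause outputs ends up in $description.
$description = switch -Regex -File $demoFile {
    'IDENTIFICATION DIVISION' { $collecting = $true }
    'ENVIRONMENT DIVISION' {
        $collected -join ' | '     # emit what was gathered so far
        $collected.Clear()
        $collecting = $false
    }
    default {
        # simplified pattern for the demo: text after the asterisk
        if ($collecting -and ($_ -match '\*\s*(.+)')) { $collected.Add($Matches[1]) }
    }
}

Remove-Item -Path $demoFile
$description
```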
Actually, this is a version of question here:
How to use powershell to select and copy columns and rows in which data is present in new workbook.
The goal is to grab certain columns from multiple Excel workbooks and dump everything to one csv file. Columns are always the same.
I'm doing that manually:
$xl = New-Object -ComObject Excel.Application
$xl.Visible = $false
$xl.DisplayAlerts = $false
$counter = 0
$input_folder = "C:\Users\user\Documents\excelfiles"
$output_folder = "C:\Users\user\Documents\csvdump"
Get-ChildItem $input_folder -File |
Foreach-Object {
$counter++
$wb = $xl.Workbooks.Open($_.FullName, 0, 1, 5, "")
try {
$ws = $wb.Worksheets.item('Calls') # => This specific worksheet
$rowMax = ($ws.UsedRange.Rows).count
for ($i=1; $i -le $rowMax-1; $i++) {
$newRow = New-Object -Type PSObject -Property @{
'Type' = $ws.Cells.Item(1+$i,1).text
'Direction' = $ws.Cells.Item(1+$i,2).text
'From' = $ws.Cells.Item(1+$i,3).text
'To' = $ws.Cells.Item(1+$i,4).text
}
$newRow | Export-Csv -Path $("$output_folder\$ESO_Output") -Append -noType -Force
}
} catch {
Write-host "No such workbook" -ForegroundColor Red
# Return
}
}
Question:
This works, but is extremely slow because Excel has to select every cell, copy that, then Powershell has to create array and save row by row in output csv file.
Is there a method to select a range in Excel (number of columns times ($ws.UsedRange.Rows).count), cut header line and just append this range (array?) to csv file to make everything much faster?
So that's the final solution.
The script is 22 times faster than the original solution.
Hope somebody will find that useful :)
PasteSpecial is used to filter out empty rows, since there is no need to save them to the csv.
$xl = New-Object -ComObject Excel.Application
$xl.Visible = $false
$xl.DisplayAlerts = $false
$counter = 0
$input_folder = "C:\Users\user\Documents\excelfiles"
$output_folder = "C:\Users\user\Documents\csvdump"
Get-ChildItem $input_folder -File |
Foreach-Object {
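$counter++
# open the workbook read-only, as in the question's code
$wb = $xl.Workbooks.Open($_.FullName, 0, 1, 5, "")
try {
$new_ws1 = $wb.Worksheets.add()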
$ws = $wb.Worksheets.item('Calls')
$rowMax = ($ws.UsedRange.Rows).count
$range = $ws.Range("A1:O$rowMax")
$x = $range.copy()
$y = $new_ws1.Range("A1:O$rowMax").PasteSpecial([System.Type]::Missing,[System.Type]::Missing,$true,$false)
$wb.SaveAs("$($output_folder)\$($_.Basename)",[Microsoft.Office.Interop.Excel.XlFileFormat]::xlCSVWindows)
} catch {
Write-host "No such workbook" -ForegroundColor Red
# Return
}
}
$xl.Quit()
The part above generates a bunch of csv files.
The part below reads these files in a separate loop and combines them into one.
$excluded (passed to -exclude) is an array of names I want to omit.
Remove-Item removes the temporary files.
The answer below is based on this post: https://stackoverflow.com/a/27893253/6190661
$getFirstLine = $true
Get-ChildItem "$output_folder\*.csv" -exclude $excluded | foreach {
$filePath = $_
$lines = Get-Content $filePath
$linesToWrite = switch($getFirstLine) {
$true {$lines}
$false {$lines | Select -Skip 1}
}
$getFirstLine = $false
Add-Content "$($output_folder)\MERGED_CSV_FILE.csv" $linesToWrite
Remove-Item $_.FullName
}
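If all the intermediate files share the same columns, the merge can also be sketched with Import-Csv and Export-Csv, which write the header only once. This is an alternative to the line-based loop above, not the method from the linked post. The temp folder and sample rows are invented for the demo.

```powershell
# Build a throwaway folder with two small csv files (invented data).
$demoDir = Join-Path ([System.IO.Path]::GetTempPath()) ("csvmerge_" + [guid]::NewGuid())
$null = New-Item -ItemType Directory -Path $demoDir
Set-Content -Path (Join-Path $demoDir 'a.csv') -Value '"Type","From"', '"Call","111"'
Set-Content -Path (Join-Path $demoDir 'b.csv') -Value '"Type","From"', '"SMS","222"'

# Collect the file list first so the merged file is never picked up as input.
$files  = @(Get-ChildItem -Path $demoDir -Filter '*.csv')
$merged = Join-Path $demoDir 'MERGED_CSV_FILE.csv'

# Import-Csv turns every data line into an object; Export-Csv writes one header.
$files |
    ForEach-Object { Import-Csv -Path $_.FullName } |
    Export-Csv -Path $merged -NoTypeInformation

$mergedLines = @(Get-Content -Path $merged)
Remove-Item -Path $demoDir -Recurse
$mergedLines
```

Parsing every row into an object is slower than copying raw lines, so for very large files the Get-Content approach may still be preferable.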
I am trying to delete rows of data from an Excel file using the ImportExcel module.
I can open the file and find the data I wish to delete. The DeleteRow command works with a hardcoded value but does not appear to work with a variable... any ideas?
# Gets ImportExcel PowerShell Module
if (-not(Get-Module -ListAvailable -Name ImportExcel)) {
Find-module -Name ImportExcel | Install-Module -Force
}
# Open Excel File
$excel = open-excelpackage 'C:\temp\input.xlsx'
#Set Worksheet
$ws = $excel.Workbook.Worksheets["Sheet1"]
#Get Row Count
$rowcount = $ws.Dimension.Rows
#Delete row if Cell in Column 15 = Yes
for ($i = 2; $i -lt $rowcount; $i++) {
$cell = $ws.Cells[$i, 15]
if ($cell.value -eq "Yes") {
$ws.DeleteRow($i)
}
}
#Save File
Close-ExcelPackage $excel -SaveAs 'C:\Temp\Output.xlsx'
You should reverse the loop and go from the bottom row to the top. As written, deleting a row shifts every row below it up by one, so for ($i = 2; $i -lt $rowcount; $i++) {..} skips the row that moves into the deleted position.
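The skipping effect is easy to see on a plain list, which serves here as a stand-in for the worksheet rows. The values are invented, with index 0 playing the role of the header row.

```powershell
# Invented column values; index 0 stands for the header row.
$values = 'Name', 'Yes', 'Yes', 'No', 'Yes'

# Forward loop: after RemoveAt the next item slides into index $i and is never tested.
$forward = [System.Collections.Generic.List[string]]::new([string[]]$values)
for ($i = 1; $i -lt $forward.Count; $i++) {
    if ($forward[$i] -eq 'Yes') { $forward.RemoveAt($i) }
}

# Backward loop: removing an item only shifts items that were already tested.
$backward = [System.Collections.Generic.List[string]]::new([string[]]$values)
for ($i = $backward.Count - 1; $i -ge 1; $i--) {
    if ($backward[$i] -eq 'Yes') { $backward.RemoveAt($i) }
}

"Forward:  $($forward -join ', ')"    # one 'Yes' survives
"Backward: $($backward -join ', ')"   # all 'Yes' entries removed
```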
You can also do this without the ImportExcel module if you have Excel installed:
$file = 'C:\Temp\input.xlsx'
$excel = New-Object -ComObject Excel.Application
$excel.Visible = $false
# open the Excel file
$workbook = $excel.Workbooks.Open($file)
$sheet = $workbook.Worksheets.Item(1)
# get the number of rows in the sheet
$rowMax = $sheet.UsedRange.Rows.Count
# loop through the rows to test if the value in column 15 is "Yes"
# do the loop BACKWARDS, otherwise the indices will change on every deletion.
for ($row = $rowMax; $row -ge 2; $row--) {
$cell = $sheet.Cells[$row, 15].Value2
if ($cell -eq 'Yes') {
$null = $sheet.Rows($row).EntireRow.Delete()
}
}
# save and exit
$workbook.SaveAs("C:\Temp\Output.xlsx")
$excel.Quit()
# clean up the COM objects used
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($sheet)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($workbook)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($excel)
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
I have this Excel file.
Every row is an automation script that I need to execute with certain parameters. I use Excel because every script receives different parameters. I need a PowerShell script that reads the Excel file and, for each row, executes that process id (script) and passes those parameters.
Is there a way to do that? Is it doable?
so far I have this
$file = "C:\Users\MX02689\Documents\Parametros.xlsx"
$sheetName = "Sheet1"
$objExcel = New-Object -ComObject Excel.Application
$workbook = $objExcel.Workbooks.Open($file)
$sheet = $workbook.Worksheets.Item($sheetName)
$objExcel.Visible=$false
$rowMax = ($sheet.UsedRange.Rows).count
$colMax = ($sheet.UsedRange.Columns).count
$rowName,$colName = 1,1
#the idea here is that for each row that has values do this
for($i=1;$i-le $colMax-1; $i++)
#The idea here is that if (parameter 1 -eq 1 ){
execute the command we use to send the scripts process id; "parameter2 parameter 3 parameter 4"
}else{
skip the row and go to the next one
}
{
Write-Output("" + $sheet.Cells.Item($rowName,$colName+$i).text)
}
Am I heading in the right direction? Is what I'm trying to do doable, and is there a more optimized way to achieve it? Thank you for your help :)
Greetings
Using Excel is not the fastest or easiest way of doing this with PowerShell.
It can be done like this:
$file = "D:\Parametros.xlsx"
$objExcel = New-Object -ComObject Excel.Application
$workbook = $objExcel.Workbooks.Open($file)
$sheet = $workbook.Worksheets.Item(1)
$objExcel.Visible = $false
$rowMax = ($sheet.UsedRange.Rows).count
$colMax = ($sheet.UsedRange.Columns).count
for ($row = 2; $row -le $rowMax; $row++) { # skip the header row
$params = @()
for ($col = 1; $col -le $colMax; $col++) {
$params += $sheet.Cells.Item($row, $col).Value()
}
# execute the command. For demo, just show the parameters used
'Invoke-Command parameters: {0}' -f ($params -join ', ')
}
$objExcel.Quit()
# clean-up used Com objects
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($sheet) | Out-Null
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($workbook) | Out-Null
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($objExcel) | Out-Null
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
Far more convenient would be to save your Excel file as CSV and use that:
Import-Csv -Path 'D:\Parametros.csv' | ForEach-Object {
# execute the command. For demo, just show the parameters used
'Invoke-Command parameters: {0}, {1}, {2}, {3}, {4}' -f $_.'process id', $_.parameter1, $_.parameter2, $_.parameter3, $_.parameter4
}
Demo output for both methods:
Invoke-Command parameters: 235522, 1, testinguser3, Mko12345, something
Invoke-Command parameters: 235266, 0, testinguser4, Mko12346, something
Invoke-Command parameters: 235266, 1, testinguser5, Mko12347, something
From your comment, I now understand what the "1" or "0" means in parameter1.
Below is the adjusted code for the Excel method as well as the CSV method:
Method for Excel:
$file = "D:\Parametros.xlsx"
$objExcel = New-Object -ComObject Excel.Application
$workbook = $objExcel.Workbooks.Open($file)
$sheet = $workbook.Worksheets.Item(1)
$objExcel.Visible = $false
$rowMax = ($sheet.UsedRange.Rows).count
$colMax = ($sheet.UsedRange.Columns).count
for ($row = 2; $row -le $rowMax; $row++) { # skip the header row
$params = @()
for ($col = 1; $col -le $colMax; $col++) {
$params += $sheet.Cells.Item($row, $col).Value()
}
# if the second parameter value converted to int = 1, proceed; if 0 skip the line
if ([int]$params[1] -ne 0) {
# execute the command. For demo, just show the parameters used
'Invoke-Command parameters: {0}' -f ($params -join ', ').TrimEnd(", ")
}
}
$objExcel.Quit()
# clean-up used Com objects
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($sheet) | Out-Null
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($workbook) | Out-Null
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($objExcel) | Out-Null
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
Method for CSV file:
Import-Csv -Path 'D:\Parametros.csv' | ForEach-Object {
# get the field values from the row in array $params (not a fixed number of fields)
$params = @($_.PsObject.Properties).Value
# if the second parameter value converted to int = 1, proceed; if 0 skip the line
if ([int]$params[1] -ne 0) {
# execute the command. For demo, just show the parameters used
'Invoke-Command parameters: {0}' -f ($params -join ', ').TrimEnd(", ")
}
}
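The CSV method can be tried without a file by feeding the demo rows shown earlier through ConvertFrom-Csv. This is only a sketch: the column names are taken from the first CSV example above, and the here-string replaces D:\Parametros.csv.

```powershell
# Demo rows from the output shown earlier; the header names are assumed to match the CSV.
$csvText = @'
process id,parameter1,parameter2,parameter3,parameter4
235522,1,testinguser3,Mko12345,something
235266,0,testinguser4,Mko12346,something
235266,1,testinguser5,Mko12347,something
'@

$lines = @(
    $csvText | ConvertFrom-Csv | ForEach-Object {
        # collect the field values of this row, however many columns there are
        $values = @($_.PSObject.Properties | ForEach-Object { $_.Value })
        # the second field is the 1/0 switch: 0 means skip the row
        if ([int]$values[1] -ne 0) {
            'Invoke-Command parameters: {0}' -f ($values -join ', ')
        }
    }
)

$lines
```

Only the first and third rows are printed, because the second row has 0 in parameter1.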
I am trying to import multiple csv files into their own tabs in 1 excel workbook. I am having an issue with long number fields being displayed as exponential data and changing the last digit to 0. For example I have a 16 digit account number (1234567890123456) it is being displayed in excel as an exponential number (1.23457E+15). When I look at the actual number in the cell it is (1234567890123450). I assume if I make the column text before I bring it in, it will work, but I'm not sure how to do that. Here is my code.
$excel = New-Object -ComObject excel.application
$excel.visible = $False
$excel.displayalerts=$False
$workbook = $excel.workbooks.add()
$sheets = $workbook.sheets
$sheetCount = $Sheets.Count
$mySheet = 1
$mySheetName = "Sheet" + $mySheet
$s1 = $sheets | where {$_.name -eq $mySheetName }
$s1.Activate()
If($sheetCount -gt 1)
{
#Delete other Sheets
$Sheets | ForEach-Object {
$tmpSheetName = $_.Name
$tmpSheet = $_
If($tmpSheetName -ne "Sheet1"){$tmpSheet.Delete()}
}
}
#import csv files
$files = dir -Path $csvDir*.csv
ForEach($file in $files){
If($mySheet -gt 1){$s1 = $workbook.sheets.add()}
$s1.Name = $file.BaseName
$s1.Activate()
$s1Data = Import-Csv $file.FullName
$s1data | ConvertTo-Csv -Delimiter "`t" -NoTypeInformation | Clip
$s1.cells.item(1,1).Select()
$s1.Paste()
$mySheet ++
if (test-path $file ) { rm $file }
}
$workbook.SaveAs($excelTMGPath)
$workbook.Close()
$workbook = $null
#$excel.quit()
while ([System.Runtime.InteropServices.Marshal]::FinalReleaseComObject($excel)) {}
$excel = $null
Try setting the number format to text. If $s1 points to the correct sheet:
$s1.cells.item(1,1).NumberFormat="@"
If that does not work, apply NumberFormat to whichever cells or columns need it, using the format you prefer.
Change the file extension from .csv to .txt and adjust the filename pattern in the code:
$files = dir -Path $csvDir*.txt