How to use PowerShell to select a range and dump it to a CSV file - Excel

This is a variation of the question here:
How to use powershell to select and copy columns and rows in which data is present in new workbook.
The goal is to grab certain columns from multiple Excel workbooks and dump everything into one CSV file. The columns are always the same.
I'm doing that manually, cell by cell:
$xl = New-Object -ComObject Excel.Application
$xl.Visible = $false
$xl.DisplayAlerts = $false
$counter = 0
$input_folder = "C:\Users\user\Documents\excelfiles"
$output_folder = "C:\Users\user\Documents\csvdump"
Get-ChildItem $input_folder -File |
Foreach-Object {
$counter++
$wb = $xl.Workbooks.Open($_.FullName, 0, 1, 5, "")
try {
$ws = $wb.Worksheets.item('Calls') # => This specific worksheet
$rowMax = ($ws.UsedRange.Rows).count
for ($i=1; $i -le $rowMax-1; $i++) {
$newRow = New-Object -Type PSObject -Property @{
'Type' = $ws.Cells.Item(1+$i,1).text
'Direction' = $ws.Cells.Item(1+$i,2).text
'From' = $ws.Cells.Item(1+$i,3).text
'To' = $ws.Cells.Item(1+$i,4).text
}
$newRow | Export-Csv -Path $("$output_folder\$ESO_Output") -Append -noType -Force
}
} catch {
Write-host "No such workbook" -ForegroundColor Red
# Return
}
}
Question:
This works, but it is extremely slow: Excel has to read every cell individually, and then PowerShell has to build an object and append it to the output CSV file row by row.
Is there a way to select a whole range in Excel (the number of columns times ($ws.UsedRange.Rows).count rows), cut off the header line, and append that range (as an array?) to the CSV file, to make everything much faster?

So this is the final solution.
The script is 22 times faster than the original one!
I hope somebody finds it useful :)
PasteSpecial (with its third argument, SkipBlanks, set to $true) is there to filter out empty rows; there is no need to save them to the CSV.
$xl = New-Object -ComObject Excel.Application
$xl.Visible = $false
$xl.DisplayAlerts = $false
$counter = 0
$input_folder = "C:\Users\user\Documents\excelfiles"
$output_folder = "C:\Users\user\Documents\csvdump"
Get-ChildItem $input_folder -File |
Foreach-Object {
    $counter++
    # open the workbook read-only, as in the original script
    $wb = $xl.Workbooks.Open($_.FullName, 0, 1, 5, "")
    try {
        $new_ws1 = $wb.Worksheets.add()
        $ws = $wb.Worksheets.item('Calls')
        $rowMax = ($ws.UsedRange.Rows).count
        $range = $ws.Range("A1:O$rowMax")
        $x = $range.copy()
        # PasteSpecial arguments: Paste, Operation, SkipBlanks, Transpose
        $y = $new_ws1.Range("A1:O$rowMax").PasteSpecial([System.Type]::Missing,[System.Type]::Missing,$true,$false)
        # the new worksheet is the active one, so it is the sheet written to the CSV file
        $wb.SaveAs("$($output_folder)\$($_.Basename)",[Microsoft.Office.Interop.Excel.XlFileFormat]::xlCSVWindows)
    } catch {
        Write-host "No such workbook" -ForegroundColor Red
        # Return
    }
}
$xl.Quit()
Part above will generate a bunch of csv files.
Part below will read these files in separate loop and combine them together into one.
-exclude takes $excluded, an array of names I want to omit.
Remove-Item removes the temporary files.
The code below is based on this post: https://stackoverflow.com/a/27893253/6190661
$getFirstLine = $true
Get-ChildItem "$output_folder\*.csv" -exclude $excluded | foreach {
$filePath = $_
$lines = Get-Content $filePath
$linesToWrite = switch($getFirstLine) {
$true {$lines}
$false {$lines | Select -Skip 1}
}
$getFirstLine = $false
Add-Content "$($output_folder)\MERGED_CSV_FILE.csv" $linesToWrite
Remove-Item $_.FullName
}
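Another option, closer to what the question asked for, is to read the whole range in one COM call: for a multi-cell range, Range.Value2 returns a two-dimensional array, which PowerShell can turn into objects and send to Export-Csv in one go. The sketch below is only an illustration under assumptions: ConvertFrom-RangeArray is a hypothetical helper name, the four column names are the ones from the question, and the demo array stands in for $ws.Range("A1:D$rowMax").Value2 so that the code runs without Excel.

```powershell
# Hypothetical helper: converts a 2-D array (such as the one Range.Value2
# returns for a multi-cell range) into one object per data row.
function ConvertFrom-RangeArray {
    param(
        [Parameter(Mandatory)] [Array] $Values,
        [int] $SkipRows = 1   # number of header rows to drop
    )
    # Arrays coming from Excel are 1-based, .NET arrays are 0-based,
    # so ask the array for its own bounds instead of assuming either.
    $rowLo = $Values.GetLowerBound(0)
    $rowHi = $Values.GetUpperBound(0)
    $colLo = $Values.GetLowerBound(1)
    for ($r = $rowLo + $SkipRows; $r -le $rowHi; $r++) {
        [PSCustomObject]@{
            'Type'      = $Values.GetValue($r, $colLo)
            'Direction' = $Values.GetValue($r, $colLo + 1)
            'From'      = $Values.GetValue($r, $colLo + 2)
            'To'        = $Values.GetValue($r, $colLo + 3)
        }
    }
}

# Demo data standing in for: $values = $ws.Range("A1:D$rowMax").Value2
$values = New-Object 'object[,]' 3, 4
$header = 'Type', 'Direction', 'From', 'To'
$row1   = 'Call', 'In', 'Alice', 'Bob'
$row2   = 'Call', 'Out', 'Bob', 'Carol'
for ($c = 0; $c -lt 4; $c++) {
    $values.SetValue($header[$c], 0, $c)
    $values.SetValue($row1[$c], 1, $c)
    $values.SetValue($row2[$c], 2, $c)
}

$rows = @(ConvertFrom-RangeArray -Values $values)

# One Export-Csv call per workbook instead of one call per row
$csvPath = Join-Path ([System.IO.Path]::GetTempPath()) 'calls_demo.csv'
$rows | Export-Csv -Path $csvPath -NoTypeInformation
```

With -Append on the Export-Csv call, every workbook could write into the same output file, which would make the temporary per-workbook CSV files and the merge loop unnecessary.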

Related

Powershell - Creating Excel Workbook - Getting "Insufficient memory to continue the execution of the program"

I'm trying to create an Excel workbook, then populate the cells with data found from searching many txt files.
I read a file and extract all comments AFTER I find "IDENTIFICATION DIVISION" and BEFORE I find "ENVIRONMENT DIVISION"
I then populate two cells in my Excel workbook: cell one is the file name and cell two is the extracted comments.
I have 256GB of memory on the work server. Less than 5% is being used before PowerShell throws the memory error.
Can anyone see where I'm going wrong?
Thanks,
-Ron
$excel = New-Object -ComObject excel.application
$excel.visible = $False
$workbook = $excel.Workbooks.Add()
$diskSpacewksht= $workbook.Worksheets.Item(1)
$diskSpacewksht.Name = "XXXXX_Desc"
$col1=1
$diskSpacewksht.Cells.Item(1,1) = 'Program'
$diskSpacewksht.Cells.Item(1,2) = 'Description'
$CBLFileList = Get-ChildItem -Path 'C:\XXXXX\XXXXX' -Filter '*.cbl' -File -Recurse
$Flowerbox = @()
ForEach($CBLFile in $CBLFileList) {
$treat = $false
Write-Host "Processing ... $CBLFile" -foregroundcolor green
Get-content -Path $CBLFile.FullName |
ForEach-Object {
if ($_ -match 'IDENTIFICATION DIVISION') {
# Write-Host "Match IDENTIFICATION DIVISION" -foregroundcolor green
$treat = $true
}
if ($_ -match 'ENVIRONMENT DIVISION') {
# Write-Host "Match ENVIRONMENT DIVISION" -foregroundcolor green
$col1++
$diskSpacewksht.Cells.Item($col1,1) = $CBLFile.Name
$diskSpacewksht.Cells.Item($col1,2) = [String]$Flowerbox
$Flowerbox = @()
continue
}
if ($treat) {
if ($_ -match '\*(.{62})') {
Foreach-Object {$Flowerbox += $matches[1] + "`r`n"}
$treat = $false
}
}
}
}
$excel.DisplayAlerts = 'False'
$ext=".xlsx"
$path="C:\Desc.txt"
$workbook.SaveAs($path)
$workbook.Close
$excel.DisplayAlerts = 'False'
$excel.Quit()
Not knowing what the contents of the .CBL files could be, I would suggest not to try and do all of this using an Excel COM object, but create a CSV file instead to make things a lot easier.
When finished, you can simply open that csv file in Excel.
# create a List object to collect the 'flowerbox' strings in
$Flowerbox = [System.Collections.Generic.List[string]]::new()
$treat = $false
# get a list of the .cbl files and loop through. Collect all output in variable $result
$CBLFileList = Get-ChildItem -Path 'C:\XXXXX\XXXXX' -Filter '*.cbl' -File -Recurse
$result = foreach ($CBLFile in $CBLFileList) {
    Write-Host "Processing ... $($CBLFile.FullName)" -ForegroundColor Green
    # using switch -File is an extremely fast way of testing a file line by line.
    # instead of '-Regex' you can also do '-WildCard', but then add asterisks around the strings
    switch -Regex -File $CBLFile.FullName {
        'IDENTIFICATION DIVISION' {
            # start collecting Flowerbox lines from here
            $treat = $true
        }
        'ENVIRONMENT DIVISION' {
            # stop collecting Flowerbox lines and output what we already have
            # output an object with the two properties you need
            [PsCustomObject]@{
                Program     = $CBLFile.Name # or $CBLFile.FullName
                Description = $Flowerbox -join [environment]::NewLine
            }
            $Flowerbox.Clear() # empty the list for the next run
            $treat = $false
        }
        default {
            # as I have no idea what these lines may look like, I have to
            # assume your regex '\*(.{62})' is correct..
            if ($treat -and ($_ -match '\*(.{62})')) {
                $Flowerbox.Add($Matches[1])
            }
        }
    }
}
# now you have everything in an array of PSObjects so you can save that as Csv
$result | Export-Csv -Path 'C:\Desc.csv' -UseCulture -NoTypeInformation
The -UseCulture parameter makes Export-Csv use your locale's list separator as the delimiter, so you can double-click the file and it will open correctly in your Excel.
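As a small illustration of what -UseCulture does: Export-Csv then takes its delimiter from the current culture's list separator instead of always writing a comma. The sketch below uses only built-in cmdlets; the sample object and the temp file name are made up for the demo.

```powershell
# The delimiter -UseCulture picks: the list separator of the current culture
# (for example ',' for en-US, ';' for many European locales)
$separator = (Get-Culture).TextInfo.ListSeparator

# Made-up sample row, exported once with -UseCulture
$demoPath = Join-Path ([System.IO.Path]::GetTempPath()) 'useculture_demo.csv'
[PSCustomObject]@{ Program = 'DEMO.cbl'; Description = 'sample text' } |
    Export-Csv -Path $demoPath -UseCulture -NoTypeInformation

# The header line uses that separator between the two column names
$headerLine = Get-Content -Path $demoPath -TotalCount 1
$headerLine

# Import-Csv needs the same switch to read the file back correctly
$roundTrip = Import-Csv -Path $demoPath -UseCulture
```

Excel on the same machine generally splits columns on that same separator when a .csv file is opened by double-click, which is why the file lines up without an import wizard.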
You can also create an Excel file from this csv programmatically like:
$excel = New-Object -ComObject Excel.Application
$excel.Visible = $false
$workbook = $excel.Workbooks.Open('C:\Desc.csv')
$worksheet = $workbook.Worksheets.Item(1)
$worksheet.Name = "XXXXX_Desc"
# save as .xlsx
# 51 ==> [Microsoft.Office.Interop.Excel.XlFileFormat]::xlWorkbookDefault
# see: https://learn.microsoft.com/en-us/office/vba/api/excel.xlfileformat
$workbook.SaveAs('C:\Desc.xlsx', 51)
# quit Excel and remove all used COM objects from memory
$excel.Quit()
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($worksheet)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($workbook)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($excel)
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()

Is it possible to read an Excel file through PowerShell like this?

I have this Excel file.
Every row is an automation script I need to execute with certain parameters. I use Excel because every script receives different parameters. I need a PowerShell script that reads the Excel file and, for each row, executes that process id (script) and sends it those parameters.
Is there a way to do that? Is it doable?
So far I have this:
$file = "C:\Users\MX02689\Documents\Parametros.xlsx"
$sheetName = "Sheet1"
$objExcel = New-Object -ComObject Excel.Application
$workbook = $objExcel.Workbooks.Open($file)
$sheet = $workbook.Worksheets.Item($sheetName)
$objExcel.Visible=$false
$rowMax = ($sheet.UsedRange.Rows).count
$colMax = ($sheet.UsedRange.Columns).count
$rowName,$colName = 1,1
# the idea here is that for each row that has values, do this
for ($i = 1; $i -le $colMax - 1; $i++)
{
    # The idea here is that if (parameter1 -eq 1) {
    #     execute the command we use to send the script's process id; "parameter2 parameter3 parameter4"
    # } else {
    #     skip the row and go to the next one
    # }
    Write-Output ("" + $sheet.Cells.Item($rowName, $colName + $i).Text)
}
Am I heading in the right direction? Is what I'm trying to do doable? Is there a more optimized way to achieve this? Thank you for your help :)
Greetings
Using Excel is not the fastest or easiest way of doing this with PowerShell.
It can be done like this:
$file = "D:\Parametros.xlsx"
$objExcel = New-Object -ComObject Excel.Application
$workbook = $objExcel.Workbooks.Open($file)
$sheet = $workbook.Worksheets.Item(1)
$objExcel.Visible = $false
$rowMax = ($sheet.UsedRange.Rows).count
$colMax = ($sheet.UsedRange.Columns).count
for ($row = 2; $row -le $rowMax; $row++) { # skip the header row
$params = @()
for ($col = 1; $col -le $colMax; $col++) {
$params += $sheet.Cells.Item($row, $col).Value()
}
# execute the command. For demo, just show the parameters used
'Invoke-Command parameters: {0}' -f ($params -join ', ')
}
$objExcel.Quit()
# clean-up used Com objects
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($sheet) | Out-Null
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($workbook) | Out-Null
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($objExcel) | Out-Null
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
Far more convenient would be to save your Excel file as CSV and use that:
Import-Csv -Path 'D:\Parametros.csv' | ForEach-Object {
# execute the command. For demo, just show the parameters used
'Invoke-Command parameters: {0}, {1}, {2}, {3}, {4}' -f $_.'process id', $_.parameter1, $_.parameter2, $_.parameter3, $_.parameter4
}
Demo output for both methods:
Invoke-Command parameters: 235522, 1, testinguser3, Mko12345, something
Invoke-Command parameters: 235266, 0, testinguser4, Mko12346, something
Invoke-Command parameters: 235266, 1, testinguser5, Mko12347, something
From your comment, I now understand what the "1" or "0" means in parameter1.
Below is the adjusted code for the Excel method as well as the CSV method:
Method for Excel:
$file = "D:\Parametros.xlsx"
$objExcel = New-Object -ComObject Excel.Application
$workbook = $objExcel.Workbooks.Open($file)
$sheet = $workbook.Worksheets.Item(1)
$objExcel.Visible = $false
$rowMax = ($sheet.UsedRange.Rows).count
$colMax = ($sheet.UsedRange.Columns).count
for ($row = 2; $row -le $rowMax; $row++) { # skip the header row
$params = @()
for ($col = 1; $col -le $colMax; $col++) {
$params += $sheet.Cells.Item($row, $col).Value()
}
# if the second parameter value converted to int = 1, proceed; if 0 skip the line
if ([int]$params[1] -ne 0) {
# execute the command. For demo, just show the parameters used
'Invoke-Command parameters: {0}' -f ($params -join ', ').TrimEnd(", ")
}
}
$objExcel.Quit()
# clean-up used Com objects
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($sheet) | Out-Null
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($workbook) | Out-Null
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($objExcel) | Out-Null
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
Method for CSV file:
Import-Csv -Path 'D:\Parametros.csv' | ForEach-Object {
# get the field values from the row in array $params (not a fixed number of fields)
$params = @($_.PsObject.Properties).Value
# if the second parameter value converted to int = 1, proceed; if 0 skip the line
if ([int]$params[1] -ne 0) {
# execute the command. For demo, just show the parameters used
'Invoke-Command parameters: {0}' -f ($params -join ', ').TrimEnd(", ")
}
}

Merge content of multiple Excel files into one using PowerShell

I have multiple Excel files with different names in one folder,
e.g. C:\Users\XXXX\Downloads\report
Each file has a fixed number of columns.
e.g. Date | Downtime | Response
I want to create a new Excel file that merges all the Excel data. A new column should be added for the client name, in which I want to enter the file name. The data of each Excel file should then be appended below, one file after another.
e.g. Client name | Date | Downtime | Response
The code below is able to append all the Excel data, but I now need to add the Client name column.
$path = "C:\Users\XXXX\Downloads\report"
#Launch Excel, and make it do as it's told (suppress confirmations)
$Excel = New-Object -ComObject Excel.Application
$Excel.Visible = $True
$Excel.DisplayAlerts = $False
$Files = Get-ChildItem -Path $path
#Open up a new workbook
$Dest = $Excel.Workbooks.Add()
#Loop through files, opening each, selecting the Used range, and only grabbing the first 5 columns of it. Then find next available row on the destination worksheet and paste the data
ForEach($File in $Files)
{
$Source = $Excel.Workbooks.Open($File.FullName,$true,$true)
If(($Dest.ActiveSheet.UsedRange.Count -eq 1) -and ([String]::IsNullOrEmpty($Dest.ActiveSheet.Range("A1").Value2)))
{
#If there is only 1 used cell and it is blank select A1
[void]$source.ActiveSheet.Range("A1","E$(($Source.ActiveSheet.UsedRange.Rows|Select -Last 1).Row)").Copy()
[void]$Dest.Activate()
[void]$Dest.ActiveSheet.Range("A1").Select()
}
Else
{
#If there is data go to the next empty row and select Column A
[void]$source.ActiveSheet.Range("A2","E$(($Source.ActiveSheet.UsedRange.Rows|Select -Last 1).Row)").Copy()
[void]$Dest.Activate()
[void]$Dest.ActiveSheet.Range("A$(($Dest.ActiveSheet.UsedRange.Rows|Select -last 1).row+1)").Select()
}
[void]$Dest.ActiveSheet.Paste()
$Source.Close()
}
$Dest.SaveAs("$path\Merge.xls")
$Dest.close()
$Excel.Quit()
Suggest any effective way to do this. Please provide links if available.
Convert XLS to XLSX:
$xlFixedFormat = [Microsoft.Office.Interop.Excel.XlFileFormat]::xlWorkbookDefault
$excel = New-Object -ComObject excel.application
$excel.visible = $true
$folderpath = "C:\Users\xxxx\Downloads\report\*"
$filetype ="*xls"
Get-ChildItem -Path $folderpath -Include $filetype |
ForEach-Object `
{
$path = ($_.fullname).substring(0,($_.FullName).lastindexOf("."))
"Converting $path to $filetype..."
$workbook = $excel.workbooks.open($_.fullname)
$workbook.saveas($path, $xlFixedFormat)
$workbook.close()
}
$excel.Quit()
$excel = $null
[gc]::collect()
[gc]::WaitForPendingFinalizers()
If you are willing to use the external module Import-Excel, you could simply loop through the files like so:
$report_directory = ".\reports"
$merged_reports = @()
# Loop through each XLSX-file in $report_directory
foreach ($report in (Get-ChildItem "$report_directory\*.xlsx")) {
    # Loop through each row of the "current" XLSX-file
    $report_content = foreach ($row in Import-Excel $report) {
        # Create "custom" row
        [PSCustomObject]@{
            "Client name" = $report.Name
            "Date"        = $row."Date"
            "Downtime"    = $row."Downtime"
            "Response"    = $row."Response"
        }
    }
    # Add the "custom" data to the results-array
    $merged_reports += @($report_content)
}
# Create final report
$merged_reports | Export-Excel ".\merged_report.xlsx"
Please note that this code is not optimized in terms of performance, but it should allow you to get started.
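One thing that could be tightened is the += on the results array, which copies the whole array on every iteration. A minimal sketch of the same merge without it is shown here with CSV files and only built-in cmdlets, so it runs without Excel or the Import-Excel module. The demo folder, file names and sample values are invented for illustration; with the module installed, Import-Excel and Export-Excel could presumably take the place of Import-Csv and Export-Csv.

```powershell
# Invented demo data: two small "client" reports in a temp folder
$reportDir = Join-Path ([System.IO.Path]::GetTempPath()) 'merge_demo_reports'
New-Item -ItemType Directory -Path $reportDir -Force | Out-Null
"Date,Downtime,Response`n2020-01-01,5,120" | Set-Content -Path (Join-Path $reportDir 'ClientA.csv')
"Date,Downtime,Response`n2020-01-02,0,95" | Set-Content -Path (Join-Path $reportDir 'ClientB.csv')

# Let the pipeline collect the rows; no += on an array is needed
$merged = Get-ChildItem -Path (Join-Path $reportDir '*.csv') | ForEach-Object {
    $clientName = $_.BaseName
    Import-Csv -Path $_.FullName |
        # the calculated property puts the file name in front of the existing columns
        Select-Object @{ Name = 'Client name'; Expression = { $clientName } }, Date, Downtime, Response
}

# Write the result outside the input folder so it is not picked up on a rerun
$mergedPath = Join-Path ([System.IO.Path]::GetTempPath()) 'merged_report_demo.csv'
$merged | Export-Csv -Path $mergedPath -NoTypeInformation
```

Assigning the output of the pipeline to $merged lets PowerShell build the collection once, and the header row of each input file never ends up in the data because Import-Csv already turns it into property names.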

Passing CSV to Excel Workbook (Not From File)

I have a folder of CSV files that contain log entries. For each entry of the CSV, if the Risk property is not Low and not None then I put it in an accumulation CSV object. From there, I want to import it into an Excel Workbook directly WITHOUT having to save the CSV to file.
$CSVPaths = (Split-Path $PSCommandPath)
$AccumulateExportPath = (Split-Path $PSCommandPath)
$FileName="Accumulate"
$Acc=@()
Foreach ($csv in (Get-ChildItem C:\Scripts\Nessus\Sheets |? {$_.Extension -like ".csv" -and $_.BaseName -notlike "$FileName"}))
{
$Content = Import-CSV $csv.FullName
Foreach ($Log in $Content)
{
If ($Log.Risk -ne "None" -and $Log.Risk -ne "Low")
{
$Acc+=$Log
}
}
}
$CSV = $ACC |ConvertTo-CSV -NoTypeInformation
Add-Type -AssemblyName Microsoft.Office.Interop.Excel
$Script:Excel = New-Object -ComObject Excel.Application
$Excel.Visible=$True
#$Excel.Workbooks.OpenText($CSV) What should replace this?
Is there a Method like OpenText() that lets me pass a CSV object instead of a filepath to a CSV file or am I going to have to write my own conversion function?
Interesting question. I'm not aware of a method that allows you to pass a CSV object.
However, if your resulting CSV is not too big and you are using PowerShell 5.0+, you could convert the objects to a tab-delimited string and leverage Set-Clipboard:
$headers = ($csv | Get-Member | Where-Object {$_.MemberType -eq "NoteProperty"}).Name
$delim = "`t"
# headers
foreach($header in $headers){
$myString += $header + $delim
}
# trim delimiter at the end, and add new line
$myString = $myString.TrimEnd($delim)
$myString = $myString + "`n"
# loop over each line and repeat
foreach($line in $csv){
foreach($header in $headers){
$myString += $line.$header + $delim
}
$myString = $myString.TrimEnd($delim)
$myString = $myString + "`n"
}
# copy to clipboard
Set-Clipboard $myString
# paste into excel from clipboard (Paste is a worksheet method, so a workbook has to exist first)
$workbook = $Excel.Workbooks.Add()
$workbook.Worksheets.Item(1).Paste()
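The string building above can also be written with -join, which avoids the repeated += and TrimEnd calls. This is only a sketch: the sample log entries are invented, and it stops at producing the tab-delimited text, which could then be handed to Set-Clipboard and pasted into the worksheet as before.

```powershell
# Invented sample data standing in for the accumulated log entries
$entries = @(
    [PSCustomObject]@{ Computer = 'srv01'; Risk = 'High'; Name = 'Finding A' }
    [PSCustomObject]@{ Computer = 'srv02'; Risk = 'Medium'; Name = 'Finding B' }
)

$delim = "`t"
# column names, in their original order, taken from the first object
$headers = $entries[0].PSObject.Properties.Name

# one tab-joined line per entry
$dataLines = foreach ($entry in $entries) {
    ($headers | ForEach-Object { $entry.$_ }) -join $delim
}

# header line first, then the data lines, separated by newlines
$text = (@($headers -join $delim) + $dataLines) -join "`n"
$text

# Set-Clipboard $text   # uncomment to put the text on the clipboard
```

Taking the headers from PSObject.Properties keeps the columns in the order they appear in the source objects, whereas Get-Member returns them sorted by name.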
Here is another way to create an Excel spreadsheet from PowerShell without writing a .csv file.
$dirs = 'C:\src\t', 'C:\src\sql'
$records = $()
$records = foreach ($dir in $dirs) {
Get-ChildItem -Path $dir -File '*.txt' -Recurse |
Select-Object @{Expression={$_.FullName}; Label="filename"}
}
#open excel
$excel = New-Object -ComObject excel.application
$excel.visible = $false
#add a default workbook
$workbook = $excel.Workbooks.Add()
#remove worksheet 2 & 3
$workbook.Worksheets.Item(3).Delete()
$workbook.Worksheets.Item(2).Delete()
#give the remaining worksheet a name
$uregwksht = $workbook.Worksheets.Item(1)
$uregwksht.Name = 'File Names'
# Start on row 1
$i = 1
# the .appendix to $record refers to the column header in the csv file
foreach ($record in $records) {
$excel.cells.item($i,1) = $record.filename
$i++
}
#adjusting the column width so all data's properly visible
$usedRange = $uregwksht.UsedRange
$usedRange.EntireColumn.AutoFit() | Out-Null
#saving & closing the file
$outputpath = Join-Path -Path $Env:USERPROFILE -ChildPath "desktop\exceltest.xlsx"
$workbook.SaveAs($outputpath)
$excel.Quit()

Import csv into excel and specify cell format

I am trying to import multiple csv files into their own tabs in 1 excel workbook. I am having an issue with long number fields being displayed as exponential data and changing the last digit to 0. For example I have a 16 digit account number (1234567890123456) it is being displayed in excel as an exponential number (1.23457E+15). When I look at the actual number in the cell it is (1234567890123450). I assume if I make the column text before I bring it in, it will work, but I'm not sure how to do that. Here is my code.
$excel = New-Object -ComObject excel.application
$excel.visible = $False
$excel.displayalerts=$False
$workbook = $excel.workbooks.add()
$sheets = $workbook.sheets
$sheetCount = $Sheets.Count
$mySheet = 1
$mySheetName = "Sheet" + $mySheet
$s1 = $sheets | where {$_.name -eq $mySheetName }
$s1.Activate()
If($sheetCount -gt 1)
{
#Delete other Sheets
$Sheets | ForEach-Object {
$tmpSheetName = $_.Name
$tmpSheet = $_
If($tmpSheetName -ne "Sheet1"){$tmpSheet.Delete()}
}
}
#import csv files
$files = dir -Path $csvDir*.csv
ForEach($file in $files){
If($mySheet -gt 1){$s1 = $workbook.sheets.add()}
$s1.Name = $file.BaseName
$s1.Activate()
$s1Data = Import-Csv $file.FullName
$s1data | ConvertTo-Csv -Delimiter "`t" -NoTypeInformation | Clip
$s1.cells.item(1,1).Select()
$s1.Paste()
$mySheet ++
if (test-path $file ) { rm $file }
}
$workbook.SaveAs($excelTMGPath)
$workbook.Close()
$workbook = $null
#$excel.quit()
while ([System.Runtime.InteropServices.Marshal]::FinalReleaseComObject($excel)) {}
$excel = $null
Try this:
If $s1 points to the correct worksheet,
$s1.cells.item(1,1).NumberFormat="@"
("@" is Excel's Text format.) If that does not work, set NumberFormat on the cells where it is necessary. Use the format you prefer.
Change the file extension from .csv to .txt and adjust the file name in the code:
$files = dir -Path $csvDir*.txt

Resources