Command to retrieve all non-empty block-content from Excel workbooks? - excel

I have a sheet DataRepo of about 300 rows and 10 columns, and about 300 such Excel files in XLSX format. I need to read each Excel file and store it as a CSV (the original XLSX files are corrupted: other methods in Python/R fail with a KeyError unless the files are re-saved manually in Excel).
I am currently using $Sheet.Cells.Item(row, col).Text to get a single value as text, but I need the whole block: either I loop over the block with two nested loops and save it as CSV, or I find some ready-made method on $Sheet. Is there a ready-made PowerShell method? Which looping options are available in PowerShell?
How can I retrieve all non-empty content in an Excel sheet with PowerShell?
$XLSDoc = 'C:\Users\hhh\Desktop\1.xlsx'
$SheetName = "DataRepo"
$Excel = New-Object -ComObject "Excel.Application"
$Workbook = $Excel.Workbooks.Open($XLSDoc)
$Sheet = $Workbook.Worksheets.Item($SheetName)
#Get data:
$Sheet.Cells.Item(1,2).Text
Can I do something similar to VBA in PowerShell?
Dim i As Integer
Dim j As Integer
i = 1
Do While i < 10
' restart the column counter for every row
j = 1
Do While j < 10
Debug.Print Sheet.Cells.Item(i, j).Text
j = j + 1
Loop
i = i + 1
Loop

Use something like this to export each worksheet to a separate CSV:
$wbName = $Workbook.Name
$wbPath = $Workbook.Path
$Workbook.Worksheets | ForEach-Object {
$csvName = Join-Path $wbPath ('{0}_{1}.csv' -f $wbName, $_.Name)
$_.SaveAs($csvName, 6)
}

The following code wraps the code above in a function and then loops over all the xlsx files in a directory; the replacement and trimming were added to avoid the 216-character limit per file. It writes one CSV file per sheet to the directory of the workbook.
Function ExportXLSXToCSVs ($XLSDoc)
{
    $Excel = New-Object -ComObject "Excel.Application"
    $Excel.DisplayAlerts = $false #Suppress overwrite/format prompts
    $Workbook = $Excel.Workbooks.Open($XLSDoc)
    $wbName = $Workbook.Name
    $wbPath = $Workbook.Path
    $Workbook.Worksheets | ForEach-Object {
        $csvName = Join-Path $wbPath ('{0}_{1}.csv' -f $wbName, $_.Name)
        #Trim/replacements added to avoid the limit 216 chars per file
        $csvName = $csvName.Trim().Replace(" ", "")
        $_.SaveAs($csvName, 6) #6 = CSV
    }
    #Close without saving and quit, so no Excel process is left running per file
    $Workbook.Close($false)
    $Excel.Quit()
}
#DEMO 1 over a single file
#ExportXLSXToCSVs 'C:\Users\hhh\Desktop\1.xlsx'
#DEMO 2 over all files in a directory
Get-ChildItem "C:\Users\hhh\Desktop\Data\" -Filter *.xlsx | ForEach-Object {
    ExportXLSXToCSVs $_.FullName
}
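For the nested-loop approach the question asks about, here is a minimal sketch. It assumes the block has already been read in one call as a two-dimensional array, which is the shape $Sheet.UsedRange.Value2 returns for a multi-cell range (raw values, not the formatted .Text). The function name ConvertTo-CsvLine and the demo array are hypothetical; the demo array stands in for real worksheet data so the loop can be tried without Excel.

```powershell
# Hypothetical helper: turns a 2-D array into CSV lines and skips rows
# in which every cell is empty.
function ConvertTo-CsvLine {
    param($Block)

    # Arrays coming from Excel COM are 1-based, arrays built in PowerShell
    # are 0-based, so ask the array for its bounds instead of assuming them.
    $rowLo = $Block.GetLowerBound(0); $rowHi = $Block.GetUpperBound(0)
    $colLo = $Block.GetLowerBound(1); $colHi = $Block.GetUpperBound(1)

    for ($r = $rowLo; $r -le $rowHi; $r++) {
        $cells = for ($c = $colLo; $c -le $colHi; $c++) {
            $text = [string]$Block[$r, $c]          # $null becomes ''
            '"' + $text.Replace('"', '""') + '"'    # quote and escape the field
        }
        # Emit the row only if at least one field is non-empty
        if ($cells | Where-Object { $_ -ne '""' }) {
            $cells -join ','
        }
    }
}

# Demo data standing in for $Sheet.UsedRange.Value2 (assumption: 3 rows x 2 columns)
$demo = New-Object 'object[,]' 3, 2
$demo[0, 0] = 'Name';            $demo[0, 1] = 'Value'
# the second row is left empty on purpose
$demo[2, 0] = 'a "quoted" cell'; $demo[2, 1] = 42

$lines = @(ConvertTo-CsvLine -Block $demo)
$lines
```

With a real workbook the call would presumably be ConvertTo-CsvLine -Block $Sheet.UsedRange.Value2, piped to Set-Content. Note that Value2 returns a scalar rather than an array when the used range is a single cell, so that case would need separate handling.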

Related

Looping for-each and creating an excel sheet (.xlsx)

I'm trying to create a script that pulls print device from a group of servers housed in a text file. The script works fine except it only pulls one device from one server then the script completes. I'm trying to get this to work then loop in another command to combine all the data from all the sheets and look for dissimilarities between the server(s).
clear-host
# Get list of servers from text file
$sites = Get-Content -Path "User\user$\user\Documents\Working Folder\2132023\test.txt"
$counter = 4
# And here
foreach ($site in $sites) {
$result = Get-Printer -ComputerName $site | Select Name, DriverName, PortName, ShareName
#Create an Excel object
$ExcelObj = New-Object -comobject Excel.Application
$ExcelObj.Visible = $true
# Add a workbook
$ExcelWorkBook = $ExcelObj.Workbooks.Add()
$ExcelWorkSheet = $ExcelWorkBook.Worksheets.Item(1)
# Rename the worksheet
$ExcelWorkSheet.Name = $site
# Fill in the head of the table
$ExcelWorkSheet.Cells.Item(1, 1) = 'Device Name'
$ExcelWorkSheet.Cells.Item(1, 2) = 'Driver Name'
$ExcelWorkSheet.Cells.Item(1, 3) = 'Port Name'
$ExcelWorkSheet.Cells.Item(1, 4) = 'Share Name'
# Make the table head bold, set the font size and the column width
$ExcelWorkSheet.Rows.Item(1).Font.Bold = $true
$ExcelWorkSheet.Rows.Item(1).Font.size = 15
$ExcelWorkSheet.Columns.Item(1).ColumnWidth = 28
$ExcelWorkSheet.Columns.Item(2).ColumnWidth = 28
$ExcelWorkSheet.Columns.Item(3).ColumnWidth = 28
$ExcelWorkSheet.Columns.Item(4).ColumnWidth = 28
# Fill in Excel cells with the data obtained from the server
$ExcelWorkSheet.Columns.Item(1).Rows.Item($counter) = $result.Name
$ExcelWorkSheet.Columns.Item(2).Rows.Item($counter) = $result.DriverName
$ExcelWorkSheet.Columns.Item(3).Rows.Item($counter) = $result.PortName
$ExcelWorkSheet.Columns.Item(4).Rows.Item($counter) = $result.ShareName
$counter++
}
# Save the report and close Excel:
$ExcelWorkBook.SaveAs('\User\User\Documents\Working Folder\2132023\test.xlsx')
$ExcelWorkBook.Close($true)
That is because you are creating a new Excel COM object inside the loop.
Put that part above the loop, and inside the loop create a new worksheet for each server and fill in the data.
Because Get-Printer may very well return more than one object, you need to loop over its results too.
Try
# use full absolute path here
$outFile = 'X:\Somewhere\Documents\Working Folder\2132023\test.xlsx'
if (Test-Path -Path $outFile -PathType Leaf) { Remove-Item -Path $outFile -Force }
# Create an Excel object
$ExcelObj = New-Object -comobject Excel.Application
$ExcelObj.Visible = $true
# Add a workbook
$ExcelWorkBook = $ExcelObj.Workbooks.Add()
# Get list of servers from text file
$sites = Get-Content -Path "X:\Somewhere\Documents\Working Folder\2132023\test.txt"
foreach ($site in $sites) {
$counter = 2
# Add a sheet
$ExcelWorkSheet = $ExcelWorkBook.Sheets.Add()
# make this the active sheet
$ExcelWorkSheet.Activate()
# Rename the worksheet
$ExcelWorkSheet.Name = $site
# Fill in the head of the table
$ExcelWorkSheet.Cells.Item(1, 1) = 'Device Name'
$ExcelWorkSheet.Cells.Item(1, 2) = 'Driver Name'
$ExcelWorkSheet.Cells.Item(1, 3) = 'Port Name'
$ExcelWorkSheet.Cells.Item(1, 4) = 'Share Name'
# Make the table head bold, set the font size and the column width
$ExcelWorkSheet.Rows.Item(1).Font.Bold = $true
$ExcelWorkSheet.Rows.Item(1).Font.size = 15
$ExcelWorkSheet.Columns.Item(1).ColumnWidth = 28
$ExcelWorkSheet.Columns.Item(2).ColumnWidth = 28
$ExcelWorkSheet.Columns.Item(3).ColumnWidth = 28
$ExcelWorkSheet.Columns.Item(4).ColumnWidth = 28
# Fill in Excel cells with the data obtained from the server
Get-Printer -ComputerName $site | Select-Object Name, DriverName, PortName, ShareName | ForEach-Object {
$ExcelWorkSheet.Columns.Item(1).Rows.Item($counter) = $_.Name
$ExcelWorkSheet.Columns.Item(2).Rows.Item($counter) = $_.DriverName
$ExcelWorkSheet.Columns.Item(3).Rows.Item($counter) = $_.PortName
$ExcelWorkSheet.Columns.Item(4).Rows.Item($counter) = $_.ShareName
$counter++
}
}
# Save the report and close Excel:
$ExcelWorkBook.SaveAs($outFile)
$ExcelWorkBook.Close($true)
$ExcelObj.Quit()
# Clean up the used COM objects
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($ExcelWorkSheet)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($ExcelWorkBook)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($ExcelObj)
$null = [System.GC]::Collect()
$null = [System.GC]::WaitForPendingFinalizers()
P.S. The code would run faster if you set $ExcelObj.Visible = $false
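For the follow-up goal in the question, finding dissimilarities between servers, the comparison can be done on the objects themselves before anything is written to Excel. The following is only a sketch: the two arrays are made-up sample data standing in for the output of Get-Printer -ComputerName on two servers, and the printer names are hypothetical.

```powershell
# Hypothetical sample data in place of: Get-Printer -ComputerName <server> | Select-Object ...
$serverA = @(
    [PSCustomObject]@{ Name = 'HP-1';    DriverName = 'HP Universal';       PortName = 'IP_10.0.0.1'; ShareName = 'HP1' }
    [PSCustomObject]@{ Name = 'Canon-2'; DriverName = 'Canon Generic';      PortName = 'IP_10.0.0.2'; ShareName = 'CN2' }
)
$serverB = @(
    [PSCustomObject]@{ Name = 'HP-1';    DriverName = 'HP Universal';       PortName = 'IP_10.0.0.1'; ShareName = 'HP1' }
    [PSCustomObject]@{ Name = 'Canon-2'; DriverName = 'Canon Generic Plus'; PortName = 'IP_10.0.0.2'; ShareName = 'CN2' }
)

# Compare on all four properties; identical printers are dropped from the output.
# SideIndicator '<=' marks entries only on server A, '=>' entries only on server B.
$diff = Compare-Object -ReferenceObject $serverA -DifferenceObject $serverB `
                       -Property Name, DriverName, PortName, ShareName
$diff | Format-Table -AutoSize
```

Here the Canon-2 printer shows up twice, once per side, because its driver differs between the two servers.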

How to run in a loop for column A , find a value and print the corresponding data from column B

I am working on an xlsx file and I need to read values from column A and display the corresponding values in column B.
For example, column A has 100 rows and some of them contain a string. Column B (also 100 rows) has values too. I want to loop over all the cells in column A, store the matches, and print the corresponding values from column B.
I want to search for # and display 1, 2, 7 from B.
I need an object that holds the values from A and an object for B (for further actions).
The code below searches all the columns and displays the values.
What I need is to read only from a specific column, and I need an object that holds the values from A and B.
$data holds the data of column A.
How do I loop over it, search for a value, and then display the data from the same row in column B?
$ExcelFile = "C:\Temp\SharedFolder\Test.xlsx"
$excel = New-Object -ComObject Excel.Application
$Excel.visible = $false
$Excel.DisplayAlerts = $False # Disable confirmation prompts
$workbook = $excel.Workbooks.Open($ExcelFile)
$data = $workbook.Worksheets['Sheet1'].UsedRange.Rows.Columns[1].Value2
Doing this in Excel can be done, but takes a bit more work.
If this is your Excel file:
$ExcelFile = "D:\Test\Test.xlsx"
$searchValue = '#'
$excel = New-Object -ComObject Excel.Application
$Excel.Visible = $false
$Excel.DisplayAlerts = $False # Disable confirmation prompts
$workbook = $excel.Workbooks.Open($ExcelFile)
$worksheet = $workbook.Worksheets.Item(1)
# get the number of rows in the sheet
$rowMax = $worksheet.UsedRange.Rows.Count
# loop through the rows to test if the value in column 1 equals whatever is in $searchValue
# and capture the results in variable $result
$result = for ($row = 1; $row -le $rowMax; $row++) {
$val = $worksheet.Cells.Item($row, 1).Value2
if ($val -eq $searchValue) {
# output an object with both values from columns A and B
[PsCustomObject]@{A = $val; B = $worksheet.Cells.Item($row, 2).Value2}
}
}
# when done, quit Excel and remove the used COM objects from memory (important)
$excel.Quit()
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($worksheet)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($workbook)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($excel)
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
Now you can process the objects in $result. For demo just output:
$result
A B
- -
# 1
# 2
# 7
Of course, it would be far easier if you save your Excel file as CSV.
$searchValue = '#'
$result = Import-Csv -Path 'D:\Test\Test.csv' -UseCulture | Where-Object { $_.A -eq $searchValue }
$result
When exporting an Excel file to Csv, Excel won't always use the comma as delimiter character. That depends on your machine's local settings. This is the reason I added switch -UseCulture to the Import-Csv cmdlet which will make sure it uses the same delimiter character your locally installed Excel uses for its output.
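The CSV route can be tried end to end without Excel. The sketch below writes a small sample file with Export-Csv -UseCulture, so the delimiter matches the local culture, and then filters it as in the answer. The file name ColumnSearchDemo.csv and the sample rows are assumptions made for the demo.

```powershell
# Hypothetical sample file in the temp folder, standing in for the exported worksheet
$csvPath = Join-Path ([System.IO.Path]::GetTempPath()) 'ColumnSearchDemo.csv'

@(
    [PSCustomObject]@{ A = '#'; B = 1 }
    [PSCustomObject]@{ A = 'x'; B = 3 }
    [PSCustomObject]@{ A = '#'; B = 2 }
    [PSCustomObject]@{ A = 'y'; B = 5 }
    [PSCustomObject]@{ A = '#'; B = 7 }
) | Export-Csv -Path $csvPath -UseCulture -NoTypeInformation

$searchValue = '#'
# Read with the same culture-specific delimiter that was used for writing
$result = Import-Csv -Path $csvPath -UseCulture | Where-Object { $_.A -eq $searchValue }

# $result holds both columns; $result.B is just column B (as strings) for the matching rows
$result.B

Remove-Item -Path $csvPath
```

Import-Csv returns every field as a string, so the values from column B need a cast such as [int] before doing arithmetic on them.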

Reading last row of specific column in Excel sheet and appending more data - in PowerShell

I'm trying to write data to a specific column in an Excel spreadsheet with PowerShell. I would like to start below the last row with data and continue downwards. I don't have Excel installed on the machine, so COM won't work for me. I'm currently using Import-Excel to read the whole sheet and Open-ExcelPackage to read specific cell values.
I could do this in CSV file as opposed to .xlsx if it's easier.
Any help would be great!
Download PSExcel module from https://github.com/RamblingCookieMonster/PSExcel Import it using Import-Module.
Then use the following code:
$File = "Path to xlsx file"
$WSName = "SheetName"
$Excel = New-Excel -Path $File
$Worksheet = $Excel | Get-WorkSheet -Name $WSName
$SampleRows = @()
$SampleRows += [PSCustomObject]@{"A" = 1; "B" = 2; "C" = 3; "F" = 4 }
$row_to_insert = $SampleRows.count
$Worksheet.InsertRow($Worksheet.Dimension.Rows,$row_to_insert)
$WorkSheet.Cells["$($Worksheet.Dimension.Start.Address -replace '\d')$($Worksheet.Dimension.End.Row):$($Worksheet.Dimension.End.Address)"].Copy($WorkSheet.Cells["$($Worksheet.Dimension.Start.Address -replace '\d')$($Worksheet.Dimension.End.Row - $row_to_insert):$($Worksheet.Dimension.End.Address -replace '\d')$($Worksheet.Dimension.End.Row - $row_to_insert)"]);
$WorkSheet.Cells["$($Worksheet.Dimension.Start.Address -replace '\d')$($Worksheet.Dimension.End.Row):$($Worksheet.Dimension.End.Address)"] | % {$_.Value = ""}
ForEach ($Row in $SampleRows) {
ForEach ($data in $Row.PSObject.Properties.Name) {
$WorkSheet.Cells["$data$($Worksheet.Dimension.Rows)"].Value = $Row.$data
}
}
$Excel | Close-Excel -Save
This code adds 1 row after the last row in the selected worksheet and fills it with the values from $SampleRows. I think you get the idea. If you need to, add more rows to the $SampleRows array.
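Since the question allows CSV as an alternative, appending below the last row can also be done with the built-in Export-Csv -Append and no extra module. This is a sketch only: the file name AppendDemo.csv and the column names A, B and C are made up for the demo, and the new row fills just column B while leaving the other columns empty.

```powershell
# Hypothetical demo file in the temp folder
$csvPath = Join-Path ([System.IO.Path]::GetTempPath()) 'AppendDemo.csv'

# Existing data: a header plus two rows
@(
    [PSCustomObject]@{ A = 1; B = 2; C = 3 }
    [PSCustomObject]@{ A = 4; B = 5; C = 6 }
) | Export-Csv -Path $csvPath -NoTypeInformation

# Number of data rows before appending (the header is not counted)
$rowsBefore = @(Import-Csv -Path $csvPath).Count

# Append one row below the last one; only column B carries a value.
# The appended object must have the same property names as the file's header.
[PSCustomObject]@{ A = ''; B = 'new value'; C = '' } |
    Export-Csv -Path $csvPath -NoTypeInformation -Append

$all = @(Import-Csv -Path $csvPath)
$all | Format-Table -AutoSize

Remove-Item -Path $csvPath
```

The appended row lands directly after the previous last row, so there is no need to compute the last row number first.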

powershell excel get first row(header) column count

I'm trying to count the cells in the first row (A1-D1), i.e. the header, and use that count as a counter.
Most examples I have found use UsedRange to count the columns:
$headercolcount=($worksheet.UsedRange.Columns).count
But UsedRange captures the maximum column count across the whole active sheet, which does not match the column count of the first row if there is wider data below the header.
I only want the count for the first row.
Update:
For a clearer view, here is an example.
Cells F1 and G1 are empty, so the answer should be 5, because A1-E1 contain data. How can I get that 5 correctly?
Get-Process excel | Stop-Process -Force
# Specify the path to the Excel file and the WorkSheet Name
$FilePath = "C:\temp\A_A.xlsx"
$SheetName = "Blad1" # In english this is probably Sheet1
# Create an Object Excel.Application using Com interface
$objExcel = New-Object -ComObject Excel.Application
# Disable the 'visible' property so the document won't open in excel
$objExcel.Visible = $false
$objExcel.DisplayAlerts = $false
# Open Excel file and in $WorkBook
$WorkBook = $objExcel.Workbooks.Open($FilePath)
# Load WorkSheet 'Blad 1' in variable Worksheet
$WorkSheet = $WorkBook.sheets.item($SheetName)
$xlup = -4162
$lastRow = $WorkSheet.cells.Range("A1048576").End($xlup).row
# get the highest amount of columns
$colMax = ($WorkSheet.UsedRange.Columns).count
# initiate a counter
$count = 0
# set the row you'd like to count
$row = 1
# walk columns 1..$colMax of that row and count the filled cells
for ($i = 1; $i -le $colMax; $i++){
if($WorkSheet.Cells.Item($row, $i).Text){
$count++
}
}
$count
This should work. It takes the highest column count and then loops up to that number. During the loop it checks whether the cell in that row is filled; if it is, it increments the counter.
If you have millions of rows, this might not be the best way, but it works for me.
I've tested it with an Excel file:
With
$row = 1 this will give: 5
$row = 2 this will give: 6
$row = 3 this will give: 7
$row = 4 this will give: 8
# Specify the path to the Excel file and the WorkSheet Name
$FilePath = "C:\temp\A_A.xlsx"
$SheetName = "Blad1" # In english this is probably Sheet1
# Create an Object Excel.Application using Com interface
$objExcel = New-Object -ComObject Excel.Application
# Disable the 'visible' property so the document won't open in excel
$objExcel.Visible = $false
$objExcel.DisplayAlerts = $false
# Open Excel file and in $WorkBook
$WorkBook = $objExcel.Workbooks.Open($FilePath)
# Load WorkSheet 'Blad 1' in variable Worksheet
$WorkSheet = $WorkBook.sheets.item($SheetName)
$xlup = -4162
$lastRow = $WorkSheet.cells.Range("A1048576").End($xlup).row
$amountofcolumns = $worksheet.UsedRange.Rows(1).Columns.Count
#OUTPUT
write-host "Last Used row:" $lastRow
Write-host "Amount of columns" $amountofcolumns
#show all columnnames
for($i = 1 ; $i -le $amountofcolumns; $i++){
$worksheet.Cells.Item(1,$i).text
}
This will show you how many rows you have and will list all values in the first row, i.e. your column titles.
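The counting step itself does not depend on Excel, so it can be separated from the COM calls and tried on its own. In the sketch below the function name Get-FilledCellCount and the sample header are hypothetical; the sample mirrors the question's example, where A1-E1 are filled and F1 and G1 are empty.

```powershell
# Hypothetical helper: counts the values that are neither $null, empty nor whitespace
function Get-FilledCellCount {
    param([object[]]$Values)
    @($Values | Where-Object { -not [string]::IsNullOrWhiteSpace([string]$_) }).Count
}

# Sample header standing in for the .Text values of row 1 (assumption)
$header = 'ID', 'Name', 'Date', 'Qty', 'Price', '', $null
$headerCount = Get-FilledCellCount -Values $header
$headerCount

# With a real worksheet the values could presumably be collected like this (not run here):
#   $texts = 1..$colMax | ForEach-Object { $WorkSheet.Cells.Item(1, $_).Text }
#   Get-FilledCellCount -Values $texts
# If the header has no gaps, the last filled column can also be located directly,
# in the same way the answer uses End($xlup) for rows (xlToLeft = -4159):
#   $WorkSheet.Cells.Item(1, $WorkSheet.Columns.Count).End(-4159).Column
```

The two commented variants differ when the header has gaps: the first counts filled cells, the second returns the position of the last filled one.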

Export as CSV instead of a XLS file

I have a script that places everything nicely into a spreadsheet. The problem is, I need it to export as a CSV file instead. All the foreach loops are completely baffling me as far as where to put the CSV export in the script. If someone could school me on how to get the fields into a CSV file, it would be greatly appreciated.
$date = 0
$date = get-date -format "yyyy-MMM-dd-hhmm"
$date
#New Excel Application
$Excel = New-Object -Com Excel.Application
$Excel.visible = $False
# Create 1 worksheets
$Excel = $Excel.Workbooks.Add()
# Assign each worksheet to a variable and
# name the worksheet.
$Sheet1 = $Excel.Worksheets.Item(1)
$Sheet1.Name = "HH_SERVERS"
#Create Heading for General Sheet
$Sheet1.Cells.Item(1, 1) = "Machine_Name"
$Sheet1.Cells.Item(1, 2) = "OS"
$Sheet1.Cells.Item(1, 3) = "Software"
$Sheet1.Cells.Item(1, 4) = "Vendor"
$Sheet1.Cells.Item(1, 5) = "Version"
$colSheets = ($Sheet1)
foreach ($colorItem in $colSheets)
{
$intRow = 2
$intRowDisk = 2
$intRowSoft = 2
$intRowNet = 2
$WorkBook = $colorItem.UsedRange
$WorkBook.Interior.ColorIndex = 20
$WorkBook.Font.ColorIndex = 11
$WorkBook.Font.Bold = $True
}
#Auto Fit all sheets in the Workbook
foreach ($colorItem in $colSheets)
{
$WorkBook = $colorItem.UsedRange
$WorkBook.EntireColumn.AutoFit()
clear
}
$Servers = get-content "c:\temp\HH_Servers.txt"
foreach ($Server in $Servers)
{
$GenItems2 = gwmi Win32_OperatingSystem -Comp $Server
$Software = gwmi Win32_Product -Comp $Server
# Populate General Sheet(1) with information
foreach ($objItem in $GenItems2)
{
$Sheet1.Cells.Item($intRow, 2) = $objItem.Caption
}
#Populate Software Sheet
foreach ($objItem in $Software)
{
$Sheet1.Cells.Item($intRowSoft, 1) = $Server
$Sheet1.Cells.Item($intRowSoft, 3) = $objItem.Name
$Sheet1.Cells.Item($intRowSoft, 4) = $objItem.Vendor
$Sheet1.Cells.Item($intRowSoft, 5) = $objItem.Version
$intRowSoft = $intRowSoft + 1
}
}
$outputfile = "c:\temp\" + $date.toString() + "-HH_Server_Software"
$Excel.SaveAs($outputfile)
$Excel.Close()
Write-Host "*******************************" -ForegroundColor Green
Write-Host "The Report has been completed." -ForeGroundColor Green
Write-Host "*******************************" -ForegroundColor Green
# ========================================================================
# END of Script
# ==================
You can't save an entire workbook as CSV. You need to save the individual worksheet instead. The file format value for CSV is 6 (don't remember where I found that out though):
$xlCSV = 6
$outputfile = "c:\temp\" + $date.toString() + "-HH_Server_Software.csv"
$Sheet1.SaveAs($outputfile, $xlCSV)
(Tested on Windows 7 with Excel 2013.)
Thanks to @Matt for a comment with a link to the XLFileFormat Enumerations.
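If the CSV file is the only goal, Excel can be left out altogether: build one object per row and pipe the collection to Export-Csv. The sketch below is not the asker's script. The inventory rows are invented sample data standing in for the Win32_OperatingSystem and Win32_Product results, and the file is written to the temp folder instead of c:\temp.

```powershell
# Hypothetical rows; in the real script one object would be emitted per server/product pair
$inventory = @(
    [PSCustomObject]@{ Machine_Name = 'SRV01'; OS = 'Windows Server 2019'; Software = 'ExampleApp';  Vendor = 'Example Corp'; Version = '1.0.0' }
    [PSCustomObject]@{ Machine_Name = 'SRV02'; OS = 'Windows Server 2022'; Software = 'ExampleTool'; Vendor = 'Example Corp'; Version = '2.3.1' }
)

$date = Get-Date -Format 'yyyy-MMM-dd-HHmm'
$outputfile = Join-Path ([System.IO.Path]::GetTempPath()) ($date + '-HH_Server_Software.csv')

# The property names become the header row; no Excel COM object is involved
$inventory | Export-Csv -Path $outputfile -NoTypeInformation

# Read the file back to check what was written
$check = @(Import-Csv -Path $outputfile)
$check | Format-Table -AutoSize

Remove-Item -Path $outputfile
```

In the original script the two inner foreach loops would emit such objects instead of writing to $Sheet1.Cells, which also removes the need to track row counters.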

Resources