Excel add Row Grouping using powershell - excel

I have the CSV file below. I want to import it into Excel and add row grouping for the child items using PowerShell. I was able to open the file and format a cell, but I am not sure how to add row grouping.
Data
name,,
one,,
,value1,value2
,value3 ,value4
two,,
,value4,sevalue4
,value5,sevalue5
,value6,sevalue6
,value7,sevalue7
three,,
,value8,sevalue8
,value9,sevalue9
,value10,sevalue10
,value11,sevalue11
I want to convert like this in excel.
Here is the code I have so far to open it in Excel.
$a = New-Object -comobject Excel.Application
$a.visible = $True
$b = $a.Workbooks.Open("C:\shared\c1.csv")
$c = $b.Worksheets.Item(1)
$d = $c.Cells(1,1)
$d.Interior.ColorIndex = 19
$d.Font.ColorIndex = 11
$d.Font.Bold = $True
$b.Save("C:\shared\c1.xlsx")
How do I add row grouping for this data?
Thanks
SR

Logic Applied:
Group all the consecutive rows for which the value in column A is blank
The code below opens a CSV file, groups the rows according to the data you shared, and saves the result. Row grouping cannot be stored in CSV format, so the file is saved as a normal workbook (.xlsx) instead.
Code
$objExl = New-Object -ComObject Excel.Application
$objExl.visible = $true
$objExl.DisplayAlerts = $false
$strPath = "C:\Users\gurmansingh\Documents\a.csv" #Enter the path of csv
$objBook = $objExl.Workbooks.open($strPath)
$objSheet = $objBook.Worksheets.item(1)
$intRowCount = $objSheet.usedRange.Rows.Count
for ($i = 1; $i -le $intRowCount; $i++)
{
    # a blank cell in column A marks the first row of a child block
    if ($objSheet.Cells.Item($i, 1).Text -eq "")
    {
        $startRow = $i
        # advance $i to the last consecutive row whose column A is blank
        while ($i -lt $intRowCount -and $objSheet.Cells.Item($i + 1, 1).Text -eq "")
        {
            $i++
        }
        # group the whole block; a non-blank last row is never included
        $objSheet.Range("A${startRow}:A$i").Rows.Group()
    }
}
$objBook.SaveAs("C:\Users\gurmansingh\Documents\b",51) #saving in a different format.
$objBook.Close()
$objExl.Quit()
Before:
a.csv
Output after running the code:
b.xlsx
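The grouping rule ("every run of consecutive rows whose column A is blank") can also be worked out without Excel, which makes it easy to check before touching the workbook. The following is only a sketch: Get-BlankRun is a hypothetical helper name, and the array is column A of the sample data from the question typed in by hand.

```powershell
# Hypothetical helper: returns the 1-based start/end row of every run of blank values.
function Get-BlankRun {
    param([string[]]$ColumnA)

    $runs  = @()
    $start = 0                          # 0 means "not inside a run right now"
    for ($row = 1; $row -le $ColumnA.Count; $row++) {
        $isBlank = [string]::IsNullOrWhiteSpace($ColumnA[$row - 1])
        if ($isBlank -and $start -eq 0) {
            $start = $row               # a new run begins on this row
        }
        elseif (-not $isBlank -and $start -ne 0) {
            $runs += [pscustomobject]@{ Start = $start; End = $row - 1 }
            $start = 0
        }
    }
    if ($start -ne 0) {                 # the data ended while still inside a run
        $runs += [pscustomobject]@{ Start = $start; End = $ColumnA.Count }
    }
    return $runs
}

# Column A of the sample CSV, header row included (14 rows)
$columnA = 'name','one','','','two','','','','','three','','','',''
Get-BlankRun -ColumnA $columnA | ForEach-Object { "$($_.Start):$($_.End)" }
```

For the sample data this yields the row ranges 3:4, 6:9 and 11:14. Each pair could then be passed to the same call the answer uses, for example $objSheet.Range("A$($_.Start):A$($_.End)").Rows.Group().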

Also, check out how easy it is to do using my Excel PowerShell module.
Install-Module ImportExcel
https://github.com/dfinke/ImportExcel/issues/556#issuecomment-469897886

Related

Powershell - Delete excel rows that contain a word

I'm really new to Powershell and I feel like I've looked all over and can't quite figure out what is wrong with my code.
My goal is a powershell script that can run against an Excel workbook and delete rows with a specific string in the cell (in this case it is local admin accounts).
Currently my script launches Excel and opens the sheet, but no rows are deleted. The code exits without error. Any help would be greatly appreciated.
$ObjExcelCellTypeLastCell = 11
$ObjExcel = New-Object -ComObject Excel.Application
$ObjExcel.Visible = $True
$ObjExcel.DisplayAlerts = $True
$Workbook = $ObjExcel.Workbooks.Open("File\Path\")
$Worksheet = $Workbook.Worksheets.Item(1)
$used = $Worksheet.usedRange
$lastCell = $used.SpecialCells($ObjExcelCellTypeLastCell)
$row = $lastCell.row
for ($i = $Worksheet.usedrange.rows.count; $i -gt 0; $i--)
{
If ($Worksheet.Cells.Item($i, 1) = "Local Admin") {
$Range = $Worksheet.Cells.Item($i, 1).EntireRow
$Range.Delete()
$i = $i + 1
Else
Break
}
Exit
}
I don't know much about PowerShell, but I think your if statement $Worksheet.Cells.Item($i, 1) = "Local Admin" is wrong; you should use -eq.
Also, you may need to call the Close method on the workbook object that you opened.
I am not sure whether this is solved already, but my code looks like the following. It is not exactly the same as what I use, but I think it will work.
#get last row
$rowLast = $WorkSheet.UsedRange.Rows.Count
#for loop
for ($row = $rowLast; $row -gt 0; $row--) {
if($WorkSheet.Cells.Item($row, 1).Text -eq "Local Admin"){
#delete the row. Without "[void]", you will get message "True" when successfully deleted the row.
[void]$WorkSheet.Rows($row).Delete()
}
}
I think you need ".Text" after "$Worksheet.Cells.Item($i, 1)".
Also, I think the following lines should be removed.
$i = $i + 1
Else
Break
Exit

Powershell using ImportExcel to delete rows

I am trying to delete rows of data from an Excel file using the ImportExcel module.
I can open the file and find the data I wish to delete. The DeleteRow command works on a hardcoded value, but it does not appear to work on a variable. Any ideas?
# Gets ImportExcel PowerShell Module
if (-not(Get-Module -ListAvailable -Name ImportExcel)) {
Find-module -Name ImportExcel | Install-Module -Force
}
# Open Excel File
$excel = open-excelpackage 'C:\temp\input.xlsx'
#Set Worksheet
$ws = $excel.Workbook.Worksheets["Sheet1"]
#Get Row Count
$rowcount = $ws.Dimension.Rows
#Delete row if Cell in Column 15 = Yes
for ($i = 2; $i -lt $rowcount; $i++) {
$cell = $ws.Cells[$i, 15]
if ($cell.value -eq "Yes") {
$ws.DeleteRow($i)
}
}
#Save File
Close-ExcelPackage $excel -SaveAs 'C:\Temp\Output.xlsx'
You should reverse the loop and go from the bottom row to the top. As written, deleting a row shifts the index of every row below it, so your for ($i = 2; $i -lt $rowcount; $i++) {..} loop skips the row that moves up into the deleted position.
You can also do this without the ImportExcel module if you have Excel installed:
$file = 'C:\Temp\input.xlsx'
$excel = New-Object -ComObject Excel.Application
$excel.Visible = $false
# open the Excel file
$workbook = $excel.Workbooks.Open($file)
$sheet = $workbook.Worksheets.Item(1)
# get the number of rows in the sheet
$rowMax = $sheet.UsedRange.Rows.Count
# loop through the rows to test if the value in column 15 is "Yes"
# do the loop BACKWARDS, otherwise the indices will change on every deletion.
for ($row = $rowMax; $row -ge 2; $row--) {
$cell = $sheet.Cells.Item($row, 15).Value2
if ($cell -eq 'Yes') {
$null = $sheet.Rows($row).EntireRow.Delete()
}
}
# save and exit
$workbook.SaveAs("C:\Temp\Output.xlsx")
$excel.Quit()
# clean up the COM objects used
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($sheet)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($workbook)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($excel)
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
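To see why the loop has to run backwards, the index shift can be reproduced without Excel. This is a minimal sketch that uses a generic list as a stand-in for worksheet rows; the values are made up, and ::new() assumes PowerShell 5 or later.

```powershell
# Stand-in for worksheet rows; two adjacent matches expose the problem.
$values = 'Header', 'Yes', 'Yes', 'No', 'Yes'

# Forward loop: after RemoveAt the next item slides into index $i and is never tested.
$forward = [System.Collections.Generic.List[string]]::new([string[]]$values)
for ($i = 1; $i -lt $forward.Count; $i++) {
    if ($forward[$i] -eq 'Yes') { $forward.RemoveAt($i) }
}

# Backward loop: a removal only shifts items that were already tested.
$backward = [System.Collections.Generic.List[string]]::new([string[]]$values)
for ($i = $backward.Count - 1; $i -ge 1; $i--) {
    if ($backward[$i] -eq 'Yes') { $backward.RemoveAt($i) }
}

"forward : $($forward -join ',')"
"backward: $($backward -join ',')"
```

The forward pass leaves Header,Yes,No behind because the second "Yes" moved into the slot that had just been checked, while the backward pass leaves Header,No.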

Read csv stream line by line to create an array for Excel Range

This is my first post - I will be happy to make any corrections required for any mistakes made in the post.
I have been looking through the forums here for a few months and have learned a lot but I cannot seem to accomplish my goal with what I have found.
I need to read a CSV file (read-only) whenever it changes and place the resulting array into an active, open Excel 2016 tab. I can do this using COM and System.IO.WatcherChangeTypes, but this is too slow and requires copy and paste.
I need to read the csv as fast as possible (under a second) and convert the lines into a usable array for Excel. This whole process has to take under 2 seconds MAX. Some of the CSV's will exceed 180,000 lines as the day goes on.
I work for a Trading Company.
I would be happy with a single column, tab delimited, and multiple rows. I can't get the multiple rows.
I have to write the range line by line and that takes too long.
I was looking at this one but I am not clear on how to make the whole thing dynamic. There is no set amount of headers and the rows will change as well. I cannot work with any static data at all.
This is the post which prompted me to ask for help: How to use powershell to reorder CSV columns
$export = "\\UNC\to\file\Name.csv"
#$excel = New-Object -ComObject Excel.Application
#$excel.visible = $true
#$workbook = $excel.Workbooks.Add()
$reader = [System.IO.File]::OpenText($export)
$writer = New-Object System.IO.StreamWriter "data2.csv"
for(;;) {
$line = $reader.ReadLine()
if ($null -eq $line) {
break
}
$i=1
$data = $line.Split(",") | %{
if($_ -ne $null)
{
Write-Host $_ $i
++$i
}
}
[void]$data.Length
# $data.GetValue()
#$writer.WriteLine('{0},{1},{2}', $data[0], $data[1], $data[2])
}
$reader.Close()
#$writer.Close()
Any help will be greatly appreciated!
UPDATE:
I figured it out. The result is probably not the most efficient, but it gets me what I need for now while I explore how to do it better with what I have learned.
(Measure-Command { $data = [System.io.File]::Open($export, 'Open', 'Read', 'ReadWrite')
$reader = New-Object System.IO.StreamReader($data)
$count = 0
While($text = $reader.Readline())
{
If($text -eq $null)
{
$reader.Close()
$data.close()
}
++$count
}
}).TotalSeconds
$array2 = New-Object 'object[,]' $count,1
$end = ++$count
$file = New-Object System.IO.StreamReader -ArgumentList $export
$stringBuilder = New-Object System.Text.StringBuilder
$list = New-Object System.Collections.Generic.List[System.String]
$a = 0
Measure-Command {
While ($i = $file.ReadLine() -Replace ",","`t")
{
if ($i -eq $null)
{
$file.close()
break loop
}
$null = $stringBuilder.Append($i)
$list.Add($i)
$array2[$a,0] = $i
++$a
}
$outputString = $stringBuilder.ToString()
$array = $list.ToArray()
}
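For the dynamic case described in the question (no fixed number of columns or rows), the lines can be split into a rectangular object[,] whose size is taken from the data itself. This is a sketch under assumptions: the three sample lines are made up, and in real use they would come from something like [System.IO.File]::ReadAllLines($export).

```powershell
# Made-up sample lines standing in for the CSV content; the last one is short on purpose.
$lines = @('sym,bid,ask', 'ABC,1.10,1.12', 'XYZ,2.00')

# Size the grid from the data: one row per line, as many columns as the widest line.
$rows = $lines.Count
$cols = [int]($lines | ForEach-Object { $_.Split(',').Count } |
        Measure-Object -Maximum).Maximum

$grid = New-Object 'object[,]' $rows, $cols
for ($r = 0; $r -lt $rows; $r++) {
    $fields = $lines[$r].Split(',')
    for ($c = 0; $c -lt $fields.Count; $c++) {
        $grid[$r, $c] = $fields[$c]      # cells beyond a short line stay $null
    }
}

"{0} rows x {1} columns" -f $grid.GetLength(0), $grid.GetLength(1)
```

A two-dimensional array like this can usually be assigned to a range of the same size in one step through the range's Value2 property, which avoids writing the sheet line by line. Note that a plain Split(',') does not handle quoted fields that contain commas.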
You can do something like this
import numpy as np
import pandas as pd
data = pd.read_csv("data1.csv", sep=r'\s+', header=None)
dataarraynew13phse = np.array(data)
dataarraynew13phse = dataarraynew13phse.flatten()
Setting sep=r'\s+' splits each line on any run of whitespace, including tabs, across all lines of the file.
flatten() then collapses the result into a single one-dimensional array.

Powershell script to match condition of excel cell values

I am a novice PowerShell programmer trying to search an Excel sheet and change the format and font of matching cells. In the snippet below I search for the word "PASSED" and want to make it green and bold. Currently the code exits without changing anything, and I cannot figure out what is wrong. I need help with this.
$excel = New-Object -ComObject Excel.Application
$excel.Visible = $false
$excel.DisplayAlerts = $False
$workbook = $excel.Workbooks.Open("C:\test.xlsx")
$sheet = $workbook.ActiveSheet
$xlCellTypeLastCell = 11
$used = $sheet.usedRange
$lastCell = $used.SpecialCells($xlCellTypeLastCell)
$row = $lastCell.row # goes to the last used row in the worksheet
for ($i = 1; $i -lt $row.length; $i++) {
If ($sheet.cells.Item(1,2).Value() = "PASSED") {
$sheet.Cells.Item(1,$i+1).Font.ColorIndex = 10
$sheet.Cells.Item(1,$i+1).Font.Bold = $true
}
}
$workbook.SaveAs("C:\output.xlsx")
$workbook.Close()
Input(test.xlsx) file has the following
Module | test | Status
ABC a PASSED
It's quite a huge file, with a different status for each unit test.
$row is a number holding the last used row, not a collection, so comparing against its Length property in the for loop will land you in trouble: PowerShell reports a Length of 1 for a single value, and the condition 1 -lt 1 is false before the loop body ever runs.
Change it to:
for ($i = 1; $i -lt $row; $i++) {
In the if statement inside the loop, there's another problem: =
In order to compare two values for equality, use the -eq operator instead of = (= is only for assignment):
if ($sheet.cells.Item($i,2).Value() -eq "PASSED") {
$sheet.Cells.Item(1,$i+1).Font.ColorIndex = 10
$sheet.Cells.Item(1,$i+1).Font.Bold = $true
}
Lastly, Excel cell references are not zero-based, so Item(1,2) will refer to the cell that in your example has the value "test" (notice how it takes a row as the first parameter, and a column as the second). Change it to Item(2,3) to test against the correct cell, and transpose the cell coordinates inside the if block as well.
You may want to update the for loop to reflect this as well:
for ($i = 2; $i -le $row; $i++) {
    # -eq compares; a single = would be an assignment
    if ($sheet.cells.Item($i,3).Value() -eq "PASSED") {
        $sheet.Cells.Item($i,3).Font.ColorIndex = 10
        $sheet.Cells.Item($i,3).Font.Bold = $true
    }
}
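Both pitfalls from this answer can be reproduced in a plain PowerShell session without Excel. The snippet below is a small sketch; the value 250 is an arbitrary stand-in for the last used row.

```powershell
$row = 250                          # stand-in for $lastCell.Row
$lengthOfRow = $row.Length          # PowerShell reports 1 for a single (scalar) value

$iterations = 0
for ($i = 1; $i -lt $row.Length; $i++) { $iterations++ }   # 1 -lt 1 is false at once

$status = 'FAILED'
$assigned = $false
if ($status = 'PASSED') { $assigned = $true }   # "=" assigns, then tests the new value

$status = 'FAILED'
$compared = ($status -eq 'PASSED')              # "-eq" compares and leaves $status alone

"Length=$lengthOfRow iterations=$iterations assigned=$assigned compared=$compared"
```

The loop body never runs, the "=" version enters the if block even though the status was FAILED, and only the "-eq" version reports the mismatch.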

Read Excel data with Powershell and write to a variable

Using PowerShell I would like to capture user input, compare the input to data in an Excel spreadsheet and write the data in corresponding cells to a variable. I am fairly new to PowerShell and can't seem to figure this out. Example would be: A user is prompted for a Store Number, they enter "123". The input is then compared to the data in Column A. The data in the corresponding cells is captured and written to a variable, say $GoLiveDate.
Any help would be greatly appreciated.
User input can be read like this:
$num = Read-Host "Store number"
Excel can be handled like this:
$xl = New-Object -COM "Excel.Application"
$xl.Visible = $true
$wb = $xl.Workbooks.Open("C:\path\to\your.xlsx")
$ws = $wb.Sheets.Item(1)
Looking up a value in one column and assigning the corresponding value from another column to a variable could be done like this:
for ($i = 1; $i -le 3; $i++) {
if ( $ws.Cells.Item($i, 1).Value -eq $num ) {
$GoLiveDate = $ws.Cells.Item($i, 2).Value
break
}
}
Don't forget to clean up after you're done:
$wb.Close()
$xl.Quit()
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($xl)
I find it preferable to use an OleDB connection to interact with Excel. It's faster than COM interop and less error prone than import-csv. You can prepare a collection of psobjects (one psobject is one row, each property corresponding to a column) to match your desired target grid and insert it into the Excel file. Similarly, you can insert a DataTable instead of a PSObject collection, but unless you start by retrieving data from some data source, PSObject collection way is usually easier.
Here's a function I use for writing a psobject collection to Excel:
function insert-OLEDBData ($file,$sheet,$ocol) {
$cs = Switch -regex ($file)
{
"xlsb$"
{"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=`"$File`";Extended Properties=`"Excel 12.0;HDR=YES;IMEX=1`";"}
"xlsx$"
{"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=`"$File`";Extended Properties=`"Excel 12.0 Xml;HDR=YES;IMEX=1`";"}
}
$OLEDBCon = New-Object System.Data.OleDb.OleDbConnection($cs)
$hdr = $oCol|gm -MemberType NoteProperty|%{$_.name}
$names = '[' + ($hdr-join"],[") + ']'
$vals = (@("?")*([array]$hdr).length)-join','
$sql = "insert into [$sheet`$] ($names) values ($vals)"
$sqlCmd = New-Object system.Data.OleDb.OleDbCommand($sql)
$sqlCmd.connection = $oledbcon
$cpary = @($null)*([array]$hdr).length
$i=0
[array]$hdr|%{([array]$cpary)[$i] = $sqlCmd.parameters.add($_,"VarChar",255);$i++}
$oledbcon.open()
for ($i=0;$i-lt([array]$ocol).length;$i++)
{
for ($k=0;$k-lt([array]$hdr).length;$k++)
{
([array]$cpary)[$k].value = ([array]$oCol)[$i].(([array]$hdr)[$k])
}
$res = $sqlCmd.ExecuteNonQuery()
}
$OLEDBCon.close()
}
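It can help to look at the SQL text the function builds before opening a connection. The sketch below reproduces only the string-building steps. The sample row (a store number and a go-live date) and the sheet name Sheet1 are made-up illustration values, and the property names are read through PSObject.Properties to keep their declared order, whereas the function above uses Get-Member.

```powershell
# Made-up sample row; the property names become the column names.
$rows = @(
    [pscustomobject]@{ Store = '123'; GoLiveDate = '2020-01-15' }
)
$sheet = 'Sheet1'                                   # assumed worksheet name

$hdr   = @($rows[0].PSObject.Properties.Name)       # Store, GoLiveDate
$names = '[' + ($hdr -join '],[') + ']'             # bracketed column list
$vals  = (@('?') * $hdr.Count) -join ','            # one placeholder per column
$sql   = "insert into [$sheet`$] ($names) values ($vals)"

$sql
```

For this row the statement is insert into [Sheet1$] ([Store],[GoLiveDate]) values (?,?). The actual values are supplied through the OleDb parameters, one per placeholder, so they never need to be quoted into the string.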
This does not seem to work anymore. I swear it used to, but maybe an update to O365 killed it? Or maybe it is because I last used it on Win 7 and have long since moved to Win 10:
$GoLiveDate = $ws.Cells.Item($i, 2).Value
I can still use .Value for writing to a cell, but not for reading it into a variable. Instead of the contents of the cell, it returns: "Variant Value (Variant) {get} {set}"
But after some digging, I found this does work to read a cell into a variable:
$GoLiveDate = $ws.Cells.Item($i, 2).Text
Regarding the next question/comment, where squishy79 asks about slowness, and the subsequent OleDB solutions: I can't seem to get those to work on modern operating systems either. My own performance trick is to have all my Excel PowerShell scripts write to a tab-delimited .txt file like so:
Add-Content -Path "C:\FileName.txt" -Value $Header1`t$Header2`t$Header3...
Add-Content -Path "C:\FileName.txt" -Value $Data1`t$Data2`t$Data3...
Add-Content -Path "C:\FileName.txt" -Value $Data4`t$Data5`t$Data6...
Then, when all the data has been written, open the .txt file using the very slow COM "Excel.Application" just to do the formatting, and SaveAs .xlsx (see the comment above SaveAs):
Function OpenInExcelFormatSaveAsXlsx
{
Param ($FilePath)
If (Test-Path $FilePath)
{
$Excel = New-Object -ComObject Excel.Application
$Excel.Visible = $true
$Workbook = $Excel.Workbooks.Open($FilePath)
$Sheet = $Workbook.ActiveSheet
$UsedRange = $Sheet.UsedRange
$RowMax = ($Sheet.UsedRange.Rows).count
$ColMax = ($Sheet.UsedRange.Columns).count
# This code gets the Alpha character for Columns, even for AA AB, etc.
For ($Col = 1; $Col -le $ColMax; $Col++)
{
$Asc = ""
$Asc1 = ""
$Asc2 = ""
If ($Col -lt 27)
{
$Asc = ([char]($Col + 64))
Write-Host "Asc: $Asc"
}
Else
{
$First = [math]::truncate($Col / 26)
$Second = $Col - ($First * 26)
If ($Second -eq 0)
{
$First = ($First - 1)
$Second = 26
}
$Asc1 = ([char][int]($First + 64))
$Asc2 = ([char][int]($Second + 64))
$Asc = "$Asc1$Asc2"
}
}
Write-Host "Col: $Col"
Write-Host "Asc + 1: $Asc" + "1"
$Range = $Sheet.Range("a1", "$Asc" + "1")
$Range.Select() | Out-Null
$Range.Font.Bold = $true
$Range.Borders.Item(9).LineStyle = 1
$Range.Borders.Item(9).Weight = 2
$UsedRange = $Sheet.UsedRange
$UsedRange.EntireColumn.AutoFit() | Out-Null
$SavePath = $FilePath.Replace(".txt", ".xlsx")
# I found scant documentation, but you need a file format 51 to save a .txt file as .xlsx
$Workbook.SaveAs($SavePath, 51)
$Workbook.Close()
$Excel.Quit()
}
Else
{
Write-Host "File Not Found: $FilePath"
}
}
$TextFilePath = "C:\ITUtilities\MyTabDelimitedTextFile.txt"
OpenInExcelFormatSaveAsXlsx -FilePath $TextFilePath
If you don't care about formatting, you can just open the tab delimited .txt files as-is in Excel.
Of course, this is not very good for inserting data into an existing Excel spreadsheet unless you are OK with having the script rewrite the whole sheet each time an insert is made. It will still run much faster than using COM in most cases.
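The column-letter logic in the function above covers one and two letters (A to ZZ). If wider sheets are a concern, the same idea can be written as a loop that works for any number of letters. This is a sketch only; ConvertTo-ColumnLetter is a hypothetical name and not part of any module.

```powershell
# Hypothetical helper: converts a 1-based column number to its Excel letter name.
function ConvertTo-ColumnLetter {
    param([int]$Column)

    $letters = ''
    while ($Column -gt 0) {
        $remainder = ($Column - 1) % 26                      # 0..25 maps to A..Z
        $letters   = [string][char](65 + $remainder) + $letters
        $Column    = [math]::Floor(($Column - 1) / 26)       # move to the next letter
    }
    return $letters
}

1, 26, 27, 52, 703 | ForEach-Object { "$_ = $(ConvertTo-ColumnLetter -Column $_)" }
```

With such a helper the header range could be built as $Sheet.Range("A1", "$(ConvertTo-ColumnLetter -Column $ColMax)1") without looping over every column first.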
I found this and Yevgeniy's answer. I had to make a few minor changes to the above function for it to work, most notably in the handling of NULL or empty values in the input array. Here is Yevgeniy's code with those changes:
function insert-OLEDBData {
PARAM (
[Parameter(Mandatory=$True,Position=1)]
[string]$file,
[Parameter(Mandatory=$True,Position=2)]
[string]$sheet,
[Parameter(Mandatory=$True,Position=3)]
[array]$ocol
)
$cs = Switch -regex ($file)
{
"xlsb$"
{"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=`"$File`";Extended Properties=`"Excel 12.0;HDR=YES`";"}
"xlsx$"
{"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=`"$File`";Extended Properties=`"Excel 12.0 Xml;HDR=YES`";"}
}
$OLEDBCon = New-Object System.Data.OleDb.OleDbConnection($cs)
$hdr = $oCol | Get-Member -MemberType NoteProperty,Property | ForEach-Object {$_.name}
$names = '[' + ($hdr -join "],[") + ']'
$vals = (@("?")*([array]$hdr).length) -join ','
$sql = "insert into [$sheet`$] ($names) values ($vals)"
$sqlCmd = New-Object system.Data.OleDb.OleDbCommand($sql)
$sqlCmd.connection = $oledbcon
$cpary = @($null)*([array]$hdr).length
$i=0
[array]$hdr|%{([array]$cpary)[$i] = $sqlCmd.parameters.add($_,"VarChar",255);$i++}
$oledbcon.open()
for ($i=0;$i -lt ([array]$ocol).length;$i++)
{
for ($k=0;$k -lt ([array]$hdr).length;$k++)
{
IF (([array]$oCol)[$i].(([array]$hdr)[$k]) -notlike "") {
([array]$cpary)[$k].value = ([array]$oCol)[$i].(([array]$hdr)[$k])
} ELSE {
([array]$cpary)[$k].value = ""
}
}
$res = $sqlCmd.ExecuteNonQuery()
}
$OLEDBCon.close()
}

Resources