CSV to Excel with pivot table - excel

I am trying to write a simple script that converts a CSV to Excel, inserts a pivot table, and then saves the file. The code below runs fine, but the Excel file is saved with the literal name $name instead of the content of $name. Also, can I use VBA code in a PowerShell script? I want to insert a pivot table and I can only find VBA code for it.
Add-Type -AssemblyName System.Windows.Forms
$FileBrowser = New-Object System.Windows.Forms.OpenFileDialog -Property @{
InitialDirectory = [Environment]::GetFolderPath('Desktop')
Filter = 'SpreadSheet (*.csv)|*.csv'
}
$FileBrowser.ShowDialog()
$file = $FileBrowser.FileName
$name= $FileBrowser.SafeFileName
Write-Host $name
$excel = New-Object -ComObject Excel.Application
$excel.Visible = $true
$excel.DisplayAlerts = $false
$excel.Workbooks.Open($file)
$excel.Workbooks.item(1).SaveAs('C:\Users\User\Desktop\$name.xlsx',51)
$excel.Quit()
[System.Runtime.InteropServices.Marshal]::ReleaseComObject($excel)
After fixing the code, here is the end product:
Add-Type -AssemblyName System.Windows.Forms
$FileBrowser = New-Object System.Windows.Forms.OpenFileDialog -Property @{
InitialDirectory = [Environment]::GetFolderPath('Desktop')
Filter = 'SpreadSheet (*.csv)|*.csv'
}
$FileBrowser.ShowDialog()
$file = $FileBrowser.FileName
$name = $FileBrowser.SafeFileName -replace ".{4}$"
$excel = New-Object -ComObject Excel.Application
$excel.Visible = $true
$excel.DisplayAlerts = $false
$excel.Workbooks.Open($file)
#$excel.Workbooks.Item(1).activate()
#$excel.ActiveSheet.Cells.Select()
$excel.Workbooks.item(1).SaveAs("C:\Users\User\Desktop\$name.xlsx",51)
$excel.Quit()
[System.Runtime.InteropServices.Marshal]::ReleaseComObject($excel)
Now I want to add a pivot table to this. I added the (commented-out) lines to activate the workbook and select all the cells. I have the VBA code, but can someone help me translate it to PowerShell?
Sheets.Add
ActiveWorkbook.PivotCaches.Create(SourceType:=xlDatabase, SourceData:= _
"2!R1C1:R1048576C27", Version:=6).CreatePivotTable TableDestination:= _
"Sheet1!R3C1", TableName:="PivotTable3", DefaultVersion:=6
Sheets("Sheet1").Select
Cells(3, 1).Select
With ActiveSheet.PivotTables("PivotTable3").PivotFields("Domain")
.Orientation = xlRowField
.Position = 1
The VBA above is a macro that I recorded, so it gives the steps. Now it just has to be translated to PowerShell. I am lost, as getting the code this far already took hours of tinkering.

It is because this is exactly what you told it to do.
$excel.Workbooks.item(1).SaveAs('C:\Users\User\Desktop\$name.xlsx',51)
Quotes are important. Single quotes pass the string exactly as-is, with no variable expansion, as defined in the help files:
about_Quoting_Rules - PowerShell | Microsoft Docs
For a variable inside a string to be expanded, the string must use double quotes.
$excel.Workbooks.item(1).SaveAs("C:\Users\User\Desktop\$name.xlsx",51)
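To make the difference concrete, here is a minimal sketch that can be pasted into any PowerShell session. The C:\Temp path and the 'report' value are made-up placeholders, not taken from the question.

```powershell
# Hypothetical base name standing in for the value picked in the file dialog.
$name = 'report'

# Single quotes: the text is taken literally, so $name is NOT expanded.
$single = 'C:\Temp\$name.xlsx'

# Double quotes: $name is replaced by its value.
$double = "C:\Temp\$name.xlsx"

# Braces mark where the variable name ends when more text follows it.
$braced = "C:\Temp\${name}_v2.xlsx"

# A property needs a subexpression, $( ... ), to be expanded inside a string.
$file = [pscustomobject]@{ BaseName = 'report' }
$sub  = "C:\Temp\$($file.BaseName).xlsx"

$single
$double
$braced
$sub
```

Only the first string keeps the literal text $name; the other three contain the word report in its place.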
Here are a couple of good articles for your edification on the topic as well:
A Story of PowerShell Quoting Rules
Windows PowerShell Quotes
Point of note:
That Write-Host is not needed if all you are doing is outputting text to the screen. In PowerShell, output to the screen is the default. Also, depending on what version of PowerShell you are on, Write-Host writes straight to the host, so the text cannot be captured or passed along the pipeline for use elsewhere. Later versions of PowerShell (v5 and up) do not have this issue, but it's still best to avoid Write-Host unless you are sending colorized text to the screen or have other custom formatting scenarios.
Additional reading:
Write-Host Considered Harmful
... Jeffrey Snover changes his stance on this as of May 2016.
With PowerShell v5 Write-Host no longer "kills puppies". data is
captured into info stream ...
https://twitter.com/jsnover/status/727902887183966208 ...
https://learn.microsoft.com/en-us/powershell/module/Microsoft.PowerShell.Utility/Write-Information?view=powershell-5.1
Lastly, you are not completely cleaning up behind yourself. Though you are quitting Excel, the Excel process can stay alive until its COM references are released, so those resources are not freed. So, stopping that process and running garbage collection is warranted.
For example:
Get-Process -Name 'EXCEL' | Stop-Process
Function Clear-ResourceEnvironment
{
# Clear any PowerShell sessions created
Get-PSSession | Remove-PSSession
# Release any COM object created
$null = [System.Runtime.InteropServices.Marshal]::ReleaseComObject([System.__ComObject]$Shell)
# Perform garbage collection on session resources
[System.GC]::Collect()
[GC]::Collect()
[GC]::WaitForPendingFinalizers()
<#
Remove any custom variables created
some list of your variable, or a wildcard when prefix was used.
#>
Get-Variable -Name MyShell -ErrorAction SilentlyContinue | Remove-Variable
}
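As for the pivot table part of the question: VBA itself cannot be pasted into a PowerShell script, but the same Excel object model is reachable through the COM object. The following is only a rough sketch of how the recorded macro might map over, not a tested translation. The function name Add-DomainPivotTable and its parameters are made up for illustration; the numeric constants are the values of Excel's xlDatabase and xlRowField enums, and the sketch assumes the data sits on the first worksheet and has a column headed "Domain", as in the macro.

```powershell
# Hypothetical helper: adds a pivot table on a new sheet of an open workbook.
function Add-DomainPivotTable {
    param(
        [Parameter(Mandatory)] $Workbook,          # e.g. $excel.Workbooks.Item(1)
        [string] $RowField  = 'Domain',            # column header used as row field
        [string] $TableName = 'PivotTable1'        # assumed name, any unique name works
    )

    $xlDatabase = 1   # XlPivotTableSourceType.xlDatabase
    $xlRowField = 1   # XlPivotFieldOrientation.xlRowField

    # Grab the data sheet first, because Worksheets.Add() inserts a new sheet in front.
    $source = $Workbook.Worksheets.Item(1)
    $target = $Workbook.Worksheets.Add()

    # Equivalent of ActiveWorkbook.PivotCaches.Create(...).CreatePivotTable(...)
    $cache = $Workbook.PivotCaches().Create($xlDatabase, $source.UsedRange)
    $pivot = $cache.CreatePivotTable($target.Range('A3'), $TableName)

    # Equivalent of the With ... PivotFields("Domain") block in the macro
    $field = $pivot.PivotFields($RowField)
    $field.Orientation = $xlRowField
    $field.Position    = 1

    return $pivot
}
```

It would be called as `$null = Add-DomainPivotTable -Workbook $excel.Workbooks.Item(1)` after opening the CSV and before SaveAs, so the pivot sheet ends up in the saved .xlsx file.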

Related

Refresh Excel connections using Powershell

I have the below code that opens a spreadsheet, deletes all connections and "Saves" a new file.
$a = New-Object -COM "Excel.Application"
$a.Visible = $false
$b = $a.Workbooks.Open("F:\Scripts\All Users.xlsx")
do
{
$b.Connections.Item(1).Delete()
$Count = $b.Connections.Count()
} until($Count -eq 0)
$b.SaveAs("F:\Scripts\Users Home Drive Search.xlsx")
$b.Close()
I would like to know two things:
1. How do I get the sheet to refresh all connections? I've tried "$b.Connections.refreshall()" but refreshall() doesn't exist there.
2. How do I quit the Excel application? I ran "New-Object -COM "Excel.Application" | Get-Member -MemberType Methods" and I don't see a quit or exit method.
RefreshAll() is a method of the Workbook object, so use that to refresh the external connections:
$b.RefreshAll()
To exit Excel and remove the used COM objects from memory, use
$a.Quit()
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($b)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($a)
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
BTW, I would use more descriptive variable names, so $excel instead of $a and $workbook instead of $b to avoid confusion later on.

PowerShell script to open excel, wait for power query to refresh, and then save the file

I have an Excel workbook that uses Power Query to get data from an API. I would like this data to update every day without having to open the workbook myself, so I enabled the setting within Excel to refresh data when opening the file.
I am trying to create a PowerShell script which opens the workbook, waits for the query to refresh, and then saves the workbook. However, I can't get it to wait until the query has refreshed before saving and closing.
code:
$Excel = New-Object -COM "Excel.Application"
$Excel.Visible = $true
$Workbook = $Excel.Workbooks.Open("G:\...\jmp-main-2020-07-17.xlsx")
While (($Workbook.Sheets | ForEach-Object {$_.QueryTables | ForEach-Object {if($_.QueryTable.Refreshing){$true}}}))
{
Start-Sleep -Seconds 1
}
$Excel.Save()
$Excel.Close()
I think your while loop is wrong. You should probably loop over the worksheets in the workbook and, for each of them, loop over the QueryTables. Then enter a while loop to wait until the Refreshing property turns $false:
foreach ($sheet in $Workbook.Sheets) {
$sheet.QueryTables | ForEach-Object {
while ($_.Refreshing) {
Start-Sleep -Seconds 1
}
}
}
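A loop like this waits forever if a refresh never finishes, which matters for an unattended daily job. Below is a minimal sketch of a polling helper with a timeout. Wait-Until and its parameter names are hypothetical, not part of PowerShell or Excel.

```powershell
# Hypothetical helper: polls a condition until it is true or a timeout expires.
function Wait-Until {
    param(
        [Parameter(Mandatory)] [scriptblock] $Condition,  # must return $true when done
        [int] $TimeoutSeconds   = 300,                    # give up after this long
        [int] $PollMilliseconds = 1000                    # pause between checks
    )

    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    while (-not (& $Condition)) {
        if ((Get-Date) -ge $deadline) { return $false }   # timed out
        Start-Sleep -Milliseconds $PollMilliseconds
    }
    return $true                                          # condition was met
}

# Example with a plain counter standing in for a refreshing query:
$state = @{ Checks = 0 }
$finished = Wait-Until -Condition { $state.Checks++; $state.Checks -ge 3 } -PollMilliseconds 10
$finished
```

With Excel, the condition would be something like `{ -not $queryTable.Refreshing }`, where $queryTable is one of the objects from the loop above, and the script can log a failure and skip saving when the helper returns $false.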
As an aside: you should release the COM objects you have created once you are finished with them, to free memory:
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($Workbook)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($Excel)
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
I am using Windows Task Scheduler to open Excel file and VBA inside that Excel file to close it after data is refreshed.
No PowerShell script needed.
I was able to get @Theo's example working with some small tweaks. I am not sure why, but the QueryTables property did not contain my collection of queries for the sheet, whereas the ListObjects property did.
$file = "path\to\file.xlsx"
$Excel = New-Object -COM "Excel.Application"
$Excel.Visible = $false
$Workbook = $Excel.Workbooks.Open($file)
foreach ($sheet in ($Workbook.Sheets)) {
$sheet.ListObjects | ForEach-Object{$_.QueryTable.Refresh() | out-null}
$sheet.ListObjects | ForEach-Object{
while ($_.QueryTable.Refreshing) {
Start-Sleep -Seconds 1
}
}
}
And to properly save the workbook and exit Excel you would use
$Workbook.Save()
$Workbook.Close()
$Excel.Quit()

Powershell to refresh Excel with OLAP Query. Credential issues

I'm rather new to PowerShell scripting, so please be patient with me :)
I have multiple Excel files with OLAP query connections to Power BI datasets. The script is as follows:
$libraryPath = "C:\Repos\AUSD\3.0\Test"
$excel = new-object -comobject Excel.Application
$excel.Visible = $false
# Give delay to open
Start-Sleep -s 3
$allExcelfiles = Get-ChildItem $libraryPath -recurse -include "*.xls*"
foreach ($file in $allExcelfiles)
{
$workbookpath = $file.fullname
Write-Host "Updating " $workbookpath
# Open the Excel file
$excelworkbook = $excel.workbooks.Open($workbookpath)
$connections = $excelworkbook.Connections
# This will Refresh All the pivot tables data.
$excelworkbook.RefreshAll()
# The following script lines will Save the file.
$excelworkbook.Save()
$excelworkbook.Close()
Write-Host "Update Complete " $workbookpath
}
$excel.quit()
It works fine when $excel.Visible is $true.
However, this is going to be scheduled on a server and should run in the background, hence $excel.Visible = $false.
With that setting, the refresh fails with an error.
I suspect this is due to the automatic sign-in that happens when Excel opens: because the window is never shown, the sign-in process fails.
How do I bypass this, or rather set the credentials/permissions correctly?

Powershell - Excel SaveAs csv with specified delimiter

Afternoon all,
Is it possible to save a CSV file using PowerShell with a different delimiter, in my case "§"? I am using the following script to open and change items in an XLSX file and then wish to save it as a "§"-delimited CSV. The find-and-replace method does not work in my case ( (Get-Content -Path $CSVfile).Replace(',','§') | Set-Content -Path $CSVfile2)
$Path = "C:\ScriptRepository\CQC\DataToLoad\"
$FileName = (Get-ChildItem $path).FullName
$FileName2 = (Get-ChildItem $path).Name
$CSVFile = "$Path\$Filename2.csv"
$Excel = New-Object -ComObject Excel.Application -Property @{Visible = $false}
$Excel.displayalerts=$False
$Workbook = $Excel.Workbooks.Open($FileName)
$WorkSheet = $WorkBook.Sheets.Item(2)
$Worksheet.Activate()
$worksheet.columns.item('G').NumberFormat ="m/d/yyyy"
$Worksheet.Cells.Item(1,3).Value = "Site ID"
$Worksheet.Cells.Item(1,4).Value = "Site Name"
$Worksheet.SaveAs($CSVFile,
[Microsoft.Office.Interop.Excel.XlFileFormat]::xlCSVWindows)
$workbook.Save()
$workbook.Close()
$Excel.Quit()
Running the following command will let you save the CSV file using the delimiter §:
Import-CSV filename.csv | ConvertTo-CSV -NoTypeInformation -Delimiter "§" | Out-File output_filename.csv
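One likely reason a plain text replace of ',' is not good enough is that it also rewrites commas inside quoted values, while the CSV cmdlets parse the fields first and only change the separator. Here is a small self-contained sketch with made-up sample data (the site names are invented). The delimiter is built from its character code so the script does not depend on the file's encoding.

```powershell
# Section sign (§) built from its code point to avoid script-encoding issues.
$delimiter = [char]0x00A7

# Made-up sample data; note the comma inside the quoted second field.
$csvText = @'
Site ID,Site Name
1,"Leeds, North"
2,York
'@

# Parse the comma-separated text into objects, one line at a time.
$rows = ($csvText -split "\r?\n") | ConvertFrom-Csv

# Write the objects back out using the new delimiter.
$converted = $rows | ConvertTo-Csv -NoTypeInformation -Delimiter $delimiter
$converted
```

The comma inside "Leeds, North" survives because it belongs to the value, not the separator. When writing the result with Out-File or Set-Content, an explicit -Encoding (for example UTF8) is worth considering so that § is stored as intended.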
You should also check out ImportExcel, a PowerShell module to import/export Excel spreadsheets without Excel. It makes working with Excel files from PowerShell easier.
I know this is an older post but here is an option I recently came across:
Just replace e:\projects\dss\pse&g.xlsx with your source location and file, and file.csv with the output location and file name. Lastly, change the worksheet name [Sheet1$] if yours is named differently.
$oleDbConn = New-Object System.Data.OleDb.OleDbConnection
$oleDbCmd = New-Object System.Data.OleDb.OleDbCommand
$oleDbAdapter = New-Object System.Data.OleDb.OleDbDataAdapter
$dataTable = New-Object System.Data.DataTable
$oleDbConn.ConnectionString="Provider=Microsoft.ACE.OLEDB.12.0;Data Source=e:\projects\dss\pse&g.xlsx;Extended Properties=Excel 12.0;Persist Security Info=False"
$oleDbConn.Open()
$oleDbCmd.Connection = $OleDbConn
$oleDbCmd.commandtext = "Select * from [Sheet1$]"
$oleDbAdapter.SelectCommand = $OleDbCmd
$ret=$oleDbAdapter.Fill($dataTable)
Write-Host "Rows returned:$ret" -ForegroundColor green
$dataTable | Export-Csv file.csv -Delimiter ';'
$oleDbConn.Close()
Source
I was using SaveAs(file.csv,6) but couldn't change the delimiter. Ishan's solution also works, but I wanted something more out of the box, as this is going to be used within an SSIS package across different systems, and this just works. =)

Check header row of Excel sheet for particular column

I have over 150 excel files where some have an extra column (let's call it "ExtraColumn"), while some do not have this column. Instead of opening each file manually to see which ones have the extra column, I want to use powershell to figure it out.
The code I have tried so far hasn't seemed to have gotten me anywhere. If you have any suggestions or can point me to the correct answer, that would be very wonderful and much appreciated!
gci -Path C:\Test -Recurse | % {
$ExcelFile = (Get-Content $_.FullName -TotalCount 1)
if ($ExcelFile -like "ExtraColumn") {
Write-Host "$_ has the extra column"
} else {
Write-Host "$_ does not have the extra column"
}
}
You can use the Excel COM object. To keep the code simple, the sheet is referenced by name here (otherwise you can look the sheet up as well). Add a foreach section to run it on all files.
For the example I named the column 'extracol'. Note that this example also deletes what it finds:
$excel = New-Object -ComObject excel.application
$WB = $excel.Workbooks.Open('C:\exceltest.xlsx')
$WS = $Excel.WorkSheets.item("Sheet1")
$ExtraCol = ($ws.Columns.Find('extracol'))
if ($ExtraCol) {$ExtraCol.Delete()}
$wb.Save()
$wb.Close()
$excel.Quit()
