Merge multiple Excel files containing multiple sheets using PowerShell

I have multiple Excel files, each with the same number of sheets and the same sheet names. Matching sheets have the same headers in every file. How can I merge the matching sheets from all of the files into a new Excel file using a PowerShell script?
Any suggestion helps.
Thanks.

If the number of columns is the same across the different Excel worksheets, you should be able to use the code below to merge the files together. Note that, as written, it merges only the first sheet of each workbook.
The code uses the Range object and the Worksheet.Paste method from the Excel API.
Example code (remember to change the paths and file names to suit your environment):
# Create an instance of Excel
$Excel = New-Object -ComObject Excel.Application
# Find the files you want to process
$Files = Get-ChildItem -Path C:\Temp -Filter *.xlsx
# Create a target workbook and worksheet called 'Sheet1'
$TargetWorkbook = $Excel.Workbooks.add()
$TargetWorksheet = $TargetWorkbook.Sheets.Item("Sheet1")
# Loop through our Excel files
foreach($File in $Files) {
# Open the workbook and get the first sheet
$SourceWorkbook=$Excel.Workbooks.Open($File.FullName)
$SourceWorksheet=$SourceWorkbook.Sheets.Item(1)
# Calculate the end column letter
$EndColumn = [char]([int][char]'A' + $SourceWorksheet.UsedRange.Columns.Count - 1)
# Activate the source worksheet
$SourceWorksheet.activate()
# Get the total number of rows in the sheet
$SourceLastRow = $SourceWorksheet.UsedRange.Rows.Count + 1
# Calculate what our start row should be
# A1 for the first worksheet only to include the headers
$StartRow = (& { If ($TargetWorkSheet.UsedRange.rows.count -eq 1) { "A1" } Else { "A2" } } )
# Get the range of data and copy it to the clipboard
$SourceRange = $SourceWorksheet.Range("$($StartRow):$EndColumn$SourceLastRow")
$SourceRange.copy()
# Activate the target worksheet
$TargetWorksheet.activate()
# Get the total number of rows in the sheet
$TargetLastRow = $TargetWorkSheet.UsedRange.Rows.Count
if ($TargetWorkSheet.UsedRange.Rows.Count -ne 1) {
# If this isn't the first sheet we've processed, add one additional row
$TargetLastRow++
}
# Paste at the first free row; passing only the top-left cell
# lets Excel size the pasted block to match the copied range
$TargetRange = $TargetWorksheet.Range("A$TargetLastRow")
$TargetWorksheet.Paste($TargetRange)
# Disable showing alerts, otherwise a notification about
# large amounts of data on the clipboard will be shown
$Excel.DisplayAlerts = $false
# Close the source workbook
$SourceWorkbook.Close()
}
# Re-enable showing alerts
$Excel.DisplayAlerts = $true
# Save the workbook to the desired path
$TargetWorkbook.SaveAs("C:\Temp\Merged.xlsx")
# Quit Excel
$Excel.Quit()
If your sheets have different numbers of columns, you can still use the above code, but you'll need to change the $SourceRange, $TargetRange and $EndColumn variables to account for this.
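One thing to watch when adapting $EndColumn: the [char] arithmetic in the code above only produces valid column letters up to column Z (26 columns). A minimal sketch of a helper that also handles AA, AB and so on is shown here. The function name ConvertTo-ColumnLetter is hypothetical and not part of Excel or PowerShell, and it assumes the used range starts in column A.

```powershell
# Hypothetical helper: convert a 1-based column index to an Excel column letter.
function ConvertTo-ColumnLetter {
    param([Parameter(Mandatory)][int]$Index)

    if ($Index -lt 1) { throw "Column index must be 1 or greater." }

    $letters = ''
    while ($Index -gt 0) {
        # Column letters behave like base 26 without a zero digit
        $remainder = ($Index - 1) % 26
        $letters = [string][char](65 + $remainder) + $letters
        $Index = [int][math]::Floor(($Index - 1) / 26)
    }
    return $letters
}

# Possible use inside the loop (assumes the used range starts at column A):
# $EndColumn = ConvertTo-ColumnLetter -Index $SourceWorksheet.UsedRange.Columns.Count
```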

Related

Filter an Excel workbook on all the sheets with criteria on two columns, when the rows satisfy my criteria, copy the rows to a new excel workbook

I have a single Excel file with multiple worksheets. I loop through all the worksheets and find the rows whose column1 and column2 satisfy my criteria; matching rows should be copied to a new Excel workbook. I also need to copy the first row, which holds the column names, but for now I'm ignoring that to simplify the problem.
I found a similar post online and modified its code for my situation. Here's the code:
$WindowsFolder = "C:\Users\wanlingjiang\Downloads\xlsx\"
# Create a new datatable to copy data into.
$dtExcel = New-Object System.Data.DataTable
# Counter used to only create data columns on the first index in the loop.
$count = 1
# Get all spreadsheet objects from the current folder.
$SpreadSheets = Get-ChildItem $WindowsFolder -File -Verbose
# Loop through each of the spreadsheet objects returned.
foreach ($SpreadSheet in $SpreadSheets) {
# Index the counter.
try{
# Import the data from the source spreadsheet into datatables.
$dts = Get-TablesFromXLSXWorkbook -InputFileName $SpreadSheet.FullName -Verbose -ErrorAction Stop
# We only need to work with the first datatable imported from each spreadsheet.
$Rows = $dts[0].Rows
# Create the data columns within the target datatable
# using the column headers from the current spreadsheet.
if($count -eq 1) {
foreach ($item in $dts[0].Columns) {
$dtExcel.Columns.Add($item.ColumnName) | Out-Null
}
$count++
}
# Loop through each row of data returned from the current spreadsheet.
foreach ($row in $Rows) {
# Determine if the 'Column1' column in the current row equals 'Criteria1' and if the 'Column2' column in the current row starts with 'Criteria2'.
# If yes, copy the data row to the target datatable.
if($row.'Column1' -eq 'Criteria1' -and $row.'Column2' -like 'Criteria2*') {
$dr = $dtExcel.NewRow()
$dr.ItemArray = $row.ItemArray.Clone()
$dtExcel.Rows.Add($dr)
}
}
} catch {
Write-Warning -Message "Failed to process '$($SpreadSheet.FullName)': $($_.Exception.Message)"
}
}
# Export the target datatable to a new Excel spreadsheet.
New-XLSXWorkbook -InputTables $dtExcel -OutputFileName 'C:\Users\wanlingjiang\Downloads\xlsx\Output.xlsx' -Open
It is not working. The error message says:
New-XLSXWorkbook : The term 'New-XLSXWorkbook' is not recognized as the name of a cmdlet, function, script file, or operable program.
I removed the last line and tried to debug; it stopped at this line:
$dts = Get-TablesFromXLSXWorkbook -InputFileName $SpreadSheet.FullName -Verbose -ErrorAction Stop
Do I need to install something? Please help. Thank you!
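Neither Get-TablesFromXLSXWorkbook nor New-XLSXWorkbook is a built-in PowerShell cmdlet, so both have to be defined in the session (for example by loading the script or module the original post relied on, which is not named here) before the code above can run. A small sketch for checking this up front follows; the helper name Test-CommandAvailable is hypothetical.

```powershell
# Hypothetical helper: returns $true when a command with this name can be resolved.
function Test-CommandAvailable {
    param([Parameter(Mandatory)][string]$Name)

    return [bool](Get-Command -Name $Name -ErrorAction SilentlyContinue)
}

# Check the two functions the script depends on before doing any work
foreach ($name in 'Get-TablesFromXLSXWorkbook', 'New-XLSXWorkbook') {
    if (-not (Test-CommandAvailable -Name $name)) {
        Write-Warning "'$name' is not defined in this session; load the script or module that provides it first."
    }
}
```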

Powershell Excel Master Spreadsheet

This is a long winded one but I will try and shorten it.
I have a master spreadsheet exported from our MIS system each school term and it gives me all the classes and who is in them for a specific school year.
The only problem is that it comes as one massive sheet within one workbook.
It looks like this:
[Image: redacted master sheet]
My end goal is to have one master workbook with each class cut and pasted into a separate sheet, and then the name of that sheet is the 'class code' which is the {redacted} part at the top of each section.
This has been done manually until now but it takes hours.
Is there a way in PowerShell to do this? I need the data from Class List Report: ... to Males: X Females: X moved into its own sheet within the workbook, with the sheet named after the class code. Unfortunately the classes are different lengths, so I can't do it by counting the number of cells.
Any help is greatly appreciated.
This will probably do what you need, given the image you provided. Essentially, it opens the workbook, gets the main sheet, and parses it based on the key phrase. It then creates new sheets containing the data between the starts of consecutive key phrases, each named after the identifier within its key phrase.
$ExcelFile="SO61954371.xlsx"
$MainSheetName="Sheet1"
$splitOnLike="Class List Report*"
$newExcelFile="SO61954371AA.xlsx"
#create Excel Object
$xl = New-Object -ComObject Excel.Application
#open Workbook
$WorkBook = $xl.Workbooks.Open($ExcelFile)
#grab main sheet
$mainSheet=$WorkBook.sheets.Item("$MainSheetName")
$classStart=@()
#loop through used rows
for($i=1; $i -le $mainSheet.UsedRange.Rows.Count;$i++){
#get cell in row $i Column 1(A)
$cell=$mainSheet.Cells.Item($i,1)
#check if start of table
if($cell.Text -like "$splitOnLike")
{
#make object with class code and starting cell
$classStart+=[pscustomobject]@{
classCode=($cell.text.split(":")[1] -split "as")[0].trim()
Cell=$cell
}
}
}
#Loop through tables
for($c=0; $c -lt $classStart.Count;$c++){
#create new sheet after last sheet (first argument of Add is Before, second is After)
$Nsheet=$WorkBook.Sheets.Add([Type]::Missing, $($WorkBook.Worksheets|Select -Last 1))
#name sheet
$Nsheet.name=$classStart[$c].classCode
#get and copy from current class to start of next
if($c -ne ($classStart.Count-1)){
$range=$mainSheet.Range("A$($classStart[$c].Cell.Row):$([char](64 + $($mainSheet.UsedRange.Columns.count)))$($classStart[$c+1].Cell.Row-1)")
$range.Copy() | out-null
$nsheetrange=$Nsheet.Range("A1")
$Nsheet.paste($nsheetrange)
}
else{
#for last class get and copy from start of class to end of usable range
$range=$mainSheet.Range("A$($classStart[$c].Cell.Row):$([char](64 + $($mainSheet.UsedRange.Columns.count)))$($mainSheet.UsedRange.Rows.Count)")
$range.Copy() | out-null
$nsheetrange=$Nsheet.Range("A1")
$Nsheet.paste($nsheetrange)
}
}
#save the file
$workbook._SaveAs($newExcelFile)
#kill the Excel instance and cleanup
$xl.Quit()
Remove-Variable -Name xl
[gc]::collect()
[gc]::WaitForPendingFinalizers()
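Two details of the script above are worth hedging against. The expression -split "as" splits on every occurrence of "as", so a class code that itself contains those letters would be cut short, and Excel rejects sheet names that are longer than 31 characters or contain any of \ / ? * [ ] :. The sketch below shows one way to handle both. The function names and the sample header text are hypothetical, since the real header format is redacted in the question.

```powershell
# Hypothetical helper: pull the class code out of a header such as
# "Class List Report: <code> as ..." (format assumed from the script above).
function Get-ClassCode {
    param([Parameter(Mandatory)][string]$HeaderText)

    # Keep everything after the first colon
    $afterColon = ($HeaderText -split ':', 2)[1]
    if ($null -eq $afterColon) { return $null }

    # Split only on "as" as a separate word, not on "as" inside the code
    return ($afterColon -split '\s+as\s+', 2)[0].Trim()
}

# Hypothetical helper: make a string safe to use as a worksheet name.
function ConvertTo-SheetName {
    param([Parameter(Mandatory)][string]$Name)

    # Replace the characters Excel does not allow in sheet names
    $clean = $Name -replace '[\\/\?\*\[\]:]', '_'

    # Sheet names are limited to 31 characters
    if ($clean.Length -gt 31) { $clean = $clean.Substring(0, 31) }
    return $clean
}

# Made-up sample header, for illustration only
$code = Get-ClassCode -HeaderText 'Class List Report: 9B/Class1 as at 01/09/2020'
$sheetName = ConvertTo-SheetName -Name $code
```

In the script, the classCode property could then be built with Get-ClassCode, and $Nsheet.name set from ConvertTo-SheetName.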

Combining Multiple Workbooks Into One Workbook Worksheet With Powershell

I have this PowerShell script with which I'm trying to combine multiple workbooks, each with a single sheet, into one workbook with all of their data on a single sheet. I can't get past the fact that it keeps telling me there is no file named $destfile and that it can't be opened. What is the correct syntax for that?
Thanks
$ExcelObject = New-Object -ComObject excel.application
$ExcelObject.visible=$true
$file1 = 'file1location'
$file2 = 'file2location'
$destfile = 'fileI want to saveas afterits compiled'
$xl = new-object -c excel.application
$xl.displayAlerts = $false # don't prompt the user
$wb1 = $xl.workbooks.open($file1, $null, $true) # open source, readonly
$wb2 = $xl.workbooks.open($file2, $null, $true)
$wb3 = $xl.workbooks.open($destfile) # open target
$sh1_wb2 = $wb2.sheets.item(1) # first sheet in destination workbook
$sheetToCopy = $wb1.sheets.item('Sheet1') # source sheet to copy
$sheetToCopy.copy($sh1_wb2) # copy source sheet to destination workbook
$wb1.close($false) # close source workbook w/o saving
$wb2.close($true) # close and save destination workbook
$xl.quit()
spps -n excel
You can try the ImportExcel module: https://github.com/dfinke/ImportExcel
Export the data from each worksheet to CSV (Import-Excel piped to Export-Csv), combine those CSV files (I assume they have identical columns), and then load the combined CSV back into Excel (Import-Csv piped to Export-Excel). It's quite simple and doesn't require any COM manipulation.
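The middle step of this suggestion, combining CSV files that share the same columns, needs only built-in cmdlets. The following is a minimal sketch; the function name Merge-CsvFile is hypothetical. The worksheet-to-CSV and CSV-to-workbook steps at either end depend on the ImportExcel module being installed and are shown only as comments.

```powershell
# Hypothetical helper: append the rows of several CSV files into one file.
# Assumes every input file has the same header row, and that the
# destination file is not one of the inputs.
function Merge-CsvFile {
    param(
        [Parameter(Mandatory)][string[]]$Path,
        [Parameter(Mandatory)][string]$Destination
    )

    $Path |
        ForEach-Object { Import-Csv -Path $_ } |
        Export-Csv -Path $Destination -NoTypeInformation
}

# With the ImportExcel module installed, the surrounding steps could look like:
# Import-Excel -Path 'C:\Temp\file1.xlsx' | Export-Csv 'C:\Temp\file1.csv' -NoTypeInformation
# Merge-CsvFile -Path 'C:\Temp\file1.csv', 'C:\Temp\file2.csv' -Destination 'C:\Temp\combined.csv'
# Import-Csv 'C:\Temp\combined.csv' | Export-Excel -Path 'C:\Temp\combined.xlsx'
```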

Copy multiple Excel worksheets from multiple workbooks to a new workbook using PowerShell

I have been at this for a while and can't seem to find anything that does exactly what I want. I was working off this post: How to use PowerShell to copy several Excel worksheets and make a new one but it doesn't do exactly what I am looking for. I am attempting to copy all worksheets from multiple workbooks and place them in a new workbook. Ideally, I would like to get the file name from one of the workbooks and use it to name the new file in a new directory.
I have found numerous examples of code that can do this, but there are a couple of key features that I am missing and not sure how to implement them. Here is an example of what I have working now:
$file1 = 'C:\Users\Desktop\TestFolder\PredictedAttritionReport.xlsx' # source's fullpath
$file2 = 'C:\Users\Desktop\TestFolder\AdvisorReport' # destination's fullpath
$xl = new-object -c excel.application
$xl.displayAlerts = $false # don't prompt the user
$wb2 = $xl.workbooks.open($file1, $null, $true) # open source, readonly
$wb1 = $xl.workbooks.open($file2) # open target
$sh1_wb1 = $wb1.sheets.item('Report') # second sheet in destination workbook
$sheetToCopy = $wb2.sheets.item('Report') # source sheet to copy
$sh1_wb1.Name = "DeleteMe$(get-date -Format hhmmss)" #Extremely unlikely to be a duplicate name
$sheetToCopy.copy($sh1_wb1) # copy source sheet to destination workbook
$sh1_wb1.Delete()
$wb2.close($false) # close source workbook w/o saving
$wb1.close($true) # close and save destination workbook
$xl.quit()
spps -n excel
The problem that I have is that I need this to work, such that I don't have to input the actual file name since those names may be different each time they are created and there may be 3 or 4 files with more than one worksheet, where this example works off of only two named files. Additionally, I would like to be able to copy all worksheets in a file instead of just a single named worksheet, which in this case is 'Report'. The final piece is saving it as a new Excel file rather than overwriting the existing destination file, but I think I can figure that part out.
You could specify parameters when you call the script, and use the specified values in place of the sheet/file names as required. Read about PowerShell Parameters here. Reply if you need more info on how to implement.
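As a rough illustration of the parameter idea, the sketch below takes a source folder and an output folder instead of hard-coded file names, collects every matching workbook, and derives the new file name from the first workbook found. The function name Get-MergePlan and the "_Merged" suffix are hypothetical choices, not something from the question.

```powershell
# Hypothetical helper: work out which workbooks to read and where to save the result.
function Get-MergePlan {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory)][string]$SourceFolder,
        [Parameter(Mandatory)][string]$OutputFolder,
        [string]$Filter = '*.xlsx'
    )

    # Collect the source workbooks in a predictable order
    $files = @(Get-ChildItem -Path $SourceFolder -Filter $Filter -File | Sort-Object Name)
    if ($files.Count -eq 0) {
        throw "No files matching '$Filter' found in '$SourceFolder'."
    }

    # Name the output after the first source workbook
    $outputName = '{0}_Merged.xlsx' -f $files[0].BaseName

    [pscustomobject]@{
        SourceFiles = @($files.FullName)
        OutputFile  = Join-Path $OutputFolder $outputName
    }
}
```

Each path in SourceFiles can then be passed to Workbooks.Open in place of the hard-coded $file1, and OutputFile used when saving the new workbook.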
This function will copy the sheets from one Excel workbook to another.
You can call it with the Copy-AllExcelSheet command once it is loaded into memory.
See the example below:
Function Copy-AllExcelSheet()
{
param($TargetXls, $sourceXls)
$xl = new-object -c excel.application
$xl.displayAlerts = $false
$sourceWb = $xl.Workbooks.Open($sourceXls,$true,$false)
$targetWB = $xl.Workbooks.Open($TargetXls)
foreach($nextSheet in $sourceWb.Sheets)
{
$nextSheet.Copy($targetWB.Sheets.Item(1))
}
$sourceWb.close($false)
$targetWB.close($true)
$xl.Quit()
}
#To call the function
Copy-AllExcelSheet "c:\Targetfile.xlsx" "c:\sourceFile.xlsx"

Powershell Excel Chart - obtain data series from existing chart

I have a Powershell script that copies a worksheet (with a custom dual axis chart) from one workbook to another workbook and then populates the new copy with data. That portion of the script works fine but I am trying to change the data series in the existing chart and I do not know the fields to change the data series.
I can change the chart title and the legend labels of the existing chart with no issues. I have tried the $ChartTemplate.SeriesCollection(1).Values field and the $ChartTemplate.SeriesCollection(1).XValues field, with and without the $ChartTemplate.SeriesCollection().NewSeries.Invoke() command, and I have had no success.
Does anyone know the PowerShell syntax to edit an existing data series of a custom dual axis line chart (=SERIES(Template!$G$1,Template!$A$2:$A$112,Template!$G$2:$G$112,4))?
The following is my Powershell code which I have obtained from googling:
$file1 = $global:ChartTemplateXlsx # source's fullpath
$file2 = $Path # destination's fullpath
$xl = new-object -c excel.application
$xl.Visible = $False # dont display the spreadsheet
$xl.displayAlerts = $false # don't prompt the user
$wb1 = $xl.workbooks.open($file1, $null, $true) # open source, readonly
$wb = $xl.workbooks.open($file2) # open target workbook/worksheet
$sh1_wb = $wb.sheets.item(1) # 1st sheet in destination workbook
$sheetToCopy = $wb1.sheets.item('Template') # source sheet to copy
$sheetToCopy.copy($sh1_wb) # copy source sheet to destination workbook
$ws = $wb.ActiveSheet # set the worksheet
$ChartTemplate = $ws.chartobjects(1).chart # obtain the existing chart
$ChartTemplate.HasTitle = $true # turn on chart title
$ChartTemplate.ChartTitle.Text = "Test Chart" # set a new chart title
$ChartTemplate.SeriesCollection(1).Name = "=""Test01"""
$ChartTemplate.SeriesCollection(2).Name = "=""Test02"""
$ChartTemplate.SeriesCollection(3).Name = "=""Test03"""
$ChartTemplate.SeriesCollection(4).Name = "=""Test04"""
$wb1.close($false) # close source workbook w/o saving
$wb.close($true) # close and save destination workbook
$xl.quit()
spps -n excel
By the way, I am using a separate workbook with the chart so that users can create their own chart templates and then I will populate it with data.
The property that you're looking for is either the Formula or FormulaLocal property. They appear to be duplicates of each other. If updating one doesn't work, try the other, or just set both.
$ChartTemplate.SeriesCollection(1).Formula = '=SERIES(Template!$G$1,Template!$A$2:$A$112,Template!$G$2:$G$112,4)'
$ChartTemplate.SeriesCollection(1).FormulaLocal = '=SERIES(Template!$G$1,Template!$A$2:$A$112,Template!$G$2:$G$112,4)'
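Since the goal is to repopulate charts with different data, it may help to build the formula string rather than hard-code it. The arguments of SERIES are, in order, the series name, the X values, the Y values and the plot order. A minimal sketch follows; the function name New-SeriesFormula is hypothetical, and it assumes the sheet name needs no quoting (no spaces or special characters). Note the single-quoted strings: in a double-quoted PowerShell string, $G and $A would be expanded as variables.

```powershell
# Hypothetical helper: build a SERIES formula for a chart series.
function New-SeriesFormula {
    param(
        [Parameter(Mandatory)][string]$Sheet,    # worksheet holding the data
        [Parameter(Mandatory)][string]$NameCell, # cell with the series name
        [Parameter(Mandatory)][string]$XRange,   # range with the X values
        [Parameter(Mandatory)][string]$YRange,   # range with the Y values
        [Parameter(Mandatory)][int]$Order        # plot order of the series
    )

    '=SERIES({0}!{1},{0}!{2},{0}!{3},{4})' -f $Sheet, $NameCell, $XRange, $YRange, $Order
}

# Rebuilds the formula from the question
$formula = New-SeriesFormula -Sheet 'Template' -NameCell '$G$1' -XRange '$A$2:$A$112' -YRange '$G$2:$G$112' -Order 4

# Possible use with the chart object from the question:
# $ChartTemplate.SeriesCollection(1).Formula = $formula
```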
