Background
I am writing a PowerShell script to write some data to an Excel file (.xlsx) with Microsoft.ACE.OLEDB like this:
$fileName = "C:\tmp\createtest.xlsx"
$sheetName = "record"
$provider = "Provider=Microsoft.ACE.OLEDB.12.0"
$dataSource = "Data Source = $fileName"
$extend = "Extended Properties=Excel 12.0"
$ddlSQL = "CREATE TABLE [$sheetName] (ID CHAR(4), NAME VARCHAR(20))"
$conn = New-Object System.Data.OleDb.OleDbConnection("$provider;$dataSource;$extend")
$sqlCommand = New-Object System.Data.OleDb.OleDbCommand
$sqlCommand.Connection = $conn
$conn.open()
$sqlCommand.CommandText = $ddlSQL
$sqlCommand.ExecuteNonQuery()
...
$conn.close()
Problem
When C:\tmp\createtest.xlsx already exists and contains an empty sheet named record (created manually), the CREATE TABLE statement silently creates a new sheet named record1 instead of failing.
I want to stop this behavior and have the CREATE TABLE statement throw an exception, as an ordinary RDBMS would.
Question
Is there any way to stop OLEDB from automatically creating the (sheet name)1 sheet when the Excel file already has a sheet with the same name?
I found a solution. Execute CREATE TABLE with a suffix $ for table name:
$ddlSQL = "CREATE TABLE [${sheetName}$] (ID CHAR(4), NAME VARCHAR(20))"
If the book and the sheet record already exist, this statement completes without error. If the book or the sheet record doesn't exist, it throws an OleDbException. Either way, no new sheet is created, so you can safely test whether the sheet exists.
If you want to use the sheet record whether or not it already exists, and avoid having a record1 sheet created automatically, you can do it like this:
$checkExistenceSQL = "CREATE TABLE [${sheetName}$] (ID CHAR(4), NAME VARCHAR(20))"
$ddlSQL = "CREATE TABLE [$sheetName] (ID CHAR(4), NAME VARCHAR(20))"
$conn.open()
try {
    try {
        # Check existing sheet and open if it exists
        $sqlCommand.CommandText = $checkExistenceSQL
        $sqlCommand.ExecuteNonQuery() > $null
    } catch {
        try {
            # Create new sheet if it doesn't exist
            $sqlCommand.CommandText = $ddlSQL
            $sqlCommand.ExecuteNonQuery() > $null
        } catch {
            throw $PSItem
        }
    }
    $insertSQL = "INSERT INTO [${sheetName}$] VALUES (...)"
    $sqlCommand.CommandText = $insertSQL
    $sqlCommand.ExecuteNonQuery() > $null
} finally {
    $conn.close()
}
Note that you have to execute CREATE TABLE first even if the sheet already exists, because it affects whether the datatype constraints take effect. You also have to suffix the table name with $ in the INSERT statement, because without the suffix the INSERT fails when the sheet record already exists. See https://satob.hatenablog.com/entry/2021/11/24/003818 and https://satob.hatenablog.com/entry/2021/11/25/012835 for details.
Related
I have a single Excel file with multiple worksheets. I loop through all the worksheets, find the rows whose column1 and column2 satisfy my criteria, and copy the matching rows to a new Excel workbook. I also need to copy the first row, which holds the column names, but for now I'm ignoring that to simplify the problem.
I looked online and found a post similar to my question. I modified its code to fit my situation; here's the code:
$WindowsFolder = "C:\Users\wanlingjiang\Downloads\xlsx\"
# Create a new datatable to copy data into.
$dtExcel = New-Object System.Data.DataTable
# Counter used to only create data columns on the first index in the loop.
$count = 1
# Get all spreadsheet objects from the current folder.
$SpreadSheets = Get-ChildItem $WindowsFolder -File -Verbose
# Loop through each of the spreadsheet objects returned.
foreach ($SpreadSheet in $SpreadSheets) {
# Index the counter.
try{
# Import the data from the source spreadsheet into datatables.
$dts = Get-TablesFromXLSXWorkbook -InputFileName $SpreadSheet.FullName -Verbose -ErrorAction Stop
# We only need to work with the first datatable imported from each spreadsheet.
$Rows = $dts[0].Rows
# Create the data columns within the target datatable
# using the column headers from the current spreadsheet.
if($count -eq 1) {
foreach ($item in $dts.Columns) {
$dtExcel.Columns.Add($item.ColumnName) | Out-Null
}
$count++
}
# Loop through each row of data returned from the current spreadsheet.
foreach ($row in $Rows) {
# Determine if the 'Column1' column in the current row equals 'Criteria1' and if the 'Column2' column in the current row starts with 'Criteria2'.
# If yes, copy the data row to the target datatable.
if($row.'Column1' -eq 'Criteria1' -and $row.'Column2' -like 'Criteria2*') {
$dr = $dtExcel.NewRow()
$dr.ItemArray = $row.ItemArray.Clone()
$dtExcel.Rows.Add($dr)
}
}
} catch {
Write-Warning -Message "Something happened. Write a good error message."
}
}
# Export the target datatable to a new Excel spreadsheet.
New-XLSXWorkbook -InputTables $dtExcel -OutputFileName 'C:\Users\wanlingjiang\Downloads\xlsx\Output.xlsx' -Open
It is not working, the error message says:
New-XLSXWorkbook : The term 'New-XLSXWorkbook' is not recognized as the name of a cmdlet, function, script file, or operable program.
I removed the last line, and tried to debug, it stopped at this line:
$dts = Get-TablesFromXLSXWorkbook -InputFileName $SpreadSheet.FullName -Verbose -ErrorAction Stop
Do I need to install something? Please help. Thank you!
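The "is not recognized" error means that no command with that name is loaded in the current session. As far as I know, Get-TablesFromXLSXWorkbook and New-XLSXWorkbook are not built-in cmdlets, so they presumably come from a script or module published with the post the code was adapted from, and that file has to be dot-sourced or imported before the loop runs. A minimal sketch of a guard that reports which of the two are missing; the helper name Test-CommandAvailable is hypothetical:

```powershell
# Hypothetical helper: returns $true when a command with this name is
# available in the session (cmdlet, function, alias, or script), else $false.
function Test-CommandAvailable {
    param([Parameter(Mandatory)][string]$Name)
    [bool](Get-Command -Name $Name -ErrorAction SilentlyContinue)
}

$required = 'Get-TablesFromXLSXWorkbook', 'New-XLSXWorkbook'
$missing  = @($required | Where-Object { -not (Test-CommandAvailable -Name $_) })

if ($missing.Count -gt 0) {
    Write-Warning ("Not loaded in this session: " + ($missing -join ', '))
}
```

If the warning lists both names, the script that defines them still needs to be loaded; nothing in the code shown here defines them.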
I'm working on a PS script to take a row of data from an Excel spreadsheet and populate that data in certain places in a Word document. To elaborate, we have a contract tracking MASTER worksheet that among other things contains data such as name of firm, address, services, contact name. Additionally, we have another TASK worksheet in the same workbook that tracks information such as project owner, project name, contract number, task agree number.
I'm writing a script that does the following:
Ask the user through a message box what kind of contract is being written ("Master", or "Task")
Opens the workbook with the appropriate worksheet opened ("Master" tab or "Task" tab)
Asks the user through a VB InputBox from which Excel row of data they want to use to populate the Word contract
Extracts that row of data from Excel
Outputs certain portions of that row of data to certain location in a Word document
Saves the Word document
Opens the Word document so the user can continue editing it
My question is this - using something like PSExcel, how do I extract that row of data out to variables that can be placed in a Word document. For reference, in case you're going to reply with a snippet of code, here are what the variables are defined as for the Excel portion my script:
$Filepath = "C:\temp\ContractScript\Subconsultant Information Spreadsheet.xlsx"
$Excel = New-Object -ComObject Excel.Application
$Workbook = $Excel.Workbooks.Open($Filepath)
$Worksheet = $Workbook.sheets.item($AgreementType)
$Excel.Visible = $true
#Choosing which row of data
[int]$RowNumber = [Microsoft.VisualBasic.Interaction]::InputBox("Enter the row of data from $AgreementType worksheet you wish to use", "Row")
Additionally, the first row of data in the excel worksheets are the column headings, in case it matters.
I've gotten this far so far:
import-module psexcel
$Consultant = new-object System.Collections.Arraylist
foreach ($data in (Import-XLSX -path $Filepath -Sheet $AgreementType -RowStart $RowNumber))
{
$Consultant.add($data)
}
But I'm currently stuck because I can't figure out how to reference the data being added to $consultant.$data. Somehow I need to read in the column headings first so the $data variable can be defined in some way, so when I add the variable $consultant.Address in Word it finds it. Right now I think the variable name is going to end up "$Consultant.1402 S Broadway" which obviously won't work.
Thanks for any help. I'm fairly new to powershell scripting, so anything is much appreciated.
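One way to approach this, assuming Import-XLSX returns one object per data row with property names taken from the header row (which is how I understand PSExcel to behave): import the sheet from the top so the headings become property names, then pick the wanted row by index. With -RowStart $RowNumber the chosen row is probably being treated as the header, which would explain a name like "$Consultant.1402 S Broadway". The sketch below uses hand-built objects in place of the real import so it runs without Excel; the column names and values are made up.

```powershell
# Stand-in for: $rows = Import-XLSX -Path $Filepath -Sheet $AgreementType
# Row 1 of the sheet holds the headings, so the first object is Excel row 2.
$rows = @(
    [pscustomobject]@{ Firm = 'Example Firm A'; Address = '1402 S Broadway'; Contact = 'Pat' }
    [pscustomobject]@{ Firm = 'Example Firm B'; Address = '77 Main St';      Contact = 'Sam' }
)

$RowNumber = 3   # the Excel row number the user typed into the InputBox

# Excel row N sits at index N - 2: one for the header row, one for 0-based indexing.
$Consultant = $rows[$RowNumber - 2]

$Consultant.Firm      # Example Firm B
$Consultant.Address   # 77 Main St
```

Each heading is then available as $Consultant.<heading>, which is the form needed when writing the values into the Word document.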
I have the same issue and searching online for solutions is a royal PITA.
I'd love to find a simple way to loop through all of the rows like you're doing.
$myData = Import-XLSX -Path "path to the file"
foreach ($row in $myData.Rows)
{
$row.ColumnName
}
But sadly something logical like that doesn't seem to work. I see examples online that use ForEach-Object and Where-Object which is cumbersome. So any good answers to the OP's question would be helpful for me too.
UPDATE:
Matthew, thanks for coming back and updating the OP with the solution you found. I appreciate it! That will help in the future.
For my current project, I went about this a different way since I ran into lack of good examples for Import-XLSX. It's just quick code to do a local task when needed, so it's not in a production environment. I changed var names, etc. to show an example:
$myDataField1 = New-Object Collections.Generic.List[String]
$myDataField2 = New-Object Collections.Generic.List[String]
# ...
$myDataField10 = New-Object Collections.Generic.List[String]
# PSExcel is a third-party module; install it first if it is not already present
Import-Module PSExcel
# Get spreadsheet, workbook, then sheet
try
{
$mySpreadsheet = New-Excel -Path "path to my spreadsheet file"
$myWorkbook = $mySpreadsheet | Get-Workbook
$myWorksheet = $myWorkbook | Get-Worksheet -Name Sheet1
}
catch {
    # whatever error handling code you want
}
# calculate total number of records
$recordCount = $myWorksheet.Dimension.Rows
$itemCount = $recordCount - 1
# specify column positions
$r, $my1stColumn = 1, 1
$r, $my2ndColumn = 1, 2
# ...
$r, $my10thColumn = 1, 10
if ($recordCount -gt 1)
{
# loop through all rows and get data for each cell's value according to column
for ($i = 1; $i -le $recordCount - 1; $i++)
{
$myDataField1.Add($myWorksheet.Cells.Item($r + $i, $my1stColumn).text)
$myDataField2.Add($myWorksheet.Cells.Item($r + $i, $my2ndColumn).text)
# ...
$myDataField10.Add($myWorksheet.Cells.Item($r + $i, $my10thColumn).text)
}
}
#loop through all imported cell values
for ([int]$i = 0; $i -lt $itemCount; $i++)
{
# use the data
$myDataField1[$i]
$myDataField2[$i]
# ...
$myDataField10[$i]
}
Using PowerShell, I need to loop through each cell of a column and check whether that value exists in any of a set of .txt files (a second loop), then populate one or more additional columns in the spreadsheet indicating which .txt files contain the value.
So far, I have a Powershell script which looks through a set of files from a selected destination. The script allows you to choose filename and extension and content. This part is good because I want to be able to loop through a selection of files, but I want the $SearchText value that it looks for in the files, to be looped through, changing one by one to each value in column A of a Worksheet I have in Excel. Once it finds the value in any of the selected files that are being looped through, it'll add a column to the Worksheet giving the file name it has been found in. If it's found in more than one file, it'll add columns for each it finds.
clear;
$Files = "";
$FileNameEnds = "";
$FileExt = ".txt";
$SearchText = "";
Foreach ($file in [System.IO.Directory]::GetFiles($Files,"*"+"$FileNameEnds."+$FileExt, [System.IO.SearchOption]::AllDirectories))
{
$Textfile = [System.IO.File]::ReadAllLines($file);
$FileContains = $false;
foreach ($line in $Textfile.Split([System.Environment]::NewLine))
{
if($line.Contains($SearchText))
{
$FileContains = $true;
}
}
if ($FileContains)
{
Write-Host "Contains:" $file;
Write-Host $FileContains
}
}
Write-Host "Done";
I want a successful double loop, one for setting the $SearchText value to each of the cells in a column from an Excel Worksheet I have, one by one, then use that value to search through a loop of files in a given destination, which is the bit I already have code for.
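A minimal sketch of such a double loop, assuming the column A values have already been read into an array (here $searchValues, filled with made-up values; reading them from the worksheet is left out). The outer loop walks the values, the inner loop walks the files, and each result records which files contain the value. The temporary folder and its file contents exist only to make the sketch runnable.

```powershell
# Set up a throwaway folder with two sample .txt files (illustration only).
$folder = Join-Path ([System.IO.Path]::GetTempPath()) ("search-demo-" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $folder | Out-Null
Set-Content -Path (Join-Path $folder 'a.txt') -Value 'alpha', 'beta'
Set-Content -Path (Join-Path $folder 'b.txt') -Value 'beta', 'gamma'

# Stand-in for the values in column A of the worksheet.
$searchValues = 'alpha', 'beta', 'delta'

$files = Get-ChildItem -Path $folder -Filter '*.txt' -File -Recurse | Sort-Object Name

$results = foreach ($value in $searchValues) {
    # Inner loop: collect the names of the files that contain this value.
    $hits = foreach ($file in $files) {
        # -SimpleMatch treats the value as literal text; -Quiet yields a truthy result only on a match.
        if (Select-String -Path $file.FullName -Pattern $value -SimpleMatch -Quiet) {
            $file.Name
        }
    }
    [pscustomobject]@{ Value = $value; FoundIn = @($hits) }
}

$results | ForEach-Object { '{0}: {1}' -f $_.Value, ($_.FoundIn -join ', ') }

Remove-Item -Path $folder -Recurse -Force
```

Each entry in FoundIn could then be written to the next free column of the row that holds the value. Note that Select-String matches case-insensitively unless -CaseSensitive is added, whereas the .Contains() call in the script above is case-sensitive.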
I am attempting to automate the process of adding a worksheet (with data) per client name in Excel for a monthly report workbook.
I thought it should be straightforward, but the method I am using isn't working; it doesn't even get as far as creating the sheet. Can you help me figure out what I did wrong?
The following is the function I made
function Excelclientstatstemplate ($clients) {
$Exl = New-Object -ComObject "Excel.Application"
$Exl.Visible = $false
$Exl.DisplayAlerts = $false
$WB = $Exl.Workbooks.Open($excelmonthlysummary)
$clientws = $WB.worksheets | where {$_.name -like "*$clients*"}
#### Check if Clients worksheet exists, if no then make one with client name ###
$sheetcheck = if (($clientws)) {} Else {
$WS = $WB.worksheets.add
$WS.name = "$clients"
}
$sheetcheck
$WB.Save
# Enter stat labels
$clientws.cells.item(1,1) = "CPU Count"
$clientws.cells.item(2,1) = "RAM"
$clientws.cells.item(3,1) = "Reserved CPU"
$clientws.cells.item(4,1) = "Reserved RAM"
### Put in Values in the next column ###
$clientws.cells.item(1,2) = [int]($cstats.cpuAllocationGHz/2)
$clientws.cells.item(2,2) = [decimal]$cstats.memoryLimitGB
$clientws.cells.item(3,2) = [int]($cstats.rescpuAllocationGHz/2)
$clientws.cells.item(4,2) = [decimal]$cstats.resmemoryLimitGB
$WB.save
$Exl.quit()
Stop-Process -processname EXCEL
Start-Sleep -Seconds 1
Echo "$clients excel sheet in monthly summary is done.."
}
and then I tried to make a Foreach thing for it
$clientxlmonthlywrite = Foreach ($client in $clientlist){
$cstats = $Combinedstats | Where {$_.Group -eq "$client"}
Excelclientstatstemplate -clients $client
}
The entire Process of the function goes
Take client name
Open a particular excel workbook
Check if there are any sheets with client name
If there are NO sheets with client name, make one with client name
Fill The first column Cells with labels
Fill the second column cells with data (the data works; I already write CSVs with it)
Save and exit
The Foreach variable just does the function for each of Clients names from a clientlist (nothing wrong with clientlist)
Am I messing something up?
Thanks for the help.
You are not calling the .Add() method correctly. You are missing the parentheses at the end of it. To fix it you should be able to simply modify the line to this:
$WS = $WB.worksheets.add()
Also, the cells have properties that you should refer to, so I would also modify the part that sets your cell values to something like this:
# Enter stat labels
$clientws.cells.item(1,1).value2 = "CPU Count"
$clientws.cells.item(2,1).value2 = "RAM"
$clientws.cells.item(3,1).value2 = "Reserved CPU"
$clientws.cells.item(4,1).value2 = "Reserved RAM"
### Put in Values in the next column ###
$clientws.cells.item(1,2).value2 = [int]($cstats.cpuAllocationGHz/2)
$clientws.cells.item(2,2).value2 = [decimal]$cstats.memoryLimitGB
$clientws.cells.item(3,2).value2 = [int]($cstats.rescpuAllocationGHz/2)
$clientws.cells.item(4,2).value2 = [decimal]$cstats.resmemoryLimitGB
I'm fairly sure that defining the type is pointless, since to Excel they're all strings until you set the cell's formatting settings to something else. I could be wrong, but that is the behavior that I have observed.
Now, for other critiques that you didn't ask for... Don't launch Excel, open the book, save the book, and close Excel for each client. Open Excel once at the beginning, open the book, make your updates for each client, and then save, and close.
Test to see if the client has a sheet, and add it if needed, then select the client's sheet afterwards. Right now there's nothing there to set $clientws if you have to add one for that client.
Adding a worksheet by default places it before the active worksheet. This was a poor choice in design in my opinion, but it is what it is. If it were me I'd add new sheets specifying the last worksheet in the workbook, which will add the new worksheet before the last one, making it the second to last worksheet. Then I'd move the last worksheet up in front of the new one, effectively making the new worksheet the last one listed. Is it possible to add the new worksheet as the last one when you make it? Yes, but it was too complicated for my taste. See here if you are interested in doing that.
When testing whether a client worksheet exists so that you can create a missing one, test for exactly that. Don't test for the sheet being present, do nothing in that branch, and put all the real work in an Else block. That just complicates things. All that said, here's some of those suggestions put into practice:
function Excelclientstatstemplate ($clients) {
#### Check if Clients worksheet exists, if no then make one with client name ###
if (($clients -notin $($WB.worksheets).Name)){
#Find the current last sheet
$LastSheet = $WB.Worksheets|Select -Last 1
#Make a new sheet before the current last sheet so it's near the end
$WS = $WB.worksheets.add($LastSheet)
#Name it
$WS.name = "$clients"
#Move the last sheet up one spot, making the new sheet the new effective last sheet
$LastSheet.Move($WS)
}
#Find the current client sheet regardless of if it existed before or not
$clientws = $WB.worksheets | where {$_.name -like "*$clients*"}
# Enter stat labels
$clientws.cells.item(1,1).value2 = "CPU Count"
$clientws.cells.item(2,1).value2 = "RAM"
$clientws.cells.item(3,1).value2 = "Reserved CPU"
$clientws.cells.item(4,1).value2 = "Reserved RAM"
### Put in Values in the next column ###
$clientws.cells.item(1,2).value2 = [int]($cstats.cpuAllocationGHz/2)
$clientws.cells.item(2,2).value2 = [decimal]$cstats.memoryLimitGB
$clientws.cells.item(3,2).value2 = [int]($cstats.rescpuAllocationGHz/2)
$clientws.cells.item(4,2).value2 = [decimal]$cstats.resmemoryLimitGB
Start-Sleep -Seconds 1
Echo "$clients excel sheet in monthly summary is done.."
}
$Exl = New-Object -ComObject "Excel.Application"
$Exl.Visible = $false
$Exl.DisplayAlerts = $false
$WB = $Exl.Workbooks.Open($excelmonthlysummary)
$clientxlmonthlywrite = Foreach ($client in $clientlist){
$cstats = $Combinedstats | Where {$_.Group -eq "$client"}
Excelclientstatstemplate -clients $client
}
$WB.Save()
$Exl.quit()
Stop-Process -processname EXCEL
The following PowerShell snippet will list all worksheets and named ranges in an excel spreadsheet via OleDbConnection.GetOleDbSchemaTable():
$file = "C:\Users\zippy\Documents\Foo.xlsx";
$cnStr = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=`"$($file)`";Extended Properties=`"Excel 12.0 Xml;HDR=YES`";";
$cn = New-Object System.Data.OleDb.OleDbConnection $cnStr;
$cn.Open();
# to list the sheets
$worksheets = $cn.GetOleDbSchemaTable([System.Data.OleDb.OleDbSchemaGuid]::Tables,$null);
$cn.Close();
$cn.Dispose();
$worksheets | Format-List;
This will not, however, list tables (called lists in Excel 2003) or a named range that refers to a list.
If I pass an OleDbSchemaGuid of type Procedures or Views I get a MethodInvocationException with a message of Operation is not supported for this type of object.
Is it possible to list the tables by tweaking the connection string or the restrictions parameter?
Try this simple source:
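// Requires: using System.Data.OleDb; using System.Linq;
using (var connection = (OleDbConnection)GetConnection())
{
    connection.Open();
    var dt = connection.GetSchema("Tables");
    // Project each schema row to its table name (Where needs a bool predicate, so use Select).
    var list = dt.Select().Select(row => row["TABLE_NAME"].ToString()).ToList();
    //TODO:
}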