For example, I'm looking to determine the following logic:
if (B2 in A1:B20) # if cell B2 is within the range A1:B20
{
return $true
}
Is there a function within Excel that can be used for something like this? I read about the =COUNTIF() function but was not able to get it working. Again, this is using an Excel COM object within PowerShell.
Thanks
Since cell names are basically coordinates, this is purely a question of arithmetic comparison, no need to involve Excel itself:
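function Test-CellInRange
{
    param(
        [ValidatePattern('^[A-Z]+\d+$')]
        [string]$Cell,
        [ValidatePattern('^[A-Z]+\d+\:[A-Z]+\d+$')]
        [string]$Range
    )
    # Column letters must be compared as numbers (A=1, Z=26, AA=27), not as strings,
    # otherwise "AA" would sort before "B"
    $ToColumnNumber = {
        param([string]$Letters)
        $n = 0
        foreach ($ch in $Letters.ToUpper().ToCharArray()) { $n = $n * 26 + ([int]$ch - 64) }
        $n
    }
    # Grab X and Y coordinates from Range input, sort in ascending order (low to high)
    $P1,$P2 = $Range -split ':'
    $Xpoints = (& $ToColumnNumber ($P1 -replace '\d')),(& $ToColumnNumber ($P2 -replace '\d')) | Sort-Object
    # Cast row numbers to [int] so that 3 compares as less than 20
    $Ypoints = ([int]($P1 -replace '\D')),([int]($P2 -replace '\D')) | Sort-Object
    # Grab X and Y coordinate from cell
    $CellX = & $ToColumnNumber ($Cell -replace '\d')
    $CellY = [int]($Cell -replace '\D')
    # Test whether cell coordinates are within range
    return ($CellX -ge $Xpoints[0] -and $CellX -le $Xpoints[1] -and $CellY -ge $Ypoints[0] -and $CellY -le $Ypoints[1])
}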
Use it like:
if(Test-CellInRange -Cell B2 -Range A1:B20){
"B2 is in A1:B20"
}
I'm not sure about the COM interface (never used it), but if you have access to the INTERSECT Method then you could write something like this:
If Not Application.Intersect(Range("B2"), Range("A1:B20")) Is Nothing Then
CODE_IF_TRUE
End If
It just does a set intersection of the two ranges. If they don't intersect, then the one is definitely not a subset of the other. If you need to check for a proper subset, you'd have to get more creative and check whether the intersection was the same as the entire desired subset. Remember your set logic and check out the UNION Method - between these things you should be able to handle any sort of operations you want.
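The intersection and subset checks described above can also be sketched without Excel, for plain rectangular A1-style addresses. The function names below (ConvertTo-RangeBounds, Get-RangeIntersection, Test-RangeSubset) are hypothetical helpers made up for this illustration, not Excel or COM APIs, and the sketch assumes single-area ranges without sheet names or $ signs:

```powershell
# Hypothetical helper: turn 'B2' or 'A1:B20' into numeric bounds
function ConvertTo-RangeBounds {
    param([string]$Address)
    $cols = @(); $rows = @()
    foreach ($part in ($Address.ToUpper() -split ':')) {
        # Column letters to number: A=1, Z=26, AA=27
        $n = 0
        foreach ($ch in ($part -replace '\d').ToCharArray()) { $n = $n * 26 + ([int]$ch - 64) }
        $cols += $n
        $rows += [int]($part -replace '\D')
    }
    $c = $cols | Measure-Object -Minimum -Maximum
    $r = $rows | Measure-Object -Minimum -Maximum
    [pscustomobject]@{
        Left = [int]$c.Minimum; Right  = [int]$c.Maximum
        Top  = [int]$r.Minimum; Bottom = [int]$r.Maximum
    }
}

# Hypothetical helper: rectangle shared by both ranges, or $null if they do not overlap
function Get-RangeIntersection {
    param([string]$First, [string]$Second)
    $p = ConvertTo-RangeBounds $First
    $q = ConvertTo-RangeBounds $Second
    $left   = [math]::Max($p.Left,   $q.Left)
    $right  = [math]::Min($p.Right,  $q.Right)
    $top    = [math]::Max($p.Top,    $q.Top)
    $bottom = [math]::Min($p.Bottom, $q.Bottom)
    if ($left -gt $right -or $top -gt $bottom) { return $null }
    [pscustomobject]@{ Left = $left; Right = $right; Top = $top; Bottom = $bottom }
}

# Hypothetical helper: $Inner is a subset of $Outer when their intersection equals $Inner
function Test-RangeSubset {
    param([string]$Inner, [string]$Outer)
    $i = ConvertTo-RangeBounds $Inner
    $x = Get-RangeIntersection $Inner $Outer
    if ($null -eq $x) { return $false }
    $x.Left -eq $i.Left -and $x.Right -eq $i.Right -and $x.Top -eq $i.Top -and $x.Bottom -eq $i.Bottom
}

Test-RangeSubset -Inner 'B2' -Outer 'A1:B20'
Test-RangeSubset -Inner 'A1:C5' -Outer 'A1:B20'
```

The second call overlaps A1:B20 without being contained in it, which is the case where a bare "do they intersect" test would give a misleading answer.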
I have an excel file of new-hires for the company.
Firstly I need to hide all the columns that will be used for searching users.
That was pretty simple and I managed to do it. Now I'm left only with the columns I really need.
Now is the real problem:
I need to filter the data and then import those usernames to the PowerShell array.
So in excel it looks like this:
Then I have the function:
Function GetUsernames ($WorkSheet) {
$userName = $WorkSheet.UsedRange.Rows.Columns["UserNameColumn"].Value2
return $userName
}
But it's returning all of the records in the Username column - 651 records instead of 476.
The function is waiting for my input after I format the excel file manually.
Any directions will be appreciated! :)
What you seek is all the values from rows in a certain column that are not hidden in the Excel file.
To get those, you need to go through the Rows of the selected column.
In my Office version 2016, I cannot reference a column directly by its name, so I have extended your function to first find the column index.
Also, I have renamed the function a bit to follow the Verb-Noun convention in PowerShell
function Get-Usernames ($WorkSheet, $Column) {
    # for me (office 2016) I cannot reference a Column by its name
    # using Columns["UserNameColumn"], so I have to find the index first
    $index = 0
    if ($Column -is [int]) {
        $index = $Column
    }
    else {
        for ($col = 1; $col -le $WorkSheet.UsedRange.Columns.Count; $col++) {
            $name = $WorkSheet.Cells.Item(1, $col).Value() # assuming the first row has the headers
            if ($name -eq $Column) {
                $index = $col
                break
            }
        }
    }
    if ($index -gt 0) {
        # now return the values in the columns for the rows that are not hidden
        # skip the first row, because that is the column name itself
        ($WorkSheet.UsedRange.Rows.Columns($index).Rows | Select-Object -Skip 1 | Where-Object { !$_.hidden }).Value2
    }
}
You can now use the function in your script like this:
$userNames = Get-Usernames $workbook.Worksheets(1) "UserNameColumn"
I'm working on a PS script to take a row of data from an Excel spreadsheet and populate that data in certain places in a Word document. To elaborate, we have a contract tracking MASTER worksheet that among other things contains data such as name of firm, address, services, contact name. Additionally, we have another TASK worksheet in the same workbook that tracks information such as project owner, project name, contract number, task agree number.
I'm writing a script that does the following:
Ask the user through a message box what kind of contract is being written ("Master", or "Task")
Opens the workbook with the appropriate worksheet opened ("Master" tab or "Task" tab)
Asks the user through a VB InputBox from which Excel row of data they want to use to populate the Word contract
Extracts that row of data from Excel
Outputs certain portions of that row of data to certain location in a Word document
Saves the Word document
Opens the Word document so the user can continue editing it
My question is this - using something like PSExcel, how do I extract that row of data out to variables that can be placed in a Word document. For reference, in case you're going to reply with a snippet of code, here are what the variables are defined as for the Excel portion my script:
$Filepath = "C:\temp\ContractScript\Subconsultant Information Spreadsheet.xlsx"
$Excel = New-Object -ComObject Excel.Application
$Workbook = $Excel.Workbooks.Open($Filepath)
$Worksheet = $Workbook.sheets.item($AgreementType)
$Excel.Visible = $true
#Choosing which row of data
[int]$RowNumber = [Microsoft.VisualBasic.Interaction]::InputBox("Enter the row of data from $AgreementType worksheet you wish to use", "Row")
Additionally, the first row of data in the excel worksheets are the column headings, in case it matters.
I've gotten this far so far:
Import-Module PSExcel
$Consultant = New-Object System.Collections.ArrayList
foreach ($data in (Import-XLSX -Path $Filepath -Sheet $AgreementType -RowStart $RowNumber))
{
    $Consultant.Add($data)
}
But I'm currently stuck because I can't figure out how to reference the data being added to $consultant.$data. Somehow I need to read in the column headings first so the $data variable can be defined in some way, so when I add the variable $consultant.Address in Word it finds it. Right now I think the variable name is going to end up "$Consultant.1402 S Broadway" which obviously won't work.
Thanks for any help. I'm fairly new to powershell scripting, so anything is much appreciated.
I have the same issue, and searching online for solutions is a royal PITA.
I'd love to find a simple way to loop through all of the rows like you're doing.
$myData = Import-XLSX -Path "path to the file"
foreach ($row in $myData.Rows)
{
$row.ColumnName
}
But sadly something logical like that doesn't seem to work. I see examples online that use ForEach-Object and Where-Object which is cumbersome. So any good answers to the OP's question would be helpful for me too.
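For what it's worth, here is a minimal sketch of the row-by-row access being asked for. It assumes the import emits one object per data row with properties named after the header row, the way Import-Csv does; whether Import-XLSX behaves identically in your PSExcel version is an assumption to verify. ConvertFrom-Csv is only a stand-in so the snippet runs without a spreadsheet, and the column names and values are made up:

```powershell
# Stand-in data: ConvertFrom-Csv plays the role of a header-aware import here.
# Firm, Address and Contact are hypothetical column headings.
$rows = @'
Firm,Address,Contact
Acme Engineering,1402 S Broadway,Jane Doe
Borealis Survey,77 Pine St,John Roe
'@ | ConvertFrom-Csv

# Each element is one data row; iterate the collection itself (there is no .Rows property)
foreach ($row in $rows) {
    '{0} is located at {1}' -f $row.Firm, $row.Address
}

# Picking one row by its spreadsheet row number: the headers occupy row 1,
# so spreadsheet row 2 is index 0 of the imported collection
$RowNumber  = 3
$Consultant = $rows[$RowNumber - 2]
$Consultant.Address
```

With the headers read from row 1, the property is called Address and its value is the street, so $Consultant.Address can be dropped into the Word document directly.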
UPDATE:
Matthew, thanks for coming back and updating the OP with the solution you found. I appreciate it! That will help in the future.
For my current project, I went about this a different way since I ran into lack of good examples for Import-XLSX. It's just quick code to do a local task when needed, so it's not in a production environment. I changed var names, etc. to show an example:
$myDataField1 = New-Object Collections.Generic.List[String]
$myDataField2 = New-Object Collections.Generic.List[String]
# ...
$myDataField10 = New-Object Collections.Generic.List[String]
# PSExcel is a third-party module; you may need to install it first
Import-Module PSExcel
# Get spreadsheet, workbook, then sheet
try
{
$mySpreadsheet = New-Excel -Path "path to my spreadsheet file"
$myWorkbook = $mySpreadsheet | Get-Workbook
$myWorksheet = $myWorkbook | Get-Worksheet -Name Sheet1
}
catch {
# whatever error handling code you want
}
# calculate total number of records
$recordCount = $myWorksheet.Dimension.Rows
$itemCount = $recordCount - 1
# specify column positions
$r, $my1stColumn = 1, 1
$r, $my2ndColumn = 1, 2
# ...
$r, $my10thColumn = 1, 10
if ($recordCount -gt 1)
{
# loop through all rows and get data for each cell's value according to column
for ($i = 1; $i -le $recordCount - 1; $i++)
{
$myDataField1.Add($myWorksheet.Cells.Item($r + $i, $my1stColumn).text)
$myDataField2.Add($myWorksheet.Cells.Item($r + $i, $my2ndColumn).text)
# ...
$myDataField10.Add($myWorksheet.Cells.Item($r + $i, $my10thColumn).text)
}
}
#loop through all imported cell values
for ([int]$i = 0; $i -lt $itemCount; $i++)
{
# use the data
$myDataField1[$i]
$myDataField2[$i]
# ...
$myDataField10[$i]
}
I have two computers, one with windows7 and one with windows10. Both computers use Excel 15.0.4753.1003.
The following script fails on Windows10:
function write-toexcelrange(){
param(
#The range should be a cell in the upper left corner where you want to "paste" your data
[ValidateNotNullOrEmpty()]
$Range,
# data should be in the form of a jagged multiarray ("row1column1","row1column2"),("row2column1","row2column2")
# if data is a simple array of values, it will be interpreted as 1 column with multiple rows
# Rows can differ in length
[validatenotnullorempty()]
[array]$data
)
$rows=0
$cols=0
if($data -is [array]) {
foreach($row in $data){
$rows++
$cols=[math]::max($cols,([array]$row).length)
}
#Create multiarray
$marr=new-object 'string[,]' $rows,$cols
for($r=0;$r -lt $marr.GetLength(0);$r++) {
for($c=0;$c -lt $marr.GetLength(1);$c++) {
$marr[$r,$c]=[string]::Empty
}
}
for($r=0;$r -lt $rows;$r++) {
if($data[$r] -is [array]){
for($c=0;$c -lt ([array]$data[$r]).length;$c++) {
$marr[$r,$c]=$data[$r][$c].ToString()
}
} else {
$marr[$r,0]=$data[$r].ToString()
}
}
$wrr=$range.resize($rows,$cols)
$wrr.value2=$marr
} else {
$wrr=$range
$wrr.value2=$data
}
#Return the range written to
$wrr
}
$excel = New-Object -ComObject Excel.Application
$excel.visible = $true
$defaultsheets=$excel.SheetsInNewWorkbook
$excel.SheetsInNewWorkbook=1
$wb = $Excel.Workbooks.add()
$excel.SheetsInNewWorkbook=$defaultsheets
$mysheet = $wb.worksheets.item(1)
$mysheet.name = "test"
write-toexcelrange -Range $mysheet.range("A1") -data $exceldata|out-null
With the following error:
Unable to cast object of type 'System.String[,]' to type 'System.String'.
At C:\data\rangetest.ps1:38 char:9
+ $wrr.value2=$marr
+ ~~~~~~~~~~~~~~~~~
+ CategoryInfo : OperationStopped: (:) [], InvalidCastException
+ FullyQualifiedErrorId : System.InvalidCastException
It appears as if the value2 property behaves differently in Windows10, which is weird considering it's the same version of Excel.
Now to the question:
Is there a fix/workaround for getting the data into the cells that does not involve looping through all the cells?
Update 1
It was suggested by Grade 'Eh' Bacon that I try the .Formula property. It works! I also noticed that Windows10 uses PowerShell v5 while my Windows7 has PowerShell v4.
Since that worked for you I'll flesh it out as an answer. To summarize, pay attention to the differences between .text, .value, .value2, and .formula [or .formulaR1C1]. See discussion of the first 3 here:
What is the difference between .text, .value, and .value2?
And discussion of .Formula here:
Can Range.Value2 & Range.Formula have different values in C#, not VBA?
Without getting into why any of these can have different values (in short, formatting and other metadata can have an impact on some of those options in different ways, depending on what type of entry is made to a given cell), after reading those Q&As above, I just always use Formula when referring to what's inside a cell. In most cases, that's what you likely want VBA to look at anyway. Changing .value2 to .formula seems to work here, although I have no idea why that would be the case between Windows versions.
I wonder if there is any way to speed up reading an Excel file with powershell. Many would say I should stop using the do until, but the problem is I need it badly, because in my Excel sheet there can be 2 rows or 5000 rows. I understand that 5000 rows needs some time. But 2 rows shouldn't need 90sec+.
$Excel = New-Object -ComObject Excel.Application
$Excel.Visible = $true
$Excel.DisplayAlerts = $false
$Path = EXCELFILEPATH
$Workbook = $Excel.Workbooks.open($Path)
$Sheet1 = $Workbook.Worksheets.Item(test)
$URows = @()
Do {$URows += $Sheet1.Cells.Item($Row,1).Text; $row = $row + [int] 1} until (!$Sheet1.Cells.Item($Row,1).Text)
$URows | foreach {
$MyParms = @{};
$SetParms = @{};
And I have this 30 times in the script too:
If ($Sheet1.Cells.Item($Row,2).Text){$var1 = $Sheet1.Cells.Item($Row,2).Text
$MyParms.Add("PAR1",$var1)
$SetParms.Add("PAR1",$var1)}
}
I have the idea of running the $MyParms stuff concurrently, but I have no idea how. Any suggestions?
Or
Increase the speed of reading, but I have no clue how to achieve that without destroying the "read until nothing is there".
Or
The speed is normal and I shouldn't complain.
Don't use Excel.Application in the first place if you need speed. You can use an Excel spreadsheet as an ODBC data source - the file is analogous to a database, and each worksheet a table. The speed difference is immense. Here's an intro on using Excel spreadsheets without Excel
Appending to an array with the += operator is terribly slow, because it will copy all elements from the existing array to a new array. Use something like this instead:
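$URows = for ($row = 1; $Sheet1.Cells.Item($row, 1).Text; $row++) {
    # fresh hashtables for each row, as in the original foreach body
    $MyParms  = @{}
    $SetParms = @{}
    if ($Sheet1.Cells.Item($row, 2).Text) {
        $MyParms['PAR1']  = $Sheet1.Cells.Item($row, 2).Text
        $SetParms['PAR1'] = $Sheet1.Cells.Item($row, 2).Text
    }
    # echo the cell text; the loop output is collected into $URows
    $Sheet1.Cells.Item($row, 1).Text
}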
Your Do loop is basically a counting loop. The canonical form for such loops is
for (init counter; condition; increment counter) {
...
}
so I changed the loop accordingly. Of course you'd achieve the same result like this:
$row = 1
$URows = Do {
...
$row += 1
} until (!$Sheet1.Cells.Item($row, 1).Text)
but that would just mean more code without any benefits. This modification doesn't have any performance impact, though.
Relevant in terms of performance are the other two changes:
I moved the code filling the hashtables inside the first loop, so the code won't loop twice over the data. Using index and assignment operators instead of the Add method for assigning values to the hashtable prevents the code from raising an error when a key already exists in the hashtable.
Instead of appending to an array (which has the abovementioned performance impact) the code now simply echoes the cell text in the loop. PowerShell automatically collects that output into an array, which is then assigned to the variable $URows.
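Both points can be tried without Excel. The following is only a small sketch with made-up stand-in data (the $cells values and the 'first'/'second' strings are hypothetical); it shows the two ways of building the result and the difference between Add and index assignment on a hashtable:

```powershell
# Stand-in for the cell texts of column 1; no Excel involved
$cells = 1..5 | ForEach-Object { "row$_" }

# Slow pattern: every += allocates a new array and copies the old elements into it
$slow = @()
foreach ($c in $cells) { $slow += $c }

# Faster pattern: let the loop emit its values and assign the loop output once
$fast = foreach ($c in $cells) { $c }

# Hashtables: Add() throws when the key already exists, index assignment overwrites
$parms = @{}
$parms.Add('PAR1', 'first')
$addFailed = $false
try { $parms.Add('PAR1', 'second') } catch { $addFailed = $true }
$parms['PAR1'] = 'second'
```

With five elements the difference is not measurable, but the copying in the first pattern grows with every appended element, which is what hurts on thousands of rows.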
I have an xlsx file with thousands of entries
I can within a second filter a column to show only certain information with $workbook.AutoFilter("DATA")
The filter only takes a second; however, deleting all rows whose first column = "DATA" takes forever with a loop.
Is there a way to capture an array of the hidden rows or a range... or anything that I could .DELETE()
I tried this
[void] [Reflection.Assembly]::LoadWithPartialName( 'System.Windows.Forms' )
$Excel = New-Object -Com Excel.Application
$WorkBook = $Excel.Workbooks.Open($filename)
$Excel.visible = $true
$Excel.selection.autofilter(1,"DATA")
$sheet = $workbook.Sheets.Item(1)
$max = $sheet.UsedRange.Rows.Count
for ($i=2; $i -le $max; $i++)
{
$row = $sheet.Cells.Item($i,1).EntireRow
if ($row.hidden -eq $false)
{
$row.Delete()
}
}
FIXED: loop backwards ($i--).
However, this failed me miserably, because for some reason it leaves roughly 10% of the visible rows undeleted. If I run it twice it works; however, when scaling up this would become a bigger issue.
In a perfect world I would like something like this
$Excel.selection.autofilter(1,"DATA").DELETE()
Thanks in advance for any hints or tricks you geniuses may have.
Update: Thanks Graimer, you are right, I have to loop in the other direction. This still takes quite some time with 10,000+ entries... I am looking for a way to do it without the manual loop.
If I set $Excel.visible = $true and then $Excel.selection.autofilter(1,"DATA"), I can press Ctrl+A as a user and delete the selected rows. It's quicker manually than the looping process. I can't help but think there MUST be some way to script that action.
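The rows left behind by the forward loop are most likely a side effect of the indices shifting: once a row is deleted, the next row moves up into the same index and the loop steps over it. A small sketch with a generic list standing in for the worksheet rows (no Excel involved, the values are made up) shows the effect:

```powershell
# Two identical lists stand in for the worksheet; 'DATA' marks the rows to delete
$forward  = New-Object 'System.Collections.Generic.List[string]'
$backward = New-Object 'System.Collections.Generic.List[string]'
foreach ($v in 'DATA','DATA','keep','DATA','DATA') { $forward.Add($v); $backward.Add($v) }

# Forward: after RemoveAt($i) the next item slides into position $i and is never checked
for ($i = 0; $i -lt $forward.Count; $i++) {
    if ($forward[$i] -eq 'DATA') { $forward.RemoveAt($i) }
}

# Backward: removing an item only shifts positions that were already visited
for ($i = $backward.Count - 1; $i -ge 0; $i--) {
    if ($backward[$i] -eq 'DATA') { $backward.RemoveAt($i) }
}

"forward leaves:  $($forward -join ', ')"
"backward leaves: $($backward -join ', ')"
```

Only consecutive matching rows are affected in the forward case, which would explain why a fraction of the rows survives and a second run cleans up most of the rest.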
Turned out to be pretty easy
After applying a filter, select a range from row 1 to the last row and delete that range.
Because the filter shows only that one value, the deletion only affects the visible rows; the hidden rows are not part of what gets deleted.
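A rough sketch of what that could look like through the COM object follows. It is untested here and rests on several assumptions: Excel is installed, $filename points at the workbook, row 1 holds headers (so the range starts at row 2 rather than row 1), and the used range starts in row 1 so that its row count equals the last row number:

```powershell
# Assumes Excel is installed and $filename holds the path to the workbook
$Excel    = New-Object -ComObject Excel.Application
$WorkBook = $Excel.Workbooks.Open($filename)
$sheet    = $WorkBook.Sheets.Item(1)

# Show only the rows whose first column equals "DATA"
[void]$sheet.UsedRange.AutoFilter(1, 'DATA')

# One range from row 2 down to the last used row; with the filter active,
# deleting it should remove only the visible (matching) rows
$lastRow = $sheet.UsedRange.Rows.Count
[void]$sheet.Range("A2:A$lastRow").EntireRow.Delete()

# Switch the filter off again so the remaining rows are shown
$sheet.AutoFilterMode = $false
```

If no row matches the filter, the delete call may fail or do nothing, so a real script would want to check for that case first.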