Powershell partial string comparison

I'm currently stuck on a specific comparison problem. I have two CSV files which contain application names, and I need to compare both CSVs for matching names. Of course that would be easy if the applications were written the same way in both CSVs, but they're not.
Each CSV has two columns, but only the first column contains the application names. In csv01 an app is called "Adobe Acrobat Reader DC Continuous MUI" while the same app in csv02 is called "Adobe Acrobat Reader DC v2022.002.20191". By looking at the files, I know both contain "Adobe Reader DC". But I'd like to automate the comparison as the CSVs contain hundreds of apps.
I initially thought I'd run a nested foreach loop, taking the first product in csv01 and comparing every app in csv02 to that product to see if I have a match. I did that by splitting the application names at each space character and came up with the following code:
# Define the first string
$Products01 = Import-CSV 'C:\Temp\ProductsList01.csv' -Delimiter ";"
# Define the second string
$Products02 = Import-CSV 'C:\Temp\ProductList02.csv' -Delimiter ";"
# Flag to track if all parts of string2 are contained within string1
$allPartsMatch = $true
# Create Hashtable for results
$MatchingApps = @{}
# Loop through each part of string2
foreach ($Product in $Products01.Product) {
    Write-Host "==============================="
    Write-Host "Searching for product: $Product"
    Write-Host "==============================="
    # Split the product name into parts
    $ProductSplit = $Product -split " "
    Write-Host "Split $Product into $ProductSplit"
    foreach ($Application in $Products02.Column1) {
        Write-Host "Getting comparison app: $Application"
        # Split the product name into parts
        $ApplicationSplit = $Application -split " "
        Write-Host "Split comparison App into: $ApplicationSplit"
        # Check if the current part is contained within string1
        if ($ProductSplit -notcontains $ApplicationSplit) {
            # If the current part is not contained within string1, set the flag to false
            $allPartsMatch = $false
        }
    }
    # Display a message indicating the result of the comparison
    if ($allPartsMatch) {
        Write-Host "==============================="
        Write-Host "$Application is contained within $Product"
        Write-Host "==============================="
        $MatchingApps += @{Product01 = $Product; Product02 = $Application}
    } else {
        #Write-Host "$Application is not contained within $Product"
    }
}
However, I seem to have a logic error in my thought process as this returns 0 matches. So obviously, the script isn't properly splitting or comparing the split items.
My question is: how do I compare the parts of both app names to see if I have the apps in both CSVs? Can I use a specific regex for that, or do I need to approach the problem differently?
Cheers,
Fred
I tried comparing both csv files for similar product names. I expected a table of similar product names. I received nothing.

The basis for "matching" one string to another is that they share a prefix - so start by writing a small function that extracts the common prefix of 2 strings, we'll need this later:
function Get-CommonPrefix {
    param(
        [string]$A,
        [string]$B
    )
    # start by assuming the strings share no common prefix
    $prefixLength = 0
    # the maximum length of the shared prefix will at most be the length of the shortest string
    $maxLength = [Math]::Min($A.Length, $B.Length)
    for($i = 0; $i -lt $maxLength; $i++){
        if($A[$i] -eq $B[$i]){
            $prefixLength = $i + 1
        }
        else {
            # we've reached an index with two different characters, common prefix stops here
            break
        }
    }
    # return the shared prefix
    return $A.Substring(0, $prefixLength)
}
Now we can determine the shared prefix between two strings:
PS ~> $sharedPrefix = Get-CommonPrefix 'Adobe Acrobat Reader DC Continuous MUI' 'Adobe Acrobat Reader DC v2022.002.20191'
PS ~> Write-Host "The shared prefix is '$sharedPrefix'"
The shared prefix is 'Adobe Acrobat Reader DC '
Now we just need to put it to use in your nested loop:
# Import the first list
$Products01 = Import-CSV 'C:\Temp\ProductsList01.csv' -Delimiter ";"
# Import the second list
$Products02 = Import-CSV 'C:\Temp\ProductList02.csv' -Delimiter ";"
# now let's find the best match from list 2 for each item in list 1:
foreach($productRow in $Products01) {
    # we'll use this to keep track of shared prefixes encountered
    $matchDetails = [pscustomobject]@{
        Index = -1
        Prefix = ''
        Product2 = $null
    }
    for($i = 0; $i -lt $Products02.Count; $i++) {
        # for each pair, start by extracting the common prefix and see if we have a "better match" than previously
        $commonPrefix = Get-CommonPrefix $productRow.Product $Products02[$i].Product
        if($commonPrefix.Length -gt $matchDetails.Prefix.Length){
            # we found a better match!
            $matchDetails.Index = $i
            $matchDetails.Prefix = $commonPrefix
            $matchDetails.Product2 = $Products02[$i]
        }
    }
    if($matchDetails.Index -ge 0){
        Write-Host "Best match found for '$($productRow.Product)': '$($matchDetails.Product2.Product)' "
        # put code that needs to work on both rows here ...
    }
}
Note: in cases where multiple entries in the second list match the same-length prefix from the first list, the code simply picks the first match.
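A character-level prefix can also end in the middle of a word: 'Adobe Acrobat ...' and 'Adobe Audition ...' share the prefix 'Adobe A'. One possible variation, sketched here, counts shared leading words instead of characters. The function name Get-SharedWordCount and the sample names are assumptions for illustration and are not part of the answer above.

```powershell
# Hypothetical helper: counts how many leading words two names share.
# String comparison with -ne is case-insensitive by default in PowerShell.
function Get-SharedWordCount {
    param(
        [string]$A,
        [string]$B
    )
    $wordsA = $A -split '\s+'
    $wordsB = $B -split '\s+'
    $count = 0
    $max = [Math]::Min($wordsA.Count, $wordsB.Count)
    for ($i = 0; $i -lt $max; $i++) {
        # stop at the first position where the words differ
        if ($wordsA[$i] -ne $wordsB[$i]) { break }
        $count++
    }
    return $count
}

$sameApp  = Get-SharedWordCount 'Adobe Acrobat Reader DC Continuous MUI' 'Adobe Acrobat Reader DC v2022.002.20191'
$otherApp = Get-SharedWordCount 'Adobe Acrobat Reader DC Continuous MUI' 'Adobe Audition 2022'
"Shared words (same app):  $sameApp"    # 4
"Shared words (other app): $otherApp"   # 1
```

In the nested loop, such a count could replace the prefix length as the score, and a minimum (say, two shared words) could be required before a pair is reported as a match. The right threshold depends on the data.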

Related

Import multiples data of column from xlsx to a powershell cmd

Input:
I have a file \users\myself\desktop\test\file.xlsx containing multiple columns like this:
ColumnA ColumnB ... ColumnQ (for a total of 17 columns)
Each column has some data.
Output:
I would like to have a command like this:
New-ADUser -Name $(columnAdata) -GivenName "$(columnBdata)" -Surname "$(columnCdata)" -DisplayName "$(columnDdata)" -SamAccountName "$(columnEdata)" ... etc until -blabla "$(ColumnQdata)"
Is it possible to store the column data in variables and insert them into a command?
Thanks a lot.

I would suggest to first change the column headers to be the same as the parameters you intend to use with the New-ADUser cmdlet.
Having matching headers would help greatly in not making mistakes.
Next, save your Excel file as CSV, let's say a file called NewUsers.csv
The code then can be quite simple and easy to maintain:
# import the CSV file using the same separator character as Excel uses on your system
Import-Csv -Path 'X:\NewUsers.csv' -UseCulture | ForEach-Object {
    # use Splatting: create a Hashtable with all properties needed taken from the CSV
    # see: https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_splatting
    $userProperties = @{
        Name           = $_.Name        # as opposed to $_.columnAdata
        GivenName      = $_.GivenName   # as opposed to $_.columnBdata
        Surname        = $_.Surname
        DisplayName    = $_.DisplayName
        SamAccountName = $_.SamAccountName
        # etcetera
    }
    New-ADUser @userProperties
}
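With 17 columns, listing every property by hand gets long. If the headers really do match the parameter names, the hashtable can be built from whatever columns the row has. The sketch below shows the idea with a hypothetical stand-in function, New-DemoUser, and made-up inline CSV data so that it runs without the ActiveDirectory module. Both are assumptions for illustration only.

```powershell
# Hypothetical stand-in for New-ADUser so the sketch runs anywhere.
function New-DemoUser {
    param(
        [string]$Name,
        [string]$GivenName,
        [string]$Surname
    )
    "$Name ($GivenName $Surname)"
}

# Made-up CSV data whose headers match the parameter names exactly.
$rows = @'
Name,GivenName,Surname
jdoe,John,Doe
asmith,Anna,Smith
'@ | ConvertFrom-Csv

$created = foreach ($row in $rows) {
    # Copy every non-empty column into a hashtable keyed by the column header.
    $params = @{}
    foreach ($property in $row.PSObject.Properties) {
        if (-not [string]::IsNullOrWhiteSpace($property.Value)) {
            $params[$property.Name] = $property.Value
        }
    }
    # Splat the hashtable: each key becomes a named parameter.
    New-DemoUser @params
}
$created
```

Skipping empty cells avoids passing empty strings to parameters that may not accept them. A column whose header is not a valid parameter name would cause an error, which is why matching headers matter.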

Compare two Excel-files in Powershell

I need help comparing two Excel files in Powershell.
I have an Excel-file which contains 6 000 rows and 4-5 columns with headers:
"Number" "Name" "Mobile data".
Let's call it: $Services
Now, I want to compare that file with other Excel-files. For example:
one file containing 50 rows with header columns: "Number", "Name", etc.
Let's call it $Department
The important thing is that in $Services, it contains more important columns like "Mobile data",
so my mission is to compare column: "Number" from $Services with column "Number" from each other Excel file.
Then if they match, write "the whole row" from $Services
I'm not that familiar with Excel, so I thought, this should be possible to do in Powershell.
I'm novice in Powershell, so I only know basic stuff. I'm not that familiar with pscustomobject and param.
Anyway, what I tried to do was to first declare them in variables with ImportExcel:
$Services = Import-Excel -Path 'C:\Users\*.xlsx'
$Department = Import-Excel -Path 'C:\Users\*.xlsx'
Then I made a foreach statement:
foreach ($Service in $Services) {
if (($Service).Number -like ($Department).Number)
{Write-Output "$Service"}
}
The problem with this is that it is collecting all empty columns from ($Services).Number and writing the output of each row in $Services.
I tried to add a nullorEmpty to $Department, if the .Number is empty, but it didn't make any difference. I also tried to add that if the row is empty in .Number, add "1234", but still it collects all .Number that is empty in $Services.
I also tried to do a: $Services | ForEach-Object -Process {if (($_).Number -match ($Department).Number)
{Write-Output $_}} But it didn't match any. When I tried -notmatch it took all.
I don't know but it seems that I have to convert the files to objects, like the columns to object so each string becomes an object. But right now my head is just spinning and I need some hints on where I can start with this.

I would recommend downloading the Module ImportExcel from the PSGallery.
Import-Excel can easily import your Excel sheet(s) to rows of objects, especially if your sheets are 'clean', i.e., only contain (optional) headers and data rows.
Simply import the cells to PowerShell objects and use Compare-Object to discover differences.
EDIT (after reading the additional questions by poster in the comments):
To compare using specific properties you'll need to add these to the Compare-Object parameters.
Using a trivial "PSCustomObject" to create a simple set of objects to show this idea it might look like this:
$l = 1..4 | ForEach-Object { [pscustomobject]@{a=$_;b=$_+1} }
$r = 1,2,4,5 | ForEach-Object { [pscustomobject]@{a=$_;b=$_+1} }
compare-object $l $r -Property B
B SideIndicator
- -------------
6 =>
4 <=
You may also compare multiple properties this way:
compare-object $l $r -Property A,B
A B SideIndicator
- - -------------
5 6 =>
3 4 <=
FYI: I find myself typing "Get-Command -Syntax SomeCommand" so often every day that I just made a function "Get-Syntax" (which also expands aliases) and then aliased this to simply "syn".
90% of the time once you understand the structure of PowerShell cmdlets (at least well-written ones) there is no need to even look at the full help -- the "syntax" blocks are sufficient.
Until then, type HELP (Get-Help) a lot -- 100+ times per day. :)

So the solution for my whole problem was to add -PassThru.
Because my mission was to compare the numbers of the two Excel-files, select the numbers that equals and then take all the properties from one file. So my script became like this:
$Compare = Compare-Object $Services $Department -Property Numbers -IncludeEqual -ExcludeDifferent -PassThru
$Compare | Export-Excel -Path 'C:\Users\*
But I wonder, -PassThru sends all the objects from ReferenceObject, how can I send all the objects from DifferenceObject?
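One way to get whole rows from a chosen side without relying on -PassThru is to filter that side with Where-Object and the -in operator. The sketch below uses made-up sample rows in place of the imported Excel data. The property name Number follows the headers described in the question, and everything else is an assumption.

```powershell
# Made-up sample rows standing in for the Import-Excel results.
$Services = @(
    [pscustomobject]@{ Number = '1001'; Name = 'Alice'; 'Mobile data' = '5 GB' }
    [pscustomobject]@{ Number = '1002'; Name = 'Bob';   'Mobile data' = '10 GB' }
    [pscustomobject]@{ Number = $null;  Name = 'Empty'; 'Mobile data' = '' }
)
$Department = @(
    [pscustomobject]@{ Number = '1002'; Name = 'Bob' }
    [pscustomobject]@{ Number = $null;  Name = 'Blank row' }
)

# Collect the non-empty numbers of one file ...
$departmentNumbers = $Department.Number | Where-Object { $_ }

# ... and keep the complete rows of the other file whose Number is in that list.
$matchingServices = $Services | Where-Object {
    $_.Number -and $_.Number -in $departmentNumbers
}
$matchingServices
```

Swapping the roles of the two collections returns the matching rows from the other file. Filtering out empty Number values first keeps blank rows from matching each other.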

Split string in PowerShell by pattern

I have a fairly long string in PowerShell that I need to split. Each section begins with a date in format mm/dd/yyyy hh:mm:ss AM. Essentially what I am trying to do is get the most recent message in the string. I don't need to keep the date/time part as I already have that elsewhere.
This is what the string looks like:
10/20/2018 1:22:33 AM
Some message the first one in the string
It can be several lines long
With multiple line breaks
But this is still the first message in the string
10/21/2018 4:55:11 PM
This would be second message
Same type of stuff
But its a different message
I know how to split a string on specific characters, but I don't know how on a pattern like date/time.

Note:
The solution below assumes that the sections are not necessarily chronologically ordered, so you must inspect all time stamps to determine the most recent one.
If, by contrast, you can assume that the last message is the most recent one, use LotPings' much simpler answer.
If you don't know ahead of time what section has the most recent time stamp, a line-by-line approach is probably best:
$dtMostRecent = [datetime] 0
# Split the long input string ($longString) into lines and iterate over them.
# If input comes from a file, replace
#   $longString -split '\r?\n'
# with
#   Get-Content file.txt
# If the file is large, replace the whole command with
#   Get-Content file.txt | ForEach-Object { ... }
# and replace $line with $_ in the script block (loop body).
foreach ($line in $longString -split '\r?\n') {
    # See if the line at hand contains (only) a date.
    if ($dt = try { [datetime] $line } catch {}) {
        # See if the date at hand is the most recent so far.
        $isMostRecent = $dt -ge $dtMostRecent
        if ($isMostRecent) {
            # Save this time stamp as the most recent one and initialize the
            # array to collect the following lines in (the message).
            $dtMostRecent = $dt
            $msgMostRecentLines = @()
        }
    } elseif ($isMostRecent) {
        # Collect the lines of the message associated with the most recent date.
        $msgMostRecentLines += $line
    }
}
# Convert the message lines back into a single, multi-line string.
# $msgMostRecent now contains the multi-line message associated with
# the most recent time stamp.
$msgMostRecent = $msgMostRecentLines -join "`n"
Note how try { [datetime] $line } catch {} is used to try to convert a line to a [datetime] instance and fail silently, if it can't, in which case $dt is assigned $null, which in a Boolean context is interpreted as $False.
This technique works irrespective of the culture currently in effect, because PowerShell's casts always use the invariant culture when casting from strings, and the dates in the input are in one of the formats the invariant culture understands.
By contrast, the -as operator, whose use would be more convenient here - $dt =$line -as [datetime] - unexpectedly is culture-sensitive, as Esperento57 points out.
This surprising behavior is discussed in this GitHub issue.
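If relying on cast behavior feels too implicit, the expected format can also be spelled out with [datetime]::TryParseExact. This is a sketch of an alternative and not part of the answer above. The format string 'M/d/yyyy h:mm:ss tt' is an assumption based on the sample time stamps in the question.

```powershell
# Assumed format, derived from sample lines such as '10/21/2018 4:55:11 PM'.
$format  = 'M/d/yyyy h:mm:ss tt'
$culture = [cultureinfo]::InvariantCulture
$styles  = [System.Globalization.DateTimeStyles]::None

$results = foreach ($line in '10/21/2018 4:55:11 PM', 'This would be second message') {
    $dt = [datetime]::MinValue
    # TryParseExact returns $true and fills $dt only if the whole line matches the format.
    if ([datetime]::TryParseExact($line, $format, $culture, $styles, [ref] $dt)) {
        "Timestamp: $($dt.ToString('s'))"
    } else {
        "Text line: $line"
    }
}
$results
# Timestamp: 2018-10-21T16:55:11
# Text line: This would be second message
```

Because the culture is passed explicitly, the result does not depend on the regional settings of the machine running the script.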

Provided the [datetime] sections are ascending,
it should be sufficient to split on them with a RegEx and get the last one
((Get-Content .\test.txt -Raw) -split "\d+/\d+/\d{4} \d+:\d+:\d+ [AP]M`r?`n")[-1]
Output based on your sample string stored in file test.txt
This would be second message
Same type of stuff
But its a different message

You can split it by the timestamp pattern like this (\r?\n also covers Windows-style line endings):
$arr = $str -split "[0-9]{1,2}/[0-9]{1,2}/[0-9]{1,4} [0-9]{1,2}:[0-9]{1,2}:[0-9]{1,2} [AaPp]M\r?\n"
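For reference, here is a self-contained sketch of the split-by-pattern approach applied to a shortened copy of the sample from the question. The variable names and the (?m)^ anchoring are choices made for this illustration. Splitting produces an empty first element, because the string starts with a time stamp, so empty elements are filtered out.

```powershell
# Shortened sample input, taken from the question.
$longString = @'
10/20/2018 1:22:33 AM
Some message the first one in the string
It can be several lines long
10/21/2018 4:55:11 PM
This would be second message
Same type of stuff
But its a different message
'@

# (?m) lets ^ match at the start of every line; \r?\n handles both line-ending styles.
$pattern = '(?m)^\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2}:\d{2} [AP]M\r?\n'

# @() keeps the result an array even if only one message remains.
$messages = @($longString -split $pattern | Where-Object { $_ })

# The last element is the last message; TrimEnd() removes a trailing line break.
$lastMessage = $messages[-1].TrimEnd()
$lastMessage
```

This only yields the most recent message if the sections are in ascending order, as the earlier answer points out.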

To my knowledge you can't use any of the static String methods like Split() for this. I tried to find a regular expression that would handle the entire thing, but wasn't able to come up with anything that would quite break it up properly.
So, you'll need to go line by line, testing to see if that line is a date, then concatenate the lines in between, like the following:
$fileContent = Get-Content "inputFile.txt"
$messages = @()
$currentMessage = [string]::Empty
foreach($line in $fileContent)
{
    if ([Regex]::IsMatch($line, "\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2}:\d{2} (A|P)M"))
    {
        # The current line is a date, the current message is complete
        # Add the current message to the output, and clear out the old message
        # from your temporary storage variable $currentMessage
        if (-not [string]::IsNullOrEmpty($currentMessage))
        {
            $messages += $currentMessage
            $currentMessage = [string]::Empty
        }
    }
    else
    {
        # Add this line to the message you're building.
        # Include a new line character, as it was stripped out with Get-Content
        $currentMessage += "$line`n"
    }
}
# Add the last message to the output
$messages += $currentMessage
# Do something with the message
Write-Output $messages
As the key to all of this is recognizing that a given line is a date and therefore the start of a message, let's look a bit more at the regex. "\d" will match any decimal character 0-9, and the curly braces immediately following indicate the number of decimal characters that need to match. So, "\d{1,2}" means "look for one or two decimal characters" or in this case the month of the year. We then look for a "/", 1 or 2 more decimal characters - "\d{1,2}", another "/" and then exactly 4 decimal characters - "\d{4}". The time is more of the same, with ":" in between the decimal characters instead of "/". At the end, there will either be "AM" or "PM" so we look for either an "A" or a "P" followed by an "M", which as a regular expression is "(A|P)M".
Combine all of that, and you get "\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2}:\d{2} (A|P)M" to determine if you have a date on that line. I believe it would also be possible to use [DateTime]::Parse() to determine if the line is a date, but then you wouldn't get to have fun with regexes and would need a try-catch. For more info on regexes in PowerShell (which are just .NET regexes), see the .NET Regex Quick Reference.
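One detail worth noting: [Regex]::IsMatch reports a match anywhere in the line, so a message line that merely mentions a date would also be treated as a section start. Anchoring the pattern with ^ and $ is one way to avoid that. The sample lines below are made up for illustration.

```powershell
$pattern  = '\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2}:\d{2} (A|P)M'
# Anchored variant: the whole line must be a time stamp.
$anchored = '^' + $pattern + '$'

$lines = '10/20/2018 1:22:33 AM',
         'Some message',
         'Restarted at 10/20/2018 1:22:33 AM by admin'

$report = foreach ($line in $lines) {
    [pscustomobject]@{
        Line       = $line
        Unanchored = [regex]::IsMatch($line, $pattern)
        Anchored   = [regex]::IsMatch($line, $anchored)
    }
}
$report | Format-Table -AutoSize
```

The third line matches the unanchored pattern but not the anchored one, so only the anchored pattern treats it as message text.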

Powershell string manipulation and replace

I have a PowerShell string which can contain multiple email addresses; for example, below is an example that contains two email IDs. I have two scenarios.
1) Scenario one, where the @gmail.com is consistent
strEmail=john.roger@gmail.com,smith.david@gmail.com
2) Scenario two, where the domain could be different
strEmail2 = john.roger@gmail.com,smith.david@outlook.com
I need to get rid of the @ and anything after it.
So the result for
scenario (1) will be: john.roger,smith.david
scenario (2) will be: john.roger,smith.david
For scenario (1) I can use replace with the "hardcoded" value "@gmail.com". How about the second scenario?
I am looking for a solution which will work for both scenarios, such as a regex or any other way I don't know of.

Splitting and joining returns the names on one line.
The following
$strEmail = "john.roger@gmail.com,smith.david@outlook.com"
($strEmail -split "," | % {($_ -split "@")[0]}) -join ","
returns
john.roger,smith.david
Breakdown
$strEmail -split "," returns an array of two elements
[0] john.roger@gmail.com
[1] smith.david@outlook.com
% {($_ -split "@")[0]} loops over the array
and splits each item into an array of two elements
[0] john.roger
[1] gmail.com
[0] smith.david
[1] outlook.com
and returns the first element [0] from each array
-join "," joins each returned item into a new string

Both of these should work.
This will print each name on a new line:
$strEmail = "john.roger@gmail.com,smith.david@outlook.com"
$strEmail = $strEmail.Split(',') | Foreach {
    $_.Substring(0, $_.IndexOf('@'))
}
$strEmail
This will give you the same output as you outlined above:
$strEmail = "john.roger@gmail.com,smith.david@outlook.com"
$strEmailFinal = ""
$strEmail = $strEmail.Split(',') | Foreach {
    $n = $_.Substring(0, $_.IndexOf('@'))
    $strEmailFinal = $strEmailFinal + $n + ","
}
$strEmailFinal.TrimEnd(',')

Another approach... Well, if you like RegEx of course
Clear-Host
$SomeEmailAddresses = @'
1) Scenario One where the @gmail.com is consistent
strEmail=john.roger@gmail.com,smith.david@gmail.com
2) Secnario second: where the @mail.com could be different
strEmail2 = john.roger@gmail.com,smith.david@outlook.com
'@
((((Select-String -InputObject $SomeEmailAddresses `
    -Pattern '\w+@\w+\.\w+|\w+\.\w+@\w+\.\w+|\w+\.\w+@\w+\.\w+\.\w+' `
    -AllMatches).Matches).Value) -replace '@.*') -join ','
Results
john.roger,smith.david,john.roger,smith.david
Just comment out or delete the -join for one per line
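When the input is already a clean comma-separated list, as in the question, a single -replace can also do the job without splitting first. This is a minimal sketch under that assumption. The pattern '@[^,]*' removes each @ together with everything up to the next comma or the end of the string.

```powershell
$strEmail  = 'john.roger@gmail.com,smith.david@outlook.com'

# Remove '@' plus all following characters that are not a comma.
$namesOnly = $strEmail -replace '@[^,]*', ''
$namesOnly   # john.roger,smith.david
```

An entry without an @ is left unchanged, so mixed lists do not cause errors.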

Find first available serial based on list of strings?

Given a list of strings such as: apple01, apple02, and apple04, banana02, cherry01, how would you come up with the first available serial number of each type -- that is, apple03 if I ask about apple, or banana01 if I ask about banana, and cherry02 if I ask about cherry?
I'm tasked with automating the creation of Azure VM's, and these strings are actually host names of existing VM's, as reported by the Azure Powershell command (Get-AzureRmResourceGroupDeployment -ResourceGroupName "$azureResGrpName2").DeploymentName (or anything effectively similar).
Update: Here's my working code:
$rgdNames = (Get-AzureRmResourceGroupDeployment -ResourceGroupName "$azureResGrpName").DeploymentName
$siblings = $rgdNames | Where-Object{$_ -match "^($hostname)(\d+)$" }
if ($siblings) {
    # Make a list of all numbers in the list of hostnames
    $serials = @()
    foreach ($sibling in $siblings) {
        # $sibling -split ($sibling -split '\d+$') < split all digits from end, then strip off everything at the front
        # Then convert it to a number and add that to $serials
        $hostnumber = [convert]::ToInt32([string]$($sibling -split ($sibling -split '\d+$'))[1], 10)
        $serials += $hostnumber
    }
    foreach ($i in 1..$siblingsMax){ # Iterate over all valid serial numbers
        if (!$serials.Contains($i)) { # Stop when we find a serial number that isn't in the list of existing hosts
            $serial = $i
            break
        }
    }
} else {
    $serial = 1
}

This seems to do the trick!
## Create variable to store number in
$spareNumber = $null
## we are presuming that they have already been separated into groups
$apples = @("apples01","apples002","apples3","apples15")
## create an empty array
$num = @()
## You know what a foreach is right? ;)
foreach ($apple in $apples)
{
    ## the hard working part
    ## [convert]::ToInt32 converts to, you guessed it... (and adds it to $num)
    ## $apple -split ($apple -split '\d+$') < split all digits from end, then strip off everything at the front
    $num += [convert]::ToInt32([string]$($apple -split ($apple -split '\d+$'))[1], 10)
}
## count from 1 to 10 with a foreach statement
## (break inside a ForEach-Object pipeline would not simply exit the loop)
foreach ($i in 1..10) {
    ## when we find one that isn't in $num, store it in $spareNumber and break out of this joint.
    if (!$num.Contains($i)) {
        $spareNumber = $i
        break
    }
}
## and here is your number...
$spareNumber

The naive and straightforward solution would be based on pre-generating a list of syntactically valid names.
Query the names that are already in use and store the results in a collection.
Iterate over the used-names collection and remove those names from the pre-generated collection.
Sort the pre-generated collection.
Now you have a collection that contains the unused names in sorted order. Pick any number of desired names for further use.
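The steps above can be sketched as follows. The prefix, the two-digit padding, and the upper bound of 99 are assumptions chosen to fit the sample names from the question. In practice the used names would come from the Azure query.

```powershell
$prefix    = 'apple'
# Names already in use (sample data from the question).
$usedNames = 'apple01', 'apple02', 'apple04', 'banana02', 'cherry01'

# 1. Pre-generate all syntactically valid names: apple01 .. apple99.
$candidates = 1..99 | ForEach-Object { '{0}{1:00}' -f $prefix, $_ }

# 2. Remove the names that are already taken and sort what is left.
$freeNames = $candidates | Where-Object { $_ -notin $usedNames } | Sort-Object

# 3. Pick the first free name.
$freeNames[0]   # apple03
```

Because the numbers are zero-padded to the same width, sorting the strings alphabetically also sorts them numerically.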

This could be the start of what you are looking for. I took a very similar approach as PetSerAl has done in his comment. I have made mine more verbose as it helps to take in what is happening here. Most of the explanation comes from comments in the code.
# Array of fruits/index values. Using string array as proof of concept.
$fruits = "apple01","apple02","apple04","banana02","cherry01"
# Regex -match the fruit name and index number. This also filters out lines that do not match the standard.
$fruityArray = $fruits | Where-Object{$_ -match '^(.+?)(\d+)$' } | ForEach-Object{
    # Create a simple object that splits up the matched info into a fruit index object
    [pscustomobject][ordered]@{
        Fruit = $matches[1]
        Index = [int]$matches[2]
    }
}
# Group by fruit and then we can find the next available index within the groups
$fruityArray | Group-Object Fruit | ForEach-Object{
    # For this fruit determine the next available index
    $thisFruitGroup = $_
    # Determine the highest index value. We add one in case all indexes are present from 1 to highest
    $highestPossibleIndex = ($thisFruitGroup.Group.Index | Measure-Object -Maximum).Maximum + 1
    # Check all possible indexes. Locating all that are free but filter the first one out
    $nextAvailableIndex = 1..$highestPossibleIndex | Where-Object{$_ -notin $thisFruitGroup.Group.Index} | Select -First 1
    # Take the fruit and add a padded index then pass that value to the pipeline.
    '{0}{1:00}'-f $thisFruitGroup.Name, $nextAvailableIndex
}
We take the array of fruits and create an object array of fruit and indexes. Group those together by fruit and then determine the next available index based on all available indexes for that fruit. We add one to the highest possible index on the chance that they are all in use (no gaps). This is the case for cherry.
apple03
banana01
cherry02
Alternatively you could output the results to a variable and call the fruit you need from there if you don't need the whole list as output.
# Group by fruit and then we can find the next available index within the groups
$availableIndexes = $fruityArray | Group-Object Fruit | ForEach-Object{
    # For this fruit determine the next available index
    $thisFruitGroup = $_
    # Determine the highest index value. We add one in case all indexes are present from 1 to highest
    $highestPossibleIndex = ($thisFruitGroup.Group.Index | Measure-Object -Maximum).Maximum + 1
    # Check all possible indexes. Locating all that are free but filter the first one out
    $nextAvailableIndex = 1..$highestPossibleIndex | Where-Object{$_ -notin $thisFruitGroup.Group.Index} | Select -First 1
    # Take the fruit and add a padded index then pass that value to the pipeline.
    [pscustomobject][ordered]@{
        Fruit = $thisFruitGroup.Name
        Index = $nextAvailableIndex
        String = '{0}{1:00}'-f $thisFruitGroup.Name, $nextAvailableIndex
    }
}
$availableIndexes | Where-Object{$_.Fruit -eq "Apple"} | Select-Object -ExpandProperty String
Which would net the output of:
apple03

If each name starts with pure letters (no digits) followed by a number, this is easy: find the index of the first character that is a digit, i.e. one of {0,1,2,3,4,5,6,7,8,9},
and then take Substring(0, indexOf(firstDigit)) to get the name part.
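In PowerShell this idea could look like the following sketch, which uses the .NET method String.IndexOfAny to locate the first digit. The sample name is taken from the question. The variable names are illustrative.

```powershell
$name   = 'apple04'
$digits = [char[]]'0123456789'

# Index of the first digit, or -1 if the name contains no digit at all.
$firstDigit = $name.IndexOfAny($digits)

if ($firstDigit -ge 0) {
    $prefix = $name.Substring(0, $firstDigit)     # everything before the first digit
    $serial = [int]$name.Substring($firstDigit)   # the numeric part as a number
    "Prefix: $prefix, serial: $serial"            # Prefix: apple, serial: 4
}
```

This only works if the name part itself never contains digits. For names like 'web2server01', a regex anchored to the trailing digits, as used in the other answers, is more robust.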
