-match fails when checking if a string contains a path - string

I have a list of strings, and need to check each item to see if it contains some string $path, where the string should contain a unc path and $path is also a unc path.
For example:
"RW \\test" -match "\\test"
returns True, as \\test is contained in RW \\test. Great.
So why does this return False ? :
"RW \\test\te" -match "\\test\te"
At first I thought maybe the single backslash is somehow acting as an escape character (even though in PowerShell that should be `).
So I tried
"RW \\test\\te" -match "\\test\\te"
But this also returns False ....
Why?

You need to escape both of the backslashes with backslashes in your regular expression on the right-hand side of the -match operator.
PS /> "RW \\test\te" -match "\\\\test\\te"
True
Here's what the result looks like:
PS /> $matches[0]
\\test\te
You could also expand on this to use named captures in regular expressions. Named captures just give friendly names to individual captures inside of a regular expression, making them more easily referenced as a property on the $matches variable, instead of a numeric index.
PS /> "RW \\test\te" -match "(?<UNCPath>\\\\test\\te)"
True
PS /> $matches.UNCPath
\\test\te
Keep in mind that the backtick character is used to escape certain special characters in PowerShell double-quoted strings. However, in the case of the -match operator, you're invoking the .NET regular expression engine. In the .NET regex engine, the backslash is used to escape special characters in the regex context. Hence, in this example, the backtick escape character isn't applicable.
Also, make sure that you are not escaping special characters in your source string, on the left-hand side of the -match operator. The reason that your final example doesn't match, is because you added a second \, but only escaped a single \ in the regex on the right-hand side of the -match operator.

To complement Trevor Sullivan's helpful answer with a tip provided by PetSerAl in a comment on the question:
To use a string as a literal in a regex context, pass it to [regex]::Escape():
PS> "RW \\test\te" -match [regex]::Escape("\\test\te")
True
[regex]::Escape() conveniently escapes all characters that have special meaning in a regex with escape character \, so that the string is matched as a literal:
PS> [regex]::Escape("\\test\te")
\\\\test\\te
Note how the \ instances were each escaped with \, effectively doubling them.
If your string does use regex constructs but also contains characters with special meaning in regexes that you want to be treated as literals, you must \-escape them individually:
PS> '***' -match '\**' # match zero or more (*) '*' chars (\*)
True

Somewhat orthogonal, but for matching paths, you might find the -like operator easier to use. It supports wildcards instead of regular expressions so you could write your example as
"RW \\test\te" -like "*\\test\te"
Note that the leading '*' on the RHS is required: wildcard patterns are "anchored" (they have to match the whole string). Regular expressions are unanchored by default and only have to match a fragment of the string.

Related

Escape special characters for regex pattern in Powershell

I'm writing a Powershell script to check for a list of passwords that meet a specific password policy. In this case at least 7 characters, at least 1 upper case letter, at least 1 lower case letter, and a special character to include white space. This is the regex I currently have:
(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[~!@#$%^&*_\-+=`|\\\(\)\{\}\[\]:;"'<>,.?\/\s\/])[A-Za-z\d[~!@#$%^&*_\-+=`|\\\(\)\{\}\[\/\]:;"'<>,.?\s]{7,}$
I've tested the pattern on regex101 with some password strings that match the above stated policy and it works. Where I'm getting lost is when I plug the pattern into Powershell, Powershell is seeing the quotes/apostrophes as such, instead of characters to search for in the regex pattern. How do I go about escaping these characters so Powershell knows to include them in the regex pattern?
Doing this with a single regex makes for a complex and hard to read regex. Make several smaller tests, and they are easier to read - and you can provide a good error message because you can tell which one failed:
at least 7 characters
$pass.Length -ge 7
at least 1 upper case letter
$pass -cmatch '[A-Z]' (cmatch is case sensitive)
at least 1 lower case letter
$pass -cmatch '[a-z]'
and a special character to include white space.
$pass -match '\W' (\W is not word characters; not a letter or digit)
There is also [regex]::Escape($Text) which will escape characters in a string that could be interpreted by the regex engine as patterns. You would still need to handle quotes and backticks when writing the $Text variable so that the PowerShell string processor does not get confused; use a single quoted string and you only need to escape single quotes inside it.
Do note that NIST password guidelines recommend against this kind of password complexity testing, and instead recommend only:
at least 12 characters.
checked a list of passwords found in breaches, rejected if it's one of those.
PowerShell has no special syntax for representing regex literals - they are simply represented as string literals.
The simplest solution, which doesn't require escaping (of quote characters, ` or $) in your regex, is to use a verbatim here-string:
# The middle line is your original regex, as-is.
$regex = @'
(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[~!@#$%^&*_\-+=`|\\\(\)\{\}\[\]:;"'<>,.?\/\s\/])[A-Za-z\d[~!@#$%^&*_\-+=`|\\\(\)\{\}\[\/\]:;"'<>,.?\s]{7,}$
'@
Note that no characters (other than whitespace) may follow the opening delimiter, @', and that the closing delimiter, '@, must be on its own line, at the very start of that line (not even whitespace may precede it).
The alternative is to use a regular (single-line) verbatim (single-quoted) string ('...'), in which case the only character you need to escape is ' itself, namely as '':
# Note how both embedded instances of ' are escaped as ''
$regex = '(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[~!@#$%^&*_\-+=`|\\\(\)\{\}\[\]:;"''<>,.?\/\s\/])[A-Za-z\d[~!@#$%^&*_\-+=`|\\\(\)\{\}\[\/\]:;"''<>,.?\s]{7,}$'

concatenating variables inside linux commands in perl

I need to use a system command (grep) which has a variable concatenated with a string as the regex to search a file.
Is it possible to concatenate a regex for grep between a variable and string in Perl??
I have tried using the . operator but it doesn't work.
if(`grep -e "$fubname._early_exit_indicator = 1" $golden_path/early_exit_information.tsv`){
print "-D- The golden data indicates early exit should occur\n";
$golden_early_exit_indicator=1;
}
Expected to match the regex, "$fubname._early_exit_indicator = 1" but it doesn't match as required.
The expected result should be:
-D- The golden data indicates early exit should occur
But in the present code, it doesn't print this.
Output link: (https://drive.google.com/open?id=1N0SaZ-r3bYPlljKUgTOH5AbxCAaHw7zD)
The problem is that . is not recognized as an operator inside quotes. The dot operator is used between strings, not inside them; a dot inside a string is inserted literally. This literal dot in the pattern causes the grep command in your code to fail to match.
Also note that inside quotes, Perl interpolates variables using certain identifier parsing rules.
See perldoc perlop for the different types of quoting that are used in Perl, and see perldoc perldata for information about the identifier parsing rules.
In summary, in order to interpolate the variable $fubname in the backticks argument, use
"${fubname}_early_exit_indicator = 1"
Note that we need braces around the identifier, since the following underscore is a valid identifier character. (By contrast, a literal dot is not a valid identifier character, so if the following character were a literal dot, you would not need the braces around the identifier.)
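The shell that eventually runs the grep command follows the same brace rule, so the effect can be checked without Perl. This is a minimal sketch; the value assigned to fubname is a hypothetical one chosen for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical value, for illustration only.
fubname="core"

# Without braces, the shell looks up a variable named
# fubname_early_exit_indicator, which is unset and expands to nothing.
without_braces="$fubname_early_exit_indicator = 1"

# With braces, the variable name ends at } and the suffix stays literal.
with_braces="${fubname}_early_exit_indicator = 1"

printf '[%s]\n' "$without_braces"
printf '[%s]\n' "$with_braces"
```

The first line printed contains only " = 1" between the brackets, while the second contains the full pattern that grep is expected to receive.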
The . operator will not work inside the quotes. Use something like this:
if(`grep -e "${fubname}_early_exit_indicator = 1" ...
I hope this works

BASH script matching a glob at the beginning of a string

I have folders in a directory with names giving specific information. For example:
[allied]_remarkable_points_[treatment]
[nexus]_advisory_plans_[inspection]
....
So I have a structure similar to this: [company]_title_[topic]. The script has to match the file naming structure to variables in a script in order to extract the information:
COMPANY='[allied]';
TITLE='remarkable points'
TOPIC='[treatment]'
The folders do not contain a constant number of characters, so I can't use indexed matching in the script. I managed to extract $TITLE and $TOPIC, but I can't manage to match the first string since the variable brings me back the complete folders name.
FOLDERNAME=${PWD##*/}
This is the line that is giving me grief:
COMPANY=`expr $FOLDERNAME : '\(\[.*\]\)'`
I tried to avoid the greedy behaviour by placing ? in the regular expression:
COMPANY=`expr $FOLDERNAME : '\(\[.*?\]\)'`
but as soon as I do that, it returns nothing
Any ideas?
expr isn't needed for regular-expression matching in bash.
[[ $FOLDERNAME =~ (\[[^]]*\]) ]] && COMPANY=${BASH_REMATCH[1]}
Use [^]]* instead of .* to do a non-greedy match of the bracketed portion. A bigger regular expression can capture all three parts (the title itself may contain underscores, so it is matched with .* rather than [^_]*):
[[ $FOLDERNAME =~ (\[[^]]*\])_(.*)_(\[[^]]*\]) ]] && {
COMPANY=${BASH_REMATCH[1]}
TITLE=${BASH_REMATCH[2]}
TOPIC=${BASH_REMATCH[3]}
}
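As a rough, self-contained sketch of the same idea, the match can be wrapped in a function and tried against one of the sample folder names from the question. The function name parse_folder, the anchors ^ and $, and the conversion of underscores to spaces in the title are assumptions added for illustration, not part of the answer above.

```shell
#!/usr/bin/env bash
# parse_folder is a hypothetical helper; on success it sets COMPANY, TITLE, TOPIC.
parse_folder() {
  # Keeping the pattern in a variable avoids quoting surprises inside [[ ]].
  local re='^(\[[^]]*\])_(.*)_(\[[^]]*\])$'
  if [[ $1 =~ $re ]]; then
    COMPANY=${BASH_REMATCH[1]}
    TITLE=${BASH_REMATCH[2]//_/ }   # assumption: underscores become spaces
    TOPIC=${BASH_REMATCH[3]}
  else
    return 1
  fi
}

if parse_folder '[allied]_remarkable_points_[treatment]'; then
  printf '%s | %s | %s\n' "$COMPANY" "$TITLE" "$TOPIC"
fi
```

Storing the regex in a variable and expanding it unquoted on the right-hand side of =~ is a common way to keep the brackets and backslashes readable.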
Bash has built-in string manipulation functionality.
for f in *; do
company=${f%%\]*}
company=${company#\[} # strip off leading [
topic=${f##*\[} # keep what follows the last [
topic=${topic%\]} # strip off trailing ]
:
done
The construct ${variable#wildcard} removes any prefix matching wildcard from the value of variable and returns the resulting string. Doubling the # obtains the longest possible wildcard match instead of the shortest. Using % selects suffix instead of prefix substitution.
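The four variants can be compared side by side on one of the folder names from the question. This is only an illustrative sketch; the variable names are made up for the example.

```shell
#!/usr/bin/env bash
f='[allied]_remarkable_points_[treatment]'

shortest_prefix=${f#*_}    # drop through the first underscore
longest_prefix=${f##*_}    # drop through the last underscore
shortest_suffix=${f%_*}    # drop from the last underscore onward
longest_suffix=${f%%_*}    # drop from the first underscore onward

printf '%s\n' "$shortest_prefix" "$longest_prefix" "$shortest_suffix" "$longest_suffix"
```

Here the wildcard is *_ or _*, so the single and doubled operators differ only in which underscore they stop at.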
If for some reason you do want to use expr, the reason your non-greedy regex attempt doesn't work is that this syntax is significantly newer than anything related to expr. In fact, if you are using Bash, you should probably not be using expr at all, as Bash provides superior built-in features for every use case where expr made sense, once in the distant past when the sh shell did not have built-in regex matching and arithmetic.
Fortunately, though, it's not hard to get non-greedy matching in this isolated case. Just change the regex to not match on square brackets.
COMPANY=`expr "$FOLDERNAME" : '\(\[[^][]*\]\)'`
(The closing square bracket needs to come first within the negated character class; in any other position, a closing square bracket closes the character class. Many newbies expect to be able to use backslash escapes for this, but that's not how it works. Notice also the addition of double quotes around the variable.)
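To see the difference, the greedy pattern from the question and the bracket-excluding pattern can be run against the same sample name. A small sketch, assuming a POSIX expr is available; the variable names greedy and bounded are made up for the example.

```shell
#!/usr/bin/env bash
FOLDERNAME='[allied]_remarkable_points_[treatment]'

# .* is greedy, so the match runs to the last ] in the name.
greedy=$(expr "$FOLDERNAME" : '\(\[.*\]\)')

# [^][]* cannot cross a bracket, so the match stops at the first ].
bounded=$(expr "$FOLDERNAME" : '\(\[[^][]*\]\)')

printf '%s\n' "$greedy" "$bounded"
```

The first result is the whole folder name, which is the behaviour described in the question; the second is just the company part.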
If you're not averse to using grep, then:
COMPANY=$(grep -Po '^\[.*?\]' <<< "$FOLDERNAME")

substitution in string

I have the following string ./test
and I want to replace it with test
so, I wrote the following in perl:
my $t =~ s/^.//;
however, that replaces ./test with /test
can anyone please suggest how I fix it so I get rid of the / too. thanks!
$t =~ s/^\.\///;
You need to escape the dot and the slash. Also drop the my: my $t =~ ... declares a new, empty $t instead of modifying the existing one.
The substitution is s/match/replace/. If you erase, it's s/match//. You want to match "starts with a dot and a slash", and that's ^\.\/.
The dot doesn't do what you expect - rather than matching a dot character, it matches any character because of its special treatment. To match a dot and a forward slash, you can rewrite your expression as follows:
$t =~ s|^\./||;
Note that you are free to use a different character as a delimiter, in order not to confuse it with any such characters inside the regular expression.
If you want to get rid of ./ then you need to include both of those characters in the regex.
s/^\.\///;
Both . and / have special meanings in this expression (. is a regex metacharacter meaning "any character" and / is the delimiter for the s/// operator) so we need to escape them both by putting a \ in front of them.
An alternative (and, in my opinion, better) approach to the / issue is to change the character that you are using as the s/// delimiter.
s|^\./||;
This is all documented in perldoc perlop.
You have to use a backslash before the dot and the forward slash: s/\.\///;
The backslash is used to write symbols that otherwise would have a different meaning in the regular expression.
You need to write $t =~ s/^\.\///; (Note that the period needs to be escaped in order to match a literal period rather than any character). If that's too many slashes, you can also change the delimiter, writing instead, e.g., $t =~ s:^\./::;.
$t=q(./test);$t=~s{^\./}{};print $t;
You need to escape the dot if you want it to match a dot. Otherwise it matches any character. You can choose alternate delimiters --- best when dealing with forward slashes lest you get the leaning-toothpick look when you otherwise need to escape those too.
Note that the dot in your question is matching any character, not a literal '.'.
my $t = './test';
$t =~ s{\./}{};
use feature 'say';
use Path::Class qw( file );
say file("./test")->cleanup();
Path::Class

PowerShell using variables in strings passed as parameters

I have several php files in directory, I want to replace a few words in all files with different text. It's a part of my code:
$replacements_table=
("hr_table", "tbl_table"),
('$users', "tbl_users")
foreach ($file in $phpFiles){
foreach($replacement in $replacements_table){
(Get-Content $file) | Foreach-Object{$_ -replace $replacement} | Set-Content $file
}
}
It works fine for replacing "hr_table", but doesn't work at all for '$users'. Any suggestion would be nice
The string is actually a regular expression and so needs to be escaped using '\'. See this thread
$replacements_table= ("hr_table", "tbl_table"), ('\$users', "tbl_users")
will work.
The dollar sign is a special regular expression character that matches the end of a string, so you need to escape it. Escaping a character in a regex is done by putting a '\' in front of the character you want to escape. A safer method to escape characters (especially when you don't know if the string might contain special characters) is to use the Escape method.
$replacements_table= ('hr_table', 'tbl_table'), ([regex]::Escape('$users'), 'tbl_users')
Try escaping '$' with a backslash: '\$users'
The $ symbol tells the regular expression to match at the end of the string. The backslash is the regular expression escape character.
try using double quotes around your variable name instead of single quotes
EDIT
Try something along these lines ....
$x = $x.Replace($originalText, '$user')
