How to parse string in powershell

I have a Powershell command that outputs multiple lines.
I want to output only one line that contains the name of a .zip file.
Currently, all lines are returned when substring .zip is found:
$p.Start() | Out-Null
$p.WaitForExit()
$output = $p.StandardOutput.ReadToEnd()
$output += $p.StandardError.ReadToEnd()
foreach($line in $output)
{
if($line.Contains(".zip"))
{
$line
}
}

Since you're using .ReadToEnd(), $output receives a single, multi-line string, not an array of lines.
You must therefore split that string into individual lines yourself, using the -split operator.
You can then apply a string-comparison operator such as -match or -like directly to the array of lines to extract matching lines:
# Sample multi-line string.
$output = @'
line 1
foo.zip
another line
'@
$output -split '\r?\n' -match '\.zip' # -> 'foo.zip'
-split is regex-based, and regex \r?\n matches newlines (line breaks) of either variety (CRLF, as typical on Windows, as well as LF, as typical on Unix-like platforms).
-match is also regex-based, which is why the . in \.zip is \-escaped, given that . is a regex metacharacter (it matches any character other than LF by default).
Note that -match, like PowerShell in general, is case-insensitive by default, so both foo.zip and foo.ZIP would match, for instance;
if you do want case-sensitivity, use -cmatch.
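To make the case-sensitivity point concrete, here is a minimal sketch. The file names are invented sample data, and the `$` anchor is an addition that restricts the match to the end of each line.

```powershell
# Sample lines (made up for illustration).
$lines = 'foo.zip', 'BAR.ZIP', 'notes.txt', 'myzipfile'

# -match is case-insensitive: both archive names pass the filter.
$insensitive = $lines -match '\.zip$'

# -cmatch is case-sensitive: only the lowercase extension passes.
$sensitive = $lines -cmatch '\.zip$'

# -like uses wildcard patterns instead of regex; it is also case-insensitive.
$wildcard = $lines -like '*.zip'
```

With an array on the left-hand side, each of these operators acts as a filter and returns the matching elements rather than a Boolean.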
As an aside:
I wonder why you're running your command via a [System.Diagnostics.Process] instance, given that you seem to be invoking synchronously while capturing its standard streams.
PowerShell allows you to do that much more simply by direct invocation, optionally with redirection:
$output = ... 2>&1
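As a rough sketch of the direct-invocation approach: the script block below is only a stand-in for the real external command (an assumption for illustration), and its output lines are invented. Lines that arrive via the error stream are error records rather than strings, so they are converted to strings before filtering.

```powershell
# A script block stands in for the real external command (assumption for illustration).
$command = {
    'extracting...'
    Write-Error 'warning: skipped readme.txt'
    'created backup.zip'
}

# 2>&1 merges the error stream into the success stream;
# "$_" turns any error records into plain strings.
$output = & $command 2>&1 | ForEach-Object { "$_" }

# $output is already an array of lines, so no -split is needed.
$zipLines = $output -match '\.zip'
```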

Related

Powershell multiple string replacement using while cycle

I am trying to solve a somewhat weird problem: I need to replace strings within a raw content by strings from the same content that meet a certain matching criteria. The input data look like this:
apple-beta
apple-alpha_orange-beta
apple-alpha_orange-alpha_cherry-beta
apple-alpha_orange-alpha_kiwi-beta
apple-alpha_orange-alpha_mango-beta
abcd-alpha_efgh-beta
abcd-alpha_efgh-alpha_ijkl-beta
abcd-alpha_efgh-alpha_mnop-beta
The replacement should work as follows: look for all "-beta" strings in the content and delete all corresponding "-alpha" strings (e.g. because there is "orange-beta" already, all "orange-alpha" should be deleted; because there is "apple-beta" already, all "apple-alpha" should be deleted, etc.). The result would look like this:
apple-beta
_orange-beta
__cherry-beta
__kiwi-beta
__mango-beta
abcd-alpha_efgh-beta
abcd-alpha__ijkl-beta
abcd-alpha__mnop-beta
I have tried to achieve this with a number of awkward single replacements and temporary file storages as well as with a while-construction that doesn't work at all:
$whileinput = get-content -raw C:\content-input.txt
while ($whileinput -match "\w+-beta") {
$fullval = $whileinput -match "\w+-beta" -replace "-beta","-alpha"
$whileinput = $whileinput -replace '$fullval',''
}
Any help is very appreciated!
Daniel

I would find all your beta items. Then replace the corresponding alpha items.
$data = Get-Content C:\content-input.txt
$betas = ([regex]::Matches($data,'[^_]*?(?=-beta)').Value -ne '' | Foreach-Object {
[regex]::Escape($_)} ) -join '|'
$data -replace "($betas)-alpha"
Explanation:
[regex]::Matches().Value returns only the matched texts.
[^_]*? lazily matches consecutive characters that are not _. (?=-beta) is a positive lookahead for the text -beta but doesn't include the text in the match.
-ne '' is to filter out blank output.
[regex]::Escape() is not necessarily needed in this case. But it is good practice when your text may have special regex characters that you want to match literally.
$betas contains | delimited items because | is the regex OR. Using () to surround the $betas string allows one of those words to be fully matched before matching -alpha in the replacement.
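The same idea can be tried on in-memory data. This is a sketch under a few assumptions: the input is a shortened copy of the sample from the question, the names are collected line by line, and `[^_]+` is used instead of `[^_]*?` so that no empty matches need filtering. The variable names `$names` and `$alternation` are made up for this example.

```powershell
# Shortened sample input from the question, split into lines.
$text = @'
apple-beta
apple-alpha_orange-beta
apple-alpha_orange-alpha_cherry-beta
abcd-alpha_efgh-beta
abcd-alpha_efgh-alpha_ijkl-beta
'@
$data = $text -split '\r?\n'

# Collect every name that appears directly before "-beta" on any line.
$names = $data |
    ForEach-Object { [regex]::Matches($_, '[^_]+(?=-beta)') } |
    ForEach-Object { $_.Value }

# Build the alternation, e.g. apple|cherry|efgh|ijkl|orange.
$alternation = ($names | Sort-Object -Unique | ForEach-Object { [regex]::Escape($_) }) -join '|'

# Remove "<name>-alpha" for every name that has a "-beta" entry.
$result = $data -replace "($alternation)-alpha"
```

Because there is no "abcd-beta" in the input, "abcd-alpha" is left in place, which matches the expected output in the question.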

Get-Content gets the entire contents of a file into a variable, so if anything in your file matches that pattern, it'll loop infinitely (because the contents of the file always match your pattern).
PowerShell is heavily based around the concept of the "pipeline" which you can use in conjunction with the Foreach-Object cmdlet to iterate over each line in a file.
I'm not quite clear on what you want the regexes to do, but I don't think the ones you have will do what you want. Try this.
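$out = @()
Get-Content C:\content-input.txt | Foreach-Object {
    # Without -Raw, Get-Content emits one line at a time.
    if ($_ -match 'beta$') {
        # Collect each processed line as a separate array element.
        $out += $_ -replace '\w+-alpha', ''
    }
}
$out | Out-File .\path-to-output.txt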
$_ is the default "pipeline variable" aka the current item in the iteration - in this case the current line. Now at least your loop is working!

Log Parsing via Powershell - print all array elements after nth element

I'm parsing a log file that is space delimited for the first 7 elements and then a log message or sentence follows. I know just enough to get around in PS, and I'm learning more each day, so I'm not sure this is the best way to do this and apologies if I'm not leveraging a more efficient means that would be second nature to you. I'm using -split(' ')[n] to extract each field of the log file line by line. I'm able to extract the first parts fine as they are space-delimited, but I'm not sure how to get the rest of the elements up to the end of the line.
$logFile=Get-Content $logFilePath
$dateStamp=$logfile -split(' ')[0]
$timeStamp=$logfile -split(' ')[1]
$requestID=$logfile -split(' ')[3]
$binaryID=$logfile -split(' ')[4]
$logID=$logfile -split(' ')[5]
$action=$logfile -split(' ')[6]
$logMessage=$logfile -split(' ')[?]
This is not a CSV that I can import. I'm more familiar with string manipulation in bash so I am able to successfully replace spaces in the first 7 elements, and the end, with "," :
#!/bin/bash
inputFile="/cygdrive/c/Temp/logfile.log"
outputFile="/cygdrive/c/Temp/test_log.csv"
echo "\"DATE\",\"TIME\",\"HYPEN\",\"REQUESTID\",\"BINARY\",\"PROC_NUMBER\",\"MESSAGE\"" > $outputFile
while read -a line
do
arrLength=$(echo ${#line[@]})
echo \"${line[0]}\",\"${line[1]}\",\"${line[2]}\",\"${line[3]}\",\"${line[4]}\",\"${line[5]}\",\"${line[@]:6:$arrLength}\"
done < $inputFile >> $outputFile
Can you help either printing the array elements from position n to the end, or replacing the spaces appropriately in PS so I have a CSV that I can import? Just trying to avoid the two-step process of converting it in bash, then importing it in PS but I'm still researching. I did find this post Parsing Text file and placing contents into an Array Powershell
for importing the file assuming it's space-delimited and that works for the first 7 elements but not sure about everything after that.
Of course I welcome any other PS solutions such as one of those [something]::SOMETHING things I've seen by googling that might do all this much more seamlessly.

You can specify the maximum number of substrings in which the string is split like this:
$splittedRow = $logfile.split(' ',8)
$dateStamp=$splittedRow[0]
$timeStamp=$splittedRow[1]
$requestID=$splittedRow[3]
$binaryID=$splittedRow[4]
$logID=$splittedRow[5]
$action=$splittedRow[6]
$logMessage=$splittedRow[7]
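For a single line, the effect of the count argument can be seen in this small sketch. The log line is invented sample data with seven space-delimited fields followed by a free-text message.

```powershell
# Invented sample line: seven space-delimited fields followed by a free-text message.
$line = '2021-03-04 10:15:30 - REQ123 app.exe 4321 START Service started without errors'

# Split on single spaces into at most 8 parts;
# the 8th part keeps the rest of the line, spaces included.
$parts = $line.Split(' ', 8)

$dateStamp  = $parts[0]
$action     = $parts[6]
$logMessage = $parts[7]
```

For this input, the operator form `$line -split ' ', 8` produces the same eight parts. When the log is read with Get-Content, the split has to be applied to each line in a loop rather than to the whole array at once.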

As an addition to Viktor Be's answer:
$data = "111 22222 333 4444444 5 6 77 888888 9999999 0" #this is the content of file below for testing purposes
#$data = get-content -path C:\temp\mytest.txt
foreach ($line in $data) {
    $splitted = $line.split(' ', 8)
    $line_output = ""
    for ($i = 0; $i -lt 7; $i++) {
        $line_output += "$($splitted[$i]);"
    }
    $line_output += $splitted[7]
    $line_output | out-file "C:\temp\MyCsvThatPowershellCanRead.csv" -append
}

You should be able to iterate over each line in the logfile and get the information you need the way you are doing. However, it's easy to grab the message field, which could include n number of spaces in the log message with a regular expression.
The following regex should work for you. Assuming $line is the current line you are on:
$line -match '(?<=(\S+\s+){6}).*'
$logMessage = $matches[0]
The way this expression works is that it looks for .* (which means any character 0 or more times) that comes after 6 occurrences of non-whitespace characters followed by whitespace characters. The .* in this expression should match on your log message.
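A quick way to check the lookbehind is to run it against one sample line. The line below is invented and has six space-delimited fields before the message, matching the `{6}` in the pattern; adjust the count if your log has a different number of leading fields.

```powershell
# Invented sample line: six fields, then the free-text message.
$line = '2021-03-04 10:15:30 - REQ123 app.exe 4321 the service started'

# Using -match inside if() avoids writing True/False to the output.
if ($line -match '(?<=(\S+\s+){6}).*') {
    # $Matches[0] holds everything after the sixth field.
    $logMessage = $Matches[0]
}
```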

PowerShell Replace Specific String in File When multiple matches exist

Problem
I am trying to modify a file by replacing a very specific substring within a file; however, in this particular instance, the file contains two lines that are nearly identical.
This line is in an AssemblyInfo file and I am trying to replace the value of the AssemblyVersion. See below:
$CurrentVersion = '1.0.*'
$NewVersion = '1.0.7.1'
# Two similar lines:
// [assembly: AssemblyVersion("1.0.*")] # I want to ignore this line
[assembly: AssemblyVersion("1.0.*")] # I want to target this value
I have been trying several different approaches to this, each with varying results.
$Assembly = 'C:\path\to\AssemblyInfo.cs'
$regex = '(?<!\/\/ \[assembly:AssemblyVersion\(")(?<=AssemblyVersion\(")[^"]*'
$regex2 = ('`n\[assembly: AssemblyVersion\("'+$CurrentVersion+'"\)\]')
Attempt 001
(GC $Assembly) |
ForEach-Object { $_.Replace($CurrentVersion, $NewVersion) } |
Set-Content $Assembly
This was an obvious failure. It ends up replacing both instances of '1.0.*'
Attempt 002
GC $Assembly |
Select-String -Pattern '^\[assembly' -AllMatches |
ForEach-Object { $_.Replace($CurrentVersion, $NewVersion) } |
Set-Content $Assembly
This ended with incompatible command issues...
Attempt 003
(GC $Assembly) | ForEAch-Object {
If ( $_ -MATCH $CurrentVersion ) {
ForEach ($Line in $_) {
$_.Replace($CurrentVersion, $NewVersion)
}
}
} |
Set-Content $Assembly
This ended up removing all lines that contained // as the starting characters... which was not what I wanted...
Attempt 004
GC $Assembly |
ForEach-Object {
$_.Replace($regex2, ('[assembly: AssemblyVersion("'+$NewVersion+'")]'))
} |
Set-Content $Assembly
I get an error saying the file is in use... but that didn't make sense as I couldn't find anything using it...
I have tried several other paths as well, most variations of the above 4 in hopes of achieving my goal. Even going so far as to target the line using the regex line provided above and variations of it to try and grab
Question
Using PowerShell, how can I replace only the line/value of the target line (ignoring the line that begins with // ), but still keep all the lines in the file upon saving the contents?

You are trying to use regex patterns but keep using the string method .Replace() which does not support it. You should be using the -replace operator. That would solve part of your issue.
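The difference between the two can be sketched on a single line of text. The sample line is taken from the question; the variable names are made up for this example.

```powershell
$line = '[assembly: AssemblyVersion("1.0.*")]'

# .Replace() treats its first argument as literal text, so the regex syntax
# below is searched for verbatim, is not found, and nothing changes.
$unchanged = $line.Replace('\d+\.\d+\.\*', '1.0.7.1')

# -replace interprets the pattern as a regular expression.
$viaRegex = $line -replace '(?<=AssemblyVersion\(")[^"]*', '1.0.7.1'

# To match literal text with -replace, escape it first.
$viaEscaped = $line -replace [regex]::Escape('1.0.*'), '1.0.7.1'
```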
Looks like you only want to replace the line that does not have anything else on it besides the assembly info.
$path = "C:\temp\test.txt"
$newVersion = "1.0.0.1"
$pattern = '^\s*?\[assembly: AssemblyVersion\("(.*)"\)\]'
(Get-Content $path) | ForEach-Object {
    if ($_ -match $pattern) {
        # We have found the matching line
        '[assembly: AssemblyVersion("{0}")]' -f $newVersion
    } else {
        # Output line as is
        $_
    }
} | Set-Content $path
That would be a verbose yet simple to follow way to do what you want. It will only match if the assembly line is at the start of the line with optional spaces.
I would expect that the pattern '^\[assembly: AssemblyVersion\("(.*)"\)\]' works just as well since it should appear at the start of the line anyway.
This comes from another answer about almost this exact problem, except that now there is more than one possible match. Also, you will see that my regex pattern isolates the current version in case you need it. If that is the case, look at the linked question.
You can combine this with other options but if you know ahead of time the version you are going to use then the replacement is pretty simple as well.
$newVersion = "1.6.5.6"
(Get-Content $path -Raw) -replace '(?m)^\[assembly: AssemblyVersion\("(.*)"\)\]', ('[assembly: AssemblyVersion("{0}")]' -f $newVersion) | Set-Content $path
That reads the file in as one string and performs the replacement as long as the pattern is at the start of the line in the file. (?m) lets us treat the start of line anchor ^ as something that works at the beginning of lines in the text. Not just the start of the whole string.
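The effect of (?m) can be checked without touching a file. This sketch uses an in-memory copy of the two similar lines from the question; the variable names are made up for this example.

```powershell
# The two near-identical lines from the question, as one multi-line string.
$text = @'
// [assembly: AssemblyVersion("1.0.*")]
[assembly: AssemblyVersion("1.0.*")]
'@

$newVersion  = '1.0.7.1'
$replacement = '[assembly: AssemblyVersion("{0}")]' -f $newVersion

# With (?m), ^ matches at the start of every line, so only the line that
# begins with "[assembly" is replaced; the commented line is left alone.
$updated = $text -replace '(?m)^\[assembly: AssemblyVersion\("(.*)"\)\]', $replacement

$lines = $updated -split '\r?\n'
```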

Accelerate Powershell script runtime

I'm using a PowerShell script which converts a specific log format to a tab- or comma-separated (CSV) format, and it looks like this:
$filename = "filename.log"
foreach ($line in [System.IO.File]::ReadLines($filename)) {
$x = [regex]::Split( $line , 'regex')
$xx = $x -join ","
$xx >> Results.csv
}
And it works fine, but for a 20MB log file it takes almost 20 min to be converted! Is there a way to accelerate it?
My System: CPU: Corei7 3720QM / RAM: 8GB
Update: The log format is like this:
192.168.1.5:24652 172.16.30.8:80 http://www.example.com "useragent"
I want destination format to be:
192.168.1.5,24652,172.16.30.8,80,http://www.example.com,"useragent"
REGEX: ^([\d\.]+):(\d+)\s+([\d\.]+):(\d+)\s+([^ ]*)\s+(\".*\")$

As Lieven Keersmaekers points out, you can do a single -replace operation to do the work.
Additionally, foreach($thing in $o.GetThings()){} will initially block until GetThings() returns and then store the entire result in memory, which you have no need for. You can avoid this by using the pipeline instead.
Finally, your regex can be simplified so that the engine doesn't have to parse the entire string before splitting, by matching on either : preceded by a digit or whitespace:
Get-Content filename.log | ForEach-Object {
    $_ -replace '(?:(?<=\d)\:|\s+)', ','
} | Out-File results.csv
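Applied to the sample line from the question, the single -replace behaves as sketched below. One assumption to be aware of: a URL whose host ends in a digit and carries a port (for example an IP address with `:80`) would also be split at that colon.

```powershell
# Sample line from the question.
$line = '192.168.1.5:24652 172.16.30.8:80 http://www.example.com "useragent"'

# Replace a colon that follows a digit, or any run of whitespace, with a comma.
# The colon in "http:" follows a letter, so it is left alone.
$csvLine = $line -replace '(?:(?<=\d):|\s+)', ','
```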

Multi-Line String to Single-Line String conversion in PowerShell

I have a text file that has multiple 'chunks' of text. These chunks have multiple lines and are separated with a blank line, e.g.:
This is an example line
This is an example line
This is an example line

This is another example line
This is another example line
This is another example line
I need these chunks to be in single-line format e.g.
This is an example lineThis is an example lineThis is an example line
This is another example lineThis is another example lineThis is another example line
I have researched this thoroughly and have only found ways of making whole text files single-line. I need a way (preferably in a loop) of making an array of string chunks single-line. Is there any way of achieving this?
EDIT:
I have edited the example content to make it a little clearer.

# create a temp file that looks like your content
# add the A,B,C,etc to each line so we can see them being joined later
"Axxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Bxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Cxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Dxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Exxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Fxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Gxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Hxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Ixxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" | Set-Content -Path "$($env:TEMP)\JoinChunks.txt"
# read the file content as one big chunk of text (rather than an array of lines)
$textChunk = Get-Content -Path "$($env:TEMP)\JoinChunks.txt" -Raw
# split the text into an array of chunks
# the regex "(?:\r*\n){2,}" means 'split the whole text wherever there are two or more consecutive linefeeds'
# the group is non-capturing because -split would otherwise include the captured separators in its output
$chunksToJoin = $textChunk -split "(?:\r*\n){2,}"
# remove linefeeds for each section and output the contents
$chunksToJoin -replace '\r*\n', ''
# one line equivalent of above
((Get-Content -Path "$($env:TEMP)\JoinChunks.txt" -Raw) -split "(?:\r*\n){2,}") -replace '\r*\n', ''
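The split-then-join steps can also be tried on an in-memory string, which avoids the temp file. This is a minimal sketch; the text is invented, and the group in the split pattern is non-capturing so that the separators themselves do not appear in the result.

```powershell
# Two chunks of three lines each, separated by a blank line (invented data).
$text = "A1`nA2`nA3`n`nB1`nB2`nB3"

# Split wherever two or more line breaks occur in a row.
$chunks = $text -split '(?:\r?\n){2,}'

# Remove the remaining line breaks inside each chunk.
$joined = $chunks -replace '\r?\n', ''
```

`$joined` is an array with one single-line string per chunk, so it can be looped over or written out with Set-Content.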

A bit of a fudge:
[String] $strText = [System.IO.File]::ReadAllText( "c:\temp\test.txt" );
[String[]] $arrLines = ($strText -split "`r`n`r`n").replace("`r`n", "" );
This relies on the file having Windows CRLFs.

There are several ways to approach a task like that. One is to use a regular expression replacement with a negative lookahead assertion:
(Get-Content 'C:\path\to\input.txt' | Out-String) -replace "`r?`n(?!`r?`n)" |
Set-Content 'C:\path\to\output.txt'
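How the negative lookahead behaves can be seen on a short in-memory string. This is a sketch with invented text; it uses the regex escapes `\r` and `\n` in a single-quoted pattern, which match the same characters as the backtick escapes above.

```powershell
# Two chunks separated by a blank line (invented data).
$text = "A1`nA2`n`nB1`nB2"

# Remove every line break that is NOT directly followed by another line break.
# Of the two consecutive breaks between the chunks, the first one survives.
$singleLines = $text -replace '\r?\n(?!\r?\n)'

$result = $singleLines -split '\r?\n'
```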
You could also work with a StreamReader and StreamWriter:
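$reader = New-Object IO.StreamReader 'C:\path\to\input.txt'
$writer = New-Object IO.StreamWriter 'C:\path\to\output.txt'
# Peek() returns -1 once the end of the file is reached.
while ($reader.Peek() -ge 0) {
    $line = $reader.ReadLine()
    if ($line.Trim() -ne '') {
        $writer.Write($line)
    } else {
        $writer.WriteLine()
    }
}
# Close both streams so that buffered output is flushed to disk.
$reader.Close()
$writer.Close()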
