Powershell Trim Space between Sentences


I'm using PowerShell to trim the space between lines of text and need help. I'm reading the values into a variable using Get-Content.
Here is my input data:
04:31 Alex M.O.R.P.H. & Natalie Gioia - My Heaven http://goo.gl/rMOa2q
[ARMADA MUSIC]

12:37 Chakra - Home (Alexander Popov Remix) http://goo.gl/3janGY
[SOUNDPIERCING]
See the blank line between the two songs? I want to eliminate these so that the output is:
04:31 Alex M.O.R.P.H. & Natalie Gioia - My Heaven http://goo.gl/rMOa2q
[ARMADA MUSIC]
12:37 Chakra - Home (Alexander Popov Remix) http://goo.gl/3janGY
[SOUNDPIERCING]

I put the contents in a file called foo.txt.
foreach ($line in get-content foo.txt) {
    if ($line -ne '') {
        $line
    }
}

$noEmptyLines = Get-Content -Path C:\FilePath\File.Txt | Where-Object{$_ -notmatch "^\s*$"}
This gives you a variable from which every empty or whitespace-only line has been removed.

Related

How to remove all characters from multiple filenames after specified characters - Powershell

Forgive me if I explain poorly, I'm very new to PowerShell.
I'm trying to clean up my media files by removing all characters after a specified string from the names of multiple files in a directory and all of its subdirectories.
The filename length will not be consistent. But the file types will. I need to exclude the extension from being removed as well.
So the files would look something like this:
TVshow S01 E01 Title of episode.mp4
LongerTVShow S03 E01 Title of episode.mp4
I want to remove everything after E01, while keeping E01
Result:
TVshow S01 E01.mp4
LongerTVShow S03 E01.mp4
I currently have a few other lines that are cleaning out characters I specify, for example finding periods and replacing them with spaces:
get-childitem -recurse | dir -Filter *.mp4 | Rename-Item -NewName { $_.BaseName.replace('.',' ') + $_.Extension }
That works well, as it will apply to all files in the directory. But you need to specify the character to replace.
I was then just going to use multiple instances of the command for E01, E02, E03 etc., in the same way I remove multiple strings in the code below:
get-childitem -recurse | dir -Filter *.txt | Rename-Item -NewName { $_.BaseName.replace('1080p','').replace('720p','').replace('HD','') + $_.Extension }
I was hoping to use something along the same lines, I've seen suggestions for trim or splitting but I can't seem to figure it out and I haven't been able to find anything.
Thanks for any answers!
Edit
I used the code by AdminOfThings and added that into what I have.
get-childitem -recurse |dir -Filter *.mp4 | Rename-Item -NewName { ($_.BaseName -creplace '(?<=S\d+ E\d+)\D.*') + $_.Extension }
So if anyone needs something like this in future, this will rename any .mp4 files in the directory and all sub directories it's run in. Specifically anything after E01, E02, E03 etc. Resulting in the following:
TvShow S01 E04 title_of_show.mp4
TvShow S08 E03Title_of_show.mp4
into:
TvShow S01 E04.mp4
TvShow S08 E03.mp4
Very specific but someone may find this useful.

You can do the following:
Get-ChildItem -Recurse -Filter *.mp4 -File |
    Rename-Item -NewName { ($_.BaseName -creplace '(?<=S\d+ E\d+)\D.*') + $_.Extension }
Explanation:
The -creplace operator performs a case-sensitive regex match (-replace is case-insensitive) and then replaces the match. Because no replacement string is given, the matched text is replaced with an empty string.
The regex string is (?<=S\d+ E\d+)\D.*.
(?<=) is a positive lookbehind mechanism. This means that the current position in the string must have previous characters that match the lookbehind expression.
S\d+ matches a literal S followed by one or more (+) digits (\d). With the case-insensitive -replace it would match s or S; because we are using -creplace, only an uppercase S matches. The space after \d+ is a literal space.
E\d+ matches literal E and one or more digits.
\D matches a non-digit character. This is needed so that the \d+ in the lookbehind cannot stop short and give back digits: the match has to start on a non-digit, so every digit of the episode number has already been consumed by the lookbehind.
.* matches any characters greedily until the end of the string.
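To see the pattern at work without renaming anything, it can be applied to plain strings first. This is a minimal sketch; the sample base names are hypothetical, modelled on the file names in the question, and the third one is an assumption added to show that names without an episode marker are left alone.

```powershell
# Hypothetical base names (no extension), modelled on the question's examples.
$baseNames = @(
    'TVshow S01 E01 Title of episode'
    'TvShow S08 E03Title_of_show'
    'Movie without episode marker'
)

# Same pattern as in the answer: match everything that follows "S<digits> E<digits>".
$pattern = '(?<=S\d+ E\d+)\D.*'

# -creplace with no replacement string deletes the matched text.
# Applied to an array, the operator processes each element in turn.
$cleaned = $baseNames -creplace $pattern

$cleaned
```

Running the same expression against real files with Rename-Item -WhatIf is a cautious next step before committing to the rename.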

This is long, but how about this:
dir -Filter *.mp4 | Rename-Item -NewName { $_.BaseName.Split(' ')[0] + ' ' + $_.BaseName.Split(' ')[1] + ' ' + $_.BaseName.Split(' ')[2] + $_.Extension }

How to manipulate text in first column of CSV file with script

I have a CSV file with multiple columns of information. I need to remove the opening and closing " around the Employee Name, as well as the comma inside it, as shown below.
Employee Name,Employee #,column3, column4 etc.
"Lastname, Firstname",123,abc,xyz
"Lastname, Firstname",123,abc,xyz
Result:
Employee Name,Employee #,column3, column4 etc.
Lastname Firstname,123,abc,xyz
Lastname Firstname,123,abc,xyz
I tried using the following PowerShell script:
(gc C:\pathtocsv.csv) | % {$_ -replace '"', ""} | out-file C:\pathtocsv.csv -Fo -En ascii
This only removes the quotes around Lastname, Firstname; the comma is still present when I open the CSV file in a text editor. I need this format to send the data to another company. Everything else I have tried removes every comma. I'm a novice in PowerShell and other languages, and I'm sure this is an easy fix. Please help!

PowerShell has a lot of built-in handling for CSV files. Instead of treating the file as plain text, you can use the following to remove just the comma you want:
Import-Csv .\a.csv | % {
    $_."Employee Name" = ($_."Employee Name" -replace ',','')
    $_ # return the modified row
} | Export-Csv .\b.csv -notype -delim ','
By default this exports every field wrapped in double quotes, so you may need to go back and run something like:
(gc .\b.csv -raw) -replace '"','' | Out-File .\c.csv
to remove all the double quotes as well.
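The same two steps can be tried in memory before touching any files. This is a rough sketch that swaps Import-Csv and Export-Csv for ConvertFrom-Csv and ConvertTo-Csv; the sample data is a shortened, assumed version of the data in the question.

```powershell
# Assumed sample data, shortened from the question.
$csvText = @'
Employee Name,Employee #,column3,column4
"Lastname, Firstname",123,abc,xyz
'@

$cleaned = $csvText | ConvertFrom-Csv | ForEach-Object {
    # Remove the comma only inside the Employee Name field.
    $_.'Employee Name' = $_.'Employee Name' -replace ',', ''
    $_
} | ConvertTo-Csv -NoTypeInformation | ForEach-Object {
    # ConvertTo-Csv quotes every field by default; strip the quotes afterwards.
    $_ -replace '"', ''
}

$cleaned
```

Stripping the quotes is only safe here because no remaining field contains a comma or a quote; otherwise the result would no longer be valid CSV.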

Warning: quotes are important if the text contains special characters (e.g. a comma or a quote).
If you really want to strip them, you can process your CSV as a regular text file:
#sample data
@'
"Lastname, Firstname",123,abc,xyz
"Lastname, Firstname",123,abc,xyz
'@ | out-file c:\temp\test.csv
Get-Content c:\temp\test.csv | % {
    $match = [Regex]::Match($_,'"([^,]*), ([^"]*)"(.*)')
    if ($match.Success) {
        $match.Groups[1].Value+' '+$match.Groups[2].Value+$match.Groups[3].Value
    } else {
        $_ # pass the line through unchanged if it does not match the pattern
    }
}

PowerShell Replace Specific String in File When multiple matches exist

Problem
I am trying to modify a file by replacing a very specific substring within a file; however, in this particular instance, the file contains two lines that are nearly identical.
This line is in an AssemblyInfo file and I am trying to replace the value of the AssemblyVersion. See below:
$CurrentVersion = '1.0.*'
$NewVersion = '1.0.7.1'
# Two similar lines:
// [assembly: AssemblyVersion("1.0.*")] # I want to ignore this line
[assembly: AssemblyVersion("1.0.*")] # I want to target this value
I have been trying several different approaches to this, each with varying results.
$Assembly = 'C:\path\to\AssemblyInfo.cs'
$regex = '(?<!\/\/ \[assembly:AssemblyVersion\(")(?<=AssemblyVersion\(")[^"]*'
$regex2 = ('`n\[assembly: AssemblyVersion\("'+$CurrentVersion+'"\)\]')
Attempt 001
(GC $Assembly) |
    ForEach-Object { $_.Replace($CurrentVersion, $NewVersion) } |
    Set-Content $Assembly
This was an obvious failure. It ends up replacing both instances of '1.0.*'
Attempt 002
GC $Assembly |
    Select-String -Pattern '^\[assembly' -AllMatches |
    ForEach-Object { $_.Replace($CurrentVersion, $NewVersion) } |
    Set-Content $Assembly
This ended with incompatible command issues...
Attempt 003
(GC $Assembly) | ForEach-Object {
    If ( $_ -MATCH $CurrentVersion ) {
        ForEach ($Line in $_) {
            $_.Replace($CurrentVersion, $NewVersion)
        }
    }
} |
    Set-Content $Assembly
This ended up removing all lines that contained // as the starting characters... which was not what I wanted...
Attempt 004
GC $Assembly |
    ForEach-Object {
        $_.Replace($regex2, ('[assembly: AssemblyVersion("'+$NewVersion+'")]'))
    } |
    Set-Content $Assembly
I get an error saying the file is in use... but that didn't make sense as I couldn't find anything using it...
I have tried several other paths as well, most variations of the above 4 in hopes of achieving my goal. Even going so far as to target the line using the regex line provided above and variations of it to try and grab
Question
Using PowerShell, how can I replace only the line/value of the target line (ignoring the line that begins with // ), but still keep all the lines in the file upon saving the contents?

You are trying to use regex patterns but keep using the string method .Replace(), which does not support them. You should be using the -replace operator. That would solve part of your issue.
It looks like you only want to replace the line that has nothing on it besides the assembly info.
$path = "C:\temp\test.txt"
$newVersion = "1.0.0.1"
$pattern = '^\s*?\[assembly: AssemblyVersion\("(.*)"\)\]'
(Get-Content $path) | ForEach-Object{
    if($_ -match $pattern){
        # We have found the matching line
        '[assembly: AssemblyVersion("{0}")]' -f $newVersion
    } else {
        # Output line as is
        $_
    }
} | Set-Content $path
That is a verbose yet simple-to-follow way to do what you want. It only matches if the assembly attribute is at the start of the line, optionally preceded by whitespace.
I would expect the pattern '^\[assembly: AssemblyVersion\("(.*)"\)\]' to work just as well, since the attribute should appear at the start of the line anyway.
This comes from another answer about almost this exact problem, except that now there is more than one possible match. You will also see that my regex pattern captures the current version in case you need it. If that is the case, look at the linked question.
You can combine this with other options, but if you know ahead of time which version you are going to use, the replacement is pretty simple as well.
$newVersion = "1.6.5.6"
(Get-Content $path -Raw) -replace '(?m)^\[assembly: AssemblyVersion\("(.*)"\)\]', ('[assembly: AssemblyVersion("{0}")]' -f $newVersion) | Set-Content $path
That reads the file in as one string and performs the replacement wherever the pattern sits at the start of a line in the file. (?m) makes the start-of-line anchor ^ match at the beginning of every line in the text, not just at the start of the whole string.
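The effect of the (?m) anchor can be checked on an in-memory string before running it against a real AssemblyInfo file. This is a small sketch; the two-line sample content and the version number are taken from the question, and the variable names are only illustrative.

```powershell
$newVersion = '1.0.7.1'

# Two nearly identical lines, as in the question: one commented out, one live.
$content = @'
// [assembly: AssemblyVersion("1.0.*")]
[assembly: AssemblyVersion("1.0.*")]
'@

# (?m) lets ^ match at the start of each line, so the commented line
# (which starts with //) is never a candidate.
$pattern = '(?m)^\[assembly: AssemblyVersion\("(.*)"\)\]'
$updated = $content -replace $pattern, ('[assembly: AssemblyVersion("{0}")]' -f $newVersion)

# Split into lines only to inspect the result.
$lines = $updated -split '\r?\n'
$lines
```

Only the second line should change; the commented-out line is passed through untouched because the pattern is anchored to the start of a line.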

Multi-Line String to Single-Line String conversion in PowerShell

I have a text file that has multiple 'chunks' of text. These chunks have multiple lines and are separated with a blank line, e.g.:
This is an example line
This is an example line
This is an example line

This is another example line
This is another example line
This is another example line
I need these chunks to be in single-line format e.g.
This is an example lineThis is an example lineThis is an example line
This is another example lineThis is another example lineThis is another example line
I have researched this thoroughly and have only found ways of making whole text files single-line. I need a way (preferably in a loop) of making an array of string chunks single-line. Is there any way of achieving this?
EDIT:
I have edited the example content to make it a little clearer.

# create a temp file that looks like your content
# add the A,B,C,etc to each line so we can see them being joined later
"Axxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Bxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Cxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Dxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Exxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Fxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Gxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Hxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Ixxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" | Set-Content -Path "$($env:TEMP)\JoinChunks.txt"
# read the file content as one big chunk of text (rather than an array of lines)
$textChunk = Get-Content -Path "$($env:TEMP)\JoinChunks.txt" -Raw
# split the text into an array of chunks
# the regex "(?:\r?\n){2,}" means 'split the whole text wherever there are two or more consecutive line breaks'
# the group is non-capturing because -split would otherwise add the captured line breaks to the result
$chunksToJoin = $textChunk -split "(?:\r?\n){2,}"
# remove the line breaks within each chunk and output the contents
$chunksToJoin -replace '\r?\n', ''
# one line equivalent of above
((Get-Content -Path "$($env:TEMP)\JoinChunks.txt" -Raw) -split "(?:\r?\n){2,}") -replace '\r?\n', ''
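The split-then-replace idea can also be checked on an in-memory string with no temp file involved. This is a minimal sketch with made-up sample text; it uses a non-capturing group in the split pattern, because -split adds the text of capturing groups to its output.

```powershell
# Made-up sample: two chunks of two lines each, separated by one blank line.
$text = "line A1`r`nline A2`r`n`r`nline B1`r`nline B2"

# Split wherever two or more line breaks follow each other (a blank line),
# then remove the remaining single line breaks inside each chunk.
$chunks = ($text -split '(?:\r?\n){2,}') -replace '\r?\n', ''

$chunks
```

Each element of $chunks is one joined chunk, so the result can be looped over or written out with Set-Content, one chunk per line.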

A bit of a fudge:
[String] $strText = [System.IO.File]::ReadAllText( "c:\temp\test.txt" );
[String[]] $arrLines = ($strText -split "`r`n`r`n").replace("`r`n", "" );
This relies on the file having Windows CRLFs.

There are several ways to approach a task like that. One is to use a regular expression replacement with a negative lookahead assertion, which removes every line break that is not followed by another line break:
(Get-Content 'C:\path\to\input.txt' | Out-String) -replace "`r?`n(?!`r?`n)" |
Set-Content 'C:\path\to\output.txt'
You could also work with a StreamReader and StreamWriter:
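$reader = New-Object IO.StreamReader 'C:\path\to\input.txt'
$writer = New-Object IO.StreamWriter 'C:\path\to\output.txt'
# Peek() returns -1 once the end of the file is reached
while ($reader.Peek() -ge 0) {
    $line = $reader.ReadLine()
    if ($line.Trim() -ne '') {
        # text line: append it without a line break
        $writer.Write($line)
    } else {
        # blank line: end the current chunk
        $writer.WriteLine()
    }
}
# close both streams so the output is flushed to disk
$reader.Close()
$writer.Close()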

Using PowerShell, how can I find a specific line in a multi-line variable, then locate the 3rd word or value within the line

I'm using PowerShell on Windows 2012 to scrape values from a 3rd party executable. My goal is to re-arrange and simplify the output of the tool to display only a subset of the content and allow me to collect the data on many devices at once. I have most of the script working, but I'm stuck on a simple task that's so easy in Linux bash.
The 3rd party program is pulling status of a computer device to standard out. I can successfully set the standard out content to a variable. For example:
PS C:\> $status = mycmd.exe device1 --status
PS C:\> $status
The $status variable would return a multi-line list of values as follows:
Device Number: 1
PCIe slot: 3
Firmware Version: 5.1.4
Temperature: 45C
State: Online
In this case, I would like to create a new variable for the firmware version. In Linux I would use something like this (although there are many options available):
Firmware=$(mycmd device1 --status | grep "Firmware" | cut -c 19-24)
In PowerShell I can use the Select-String command to find the "firmware" pattern and set it to a variable as follows:
$Firmware = $Status | select-string -Pattern "Firmware Version"
This gives me the entire Firmware Version line. I can't figure out how to extract just the version number from the line, as PowerShell seems to only want to manipulate the item I'm searching for and not the content next to the pattern. Since this isn't a built-in command that returns objects with named properties, manipulating the text seems much more difficult.
I would like the $Firmware variable to equal "5.1.4" or it could be placed into another variable if necessary.

$Firmware = ($Status | Select-String -Pattern '\d{1}\.\d{1,2}\.\d{1,2}' -AllMatches | % { $_.Matches } | % { $_.Value })

$Firmware = ($Status | Select-String -Pattern '(?<=Firmware Version:\s+)[\d.]+').Matches.Value
The regular expression here looks for a run of one or more digits (\d) and literal dots that is preceded by the Firmware Version: label and the whitespace after it.
Note that Select-String returns an object, so we use .Matches.Value to get the actual match (which in this case will only be the number).
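A quick way to try this without the third-party tool is to fake its output as an array of lines, which is how PowerShell captures the standard output of an external command. In this sketch the lines are copied from the question; mycmd.exe itself is not called.

```powershell
# Stand-in for: $Status = mycmd.exe device1 --status
$Status = @(
    'Device Number: 1'
    'PCIe slot: 3'
    'Firmware Version: 5.1.4'
    'Temperature: 45C'
    'State: Online'
)

# The lookbehind anchors the match to the text after the label,
# so only the version number itself ends up in .Matches.Value.
$Firmware = ($Status | Select-String -Pattern '(?<=Firmware Version:\s+)[\d.]+').Matches.Value

$Firmware
```

If no line matches, Select-String returns nothing and $Firmware ends up empty, so a check for an empty value is worth adding when this runs against many devices.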

Using -replace with a multi-line regex:
$Var = @'
Device Number: 1
PCIe slot: 3
Firmware Version: 5.1.4
Temperature: 45C
State: Online
'@
$Firmware = $var -replace '(?ms).+^Firmware Version:\s+([0-9.]+).+','$1'
$Firmware
5.1.4

This works. I used a [system.version] cast to parse the version number (which also removes the spaces).
$FirmwareLine = $Status | select-string -Pattern "Firmware Version"| select -expand line
[system.version]$firmware=$firmwareLine -split ":" | select -first 1 -skip 1
If you just need the string, you can remove the cast and trim the leading space instead:
$firmware=($firmwareLine -split ":" | select -first 1 -skip 1).Trim()

Another option is to replace the colons with '=', converting the output to key=value pairs and then turn that into a hash table (and then to a PS Object if you want) using ConvertFrom-StringData:
$Var = @'
Device Number: 1
PCIe slot: 3
Firmware Version: 5.1.4
Temperature: 45C
State: Online
'@
$DeviceStatus = New-object PSObject -Property $(ConvertFrom-StringData $var.Replace(':','='))
$DeviceStatus.'Firmware Version'
5.1.4
