Replacing a set of strings containing any character - string

Is there a way to replace a set of strings no matter what the strings contain?
I am trying to replace one string containing: quotes(""), brackets([]), #, e.
gci C:\test *.txt -recurse | ForEach {(Get-Content $_ | ForEach {$_ -replace '"my"', "money"}) | Set-Content $_ }
but what if a string I want to replace has EVERYTHING in <>:
PowerPlayReport Product_version="10.2.6100.36" xmlns="http://www.cognos.com/powerplay/report[1234#1]" Author="PPWIN" Version="4.0"

So you want to replace everything in [] in the sample text you included in your question. If you were not aware (although I think you are now), -replace supports regular expressions. A simple regex can find the text you are looking for. I am also going to remove some of the redundancy in your code.
Get-ChildItem C:\test -Filter *.txt -Recurse | ForEach-Object{
$file = $_.FullName
(Get-Content $file) -replace "\[.*?]","[bagel]" | Set-Content $file
}
Explanation borrowed from regex101.com
\[ matches the character [ literally
.*? matches any character (except newline). Quantifier: *? Between zero and unlimited times, as few times as possible, expanding as needed [lazy]
] matches the character ] literally
So that line would then appear as the following inside the source file.
PowerPlayReport Product_version="10.2.6100.36" xmlns="http://www.cognos.com/powerplay/report[bagel]" Author="PPWIN" Version="4.0"
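To see what the lazy quantifier changes, here is a minimal sketch on a made-up line with two bracketed sections. The sample text and the variable names are assumptions for illustration only; they are not from the question.

```powershell
# Hypothetical sample line with two bracketed sections
$line = 'report[1234#1] and backup[5678#2]'

# Lazy (.*?): each [...] section is matched and replaced separately
$lazy = $line -replace '\[.*?]', '[bagel]'

# Greedy (.*): the match runs from the first [ to the last ] on the line,
# so everything in between is replaced in one go
$greedy = $line -replace '\[.*]', '[bagel]'

$lazy    # report[bagel] and backup[bagel]
$greedy  # report[bagel]
```

With only one bracketed section per line, as in the question's sample, both patterns give the same result; the lazy form is simply the safer default.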

Related

Manipulate strings in a txt file with Powershell - Search for words in each line containing "." and save them in new txt file

I have a text file with different entries. I want to manipulate it to filter out always the word, containing a dot (using Powershell)
$file = "C:\Users\test_folder\test.txt"
Get-Content $file
Output:
Compass Zype.Compass 1.1.0 thisisaword
Pomodoro Logger zxch3n.PomodoroLogger 0.6.3 thisisaword
......
......
......
Bla Word Program.Name 1.1.1 this is another entry
As you can see, in all lines, the "second" "word" contains a dot, like "Program.Name".
I want to create a new file, which contains just those words, each line one word.
So my file should look something like:
Zype.Compass
zxch3n.PomodoroLogger
Program.Name
What I have tried so far:
Clear-Host
$folder = "C:\Users\test_folder"
$file = "C:\Users\test_folder\test.txt"
$content_txtfile = Get-Content $file
foreach ($line in $content_textfile)
{
if ($line -like "*.*"){
$line | Out-File "$folder\test_filtered.txt"
}
}
But my output is not what I want.
I hope you get what my problem is.
Thanks in advance! :)
Here is a solution using Select-String to find sub strings by RegEx pattern:
(Select-String -Path $file -Pattern '\w+\.\w+').Matches.Value |
Set-Content "$folder\test_filtered.txt"
You can find an explanation and the ability to experiment with the RegEx pattern at RegEx101.
Note that while the RegEx101 demo also shows matches for the version numbers, Select-String gives you only the first match per line (unless argument -AllMatches is passed).
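The difference between the default behaviour and -AllMatches can be checked on a single line without touching any file. This is only a sketch; the line is copied from the question and the variable names are made up.

```powershell
# Sample line taken from the question
$line = 'Compass Zype.Compass 1.1.0 thisisaword'

# Default: only the first match on the line is reported
$first = ($line | Select-String -Pattern '\w+\.\w+').Matches.Value

# -AllMatches: every match on the line, including part of the version number
$all = ($line | Select-String -Pattern '\w+\.\w+' -AllMatches).Matches.Value

$first           # Zype.Compass
$all -join ', '  # Zype.Compass, 1.1
```

For this file layout the default is what you want, because the dotted name always comes before the version number.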
This looks like fixed-width fields, and if so you can reduce it to this:
Get-Content $file | # Read the file
%{ $_.Substring(29,36).Trim()} | # Extract the column
?{ $_.Contains(".") } | # Filter for values with "."
Set-Content "$folder\test_filtered.txt" # Write result
Get-Content is slow, and -like is sometimes slower than -match. I prefer -match, but some prefer -like.
$filename = "c:\path\to\file.txt"
$output = "c:\path\to\output.txt"
foreach ($line in [System.IO.File]::ReadLines($filename)) {
if ($line -match "\.") {
$line | out-file $output -append
}
}
Otherwise for a shorter option, maybe
$filename = "c:\path\to\file.txt"
$output = "c:\path\to\output.txt"
Get-Content $filename | Where-Object { $_ -match "\." } | Out-File $output
To restrict the match to the first column, either name the column (which you don't do here) or use a different search pattern.
"\." matches a period anywhere in the line.
If the period is always the first character, anchor the pattern to the beginning of the line:
"^\." matches only when the first character is a period.
If the period always comes before the first tab, match "anything except a tab, then a period, then anything except a tab":
"^[^\t]*\.[^\t]*" means: starting at the beginning of the line, any number of non-tab characters, then a period, then any number of non-tab characters.

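The three -match patterns from the last answer above can be compared side by side on a few in-memory lines. This is a rough sketch; the tab-separated sample lines are invented for illustration and are not the asker's data.

```powershell
# Hypothetical tab-separated lines (`t is a tab inside double quotes)
$lines = @(
    "Compass`tZype.Compass`t1.1.0"
    ".hidden`tvalue`tx"
    "Pro.Name`tplain`tx"
    "NoDots`tHere`tAtAll"
)

# A period anywhere in the line
$anyDot = @($lines -match '\.')

# A period as the very first character
$leadDot = @($lines -match '^\.')

# A period before the first tab, i.e. inside the first column
$firstCol = @($lines -match '^[^\t]*\.[^\t]*')

$anyDot.Count    # 3
$leadDot.Count   # 1
$firstCol.Count  # 2
```

When -match is applied to an array it returns the matching elements rather than a Boolean, which is why the results are wrapped in @() and counted.
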
Powershell: Replace string in File1 based on string in File2

I am being forced to use Powershell because of my work. I have used it to do a couple of things but one of my codes is now trash because I have to update a string in a file to include a year that is in a second file. Here is what I'm working with:
File1: Contains a few strings, among them 48 strings that say:
Jenga_Sequence-XXXX.consensus_Bob_0.6_quality_20
The main point of the string is Sequence-XXXX, sorry for the random place holders.
File2: is a table that has the strings:
John/USA/Sequence-XXXX/Year
I need to replace the strings in File1 with the corresponding Strings in File2.
Sample Text of File1:
Jenga_Sequence-0001.consensus_Bob_0.6_quality_20
AAAAAAAAAAAAAAAAAAAAAAAAA
Jenga_Sequence-0002.consensus_Bob_0.6_quality_20
aaaaaaaaaaaaaaaaaaaaaaaaa
Jenga_Sequence-0003.consensus_Bob_0.6_quality_20
bbbbbbbbbbbbbbbbbbbbbbbbb
Jenga_Sequence-0004.consensus_Bob_0.6_quality_20
BBBBBBBBBBBBBBBBBBBBBBBBB
Jenga_Sequence-0005.consensus_Bob_0.6_quality_20
QQQQQQQQQQQQQQQQQQQQQ
Sample Table of File2:
|Sequence_ID|Date|
|---------------------------|----------|
|John/USA/Sequence-0003/2020|10/11/2020|
|John/USA/Sequence-0001/2021|1/5/2021|
|John/USA/Sequence-0005/2021|1/10/2021|
|John/USA/Sequence-0004/2020|12/23/2020|
|John/USA/Sequence-0002/2021|1/6/2021|
So, I need a Powershell code that replaces
Jenga_Sequence-0001.consensus_Bob_0.6_quality_20 with John/USA/Sequence-0001/2021,
Jenga_Sequence-0002.consensus_Bob_0.6_quality_20 with John/USA/Sequence-0002/2021,
Jenga_Sequence-0003.consensus_Bob_0.6_quality_20 with John/USA/Sequence-0003/2020, and so on. There are typically 48 of these in a file.
My previous code simply replaced "Jenga_" with "John/USA/" and ".consensus_Bob_0.6_quality_20" with "/2020" but now that we are seeing "/2021" the static code will not work.
I am still open to replacing pieces of the string and having a code that sets the year replacement to the correct year.
That was the angle I was doing a broad search on but I could never find anything specific enough to help.
Any help will be appreciated!
EDIT: Here is the part of my previous code that dealt with the finding and replacing, even though I feel it needs to be trashed:
$filePath = 'Jenga_Combined.txt'
$tempFilePath = "$env:TEMP\$($filePath | Split-Path -Leaf)"
$find = 'Jenga_'
$replace = 'John/USA/'
$find2 = '.consensus_Bob_0.6_quality_20'
$replace2 = '/2020'
(Get-Content -Path $filePath) -replace $find, $replace -replace $find2, $replace2 | Add-Content -Path $tempFilePath
Remove-Item -Path $filePath
Move-Item -Path $tempFilePath -Destination $filePath
EDIT2: The "Real Data" from file2. File2 is a Tab Delimited .txt file which makes it not "look great" when copy and pasting. Hopefully this helps. File1 is exactly like above (although the AAAAA stuff is roughly 30,000 letters long)
Sequence_ID date
John/USA/Sequence-0003/2020 2020-10-11
John/USA/Sequence-0001/2021 2021-01-05
John/USA/Sequence-0005/2021 2021-01-10
John/USA/Sequence-0004/2020 2020-12-23
John/USA/Sequence-0002/2021 2021-01-06
Dan
The common factor here is the Sequence_ID number in both files.
You can do this like:
$csvData = Import-Csv -Path 'D:\Test\File2.txt' -Delimiter "`t"
$result = switch -Regex -File 'D:\Test\Jenga_Combined.txt' {
'^Jenga_Sequence-(\d+).*' {
$replace = $csvData | Where-Object { $_.Sequence_ID -like "*Sequence-$($matches[1])*" }
if (!$replace) { Write-Warning "No corresponding Sequence_ID $($matches[1]) found!"; $_ }
else { $replace.Sequence_ID }
}
default { $_ }
}
# output on screen
$result
# output to new file
$result | Set-Content -Path 'D:\Test\Jenga_Combined_NEW.txt' -Force
Output on screen:
John/USA/Sequence-0001/2021
AAAAAAAAAAAAAAAAAAAAAAAAA
John/USA/Sequence-0002/2021
aaaaaaaaaaaaaaaaaaaaaaaaa
John/USA/Sequence-0003/2020
bbbbbbbbbbbbbbbbbbbbbbbbb
John/USA/Sequence-0004/2020
BBBBBBBBBBBBBBBBBBBBBBBBB
John/USA/Sequence-0005/2021
QQQQQQQQQQQQQQQQQQQQQ
Of course, you need to change the file paths to match your environment
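If the table grows large, one possible variation is to index it once in a hashtable keyed by the sequence number, so each header line becomes a direct lookup instead of a Where-Object scan. The sketch below works on in-memory copies of the sample data; the variable names and the unmatched 0009 line are assumptions added for illustration.

```powershell
# In-memory stand-in for the tab-delimited File2 (`t is a tab)
$rows = @(
    "Sequence_ID`tdate"
    "John/USA/Sequence-0003/2020`t2020-10-11"
    "John/USA/Sequence-0001/2021`t2021-01-05"
) | ConvertFrom-Csv -Delimiter "`t"

# Build the lookup: sequence number -> full Sequence_ID
$lookup = @{}
foreach ($row in $rows) {
    if ($row.Sequence_ID -match 'Sequence-(\d+)') {
        $lookup[$Matches[1]] = $row.Sequence_ID
    }
}

# In-memory stand-in for File1; 0009 has no entry in the table
$lines = @(
    'Jenga_Sequence-0001.consensus_Bob_0.6_quality_20'
    'AAAAAAAA'
    'Jenga_Sequence-0003.consensus_Bob_0.6_quality_20'
    'bbbbbbbb'
    'Jenga_Sequence-0009.consensus_Bob_0.6_quality_20'
)

# Replace header lines that have a table entry; pass everything else through
$result = foreach ($line in $lines) {
    if ($line -match '^Jenga_Sequence-(\d+)' -and $lookup.ContainsKey($Matches[1])) {
        $lookup[$Matches[1]]
    }
    else {
        $line
    }
}

$result
```

Header lines without a table entry are left unchanged here; whether to warn about them, as the answer above does, is a design choice.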

PowerShell: How to find a word within a line of text and modify the data behind it

I'm looking for a way to find a word with a value behind it in a piece of text and then update the value.
Example:
In the file there are multiple occurrences of 'schema="anon" maxFileSize="??????" maxBufferSize="123"'
I want to find all the lines containing maxFileSize and then update the unknown value ?????? to 123456.
So far, I came up with this:
cls
$Files = "C:\temp1\file.config","C:\temp2\file.config"
$newMaxFileSize = "123456"
ForEach ($File in $Files) {
If ((Test-Path $File -PathType Leaf) -eq $False) {
Write-Host "File $File doesn't exist"
} Else {
# Check content of file and select all lines where maxFileSize isn't equal to 123456 yet
$Result = Get-Content $File | Select-String -Pattern "maxFileSize" -AllMatches | Select-String -Pattern "123456" -NotMatch -AllMatches
Write-Host $Result
<#
ROUTINE TO UPDATE THE SIZE
#>
}
}
Yet, I have no clue how to find the word "maxFileSize", let alone how to update the value behind it...
Assuming the input file is actually XML, use the following XPath expression to locate all nodes that have a maxFileSize attribute (regardless of value):
# Parse input file as XML
$configXml = [xml](Get-Content $file)
# Use Select-Xml to find relevant nodes
$configXml |Select-Xml '//*[@maxFileSize]' |ForEach-Object {
# Update maxFileSize attribute value
$_.Node.SetAttribute('maxFileSize','123456')
}
# Overwrite original file with updated XML
$configXml.Save($file)
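The XML approach can be tried without a file on a small in-memory document. This is a sketch only: the element names below are invented, since the real config layout is not shown in the question, and it uses the .NET SelectNodes method with an XPath that selects any element carrying a maxFileSize attribute.

```powershell
# Hypothetical config content; the real file's structure is unknown
$configXml = [xml]'<configuration><log schema="anon" maxFileSize="1024" maxBufferSize="123" /><other name="x" /></configuration>'

# Select every element that has a maxFileSize attribute and update it
foreach ($node in $configXml.SelectNodes('//*[@maxFileSize]')) {
    $node.SetAttribute('maxFileSize', '123456')
}

# Show the updated document; only the maxFileSize value should differ
$configXml.OuterXml
```

Elements without the attribute are not selected, so SetAttribute never adds maxFileSize where it was absent.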
If the config file is some archaic format for which no readily available parser exists, use the -replace operator to update the value where appropriate:
$Results = @(Get-Content $File) -creplace '(?<=maxFileSize=")[^"]*(?=")','123456'
The pattern used above, (?<=maxFileSize=")[^"]*(?="), describes:
(?<= # Positive look-behind assertion, this pattern MUST precede the match
maxFileSize=" # literal string `maxFileSize="`
) # Close look-behind
[^"]* # Match 0 or more non-" characters
(?= # Positive look-ahead assertion, this pattern MUST succeed the match
" # literal string `"`
) # Close look-ahead
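As a quick check, the look-around pattern can be applied to one sample line. The line is modelled on the question's example with a made-up value in place of the unknown one.

```powershell
# Sample line modelled on the question; 987654 is a made-up value
$line = 'schema="anon" maxFileSize="987654" maxBufferSize="123"'

# Only the text between the quotes after maxFileSize= is replaced
$updated = $line -creplace '(?<=maxFileSize=")[^"]*(?=")', '123456'
$updated   # schema="anon" maxFileSize="123456" maxBufferSize="123"

# -creplace is case-sensitive, so a differently cased name is left alone
$other = 'MAXFILESIZE="1"' -creplace '(?<=maxFileSize=")[^"]*(?=")', '123456'
$other     # MAXFILESIZE="1"
```

Because the look-behind and look-ahead are not part of the match, the attribute name and the surrounding quotes stay in place and only the value changes.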

Extract substrings where match is found

I have a text file with a number of lines. I would like to search each line individually for a particular pattern and, if that pattern is found output a substring at a particular position relative to where the pattern was found.
i.e. if a line contains the pattern at position 20, I would like to output the substring that begins at position 25 on the same line and lasts for five characters.
The following code will output every line that contains the pattern:
select-string -path C:\Scripts\trimatrima\DEBUG.txt -pattern $PATTERN
Where do I go from here?
You can use the $Matches automatic variable:
Last match is stored in $Matches[0], but you can also use named capture groups, like this:
"test","fest","blah" |ForEach-Object {
if($_ -match "^[bf](?<groupName>es|la).$"){
$Matches["groupName"]
}
}
returns es (from "fest") and la (from "blah")
Couple of options.
Keeping Select-String, you'll want to use the .line property to get your substrings:
select-string -path C:\Scripts\trimatrima\DEBUG.txt -pattern $PATTERN |
foreach { $_.line.Substring(19,5) }
For large files, Get-Content with -ReadCount and -match may be faster:
Get-Content C:\Scripts\trimatrima\DEBUG.txt -ReadCount 1000 |
foreach {
$_ -match $pattern |
foreach { $_.substring(19,5) }
}
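If the pattern can appear at a different position on each line, the offset can be taken from the match itself rather than hard-coded. The following is a sketch with invented sample lines and an invented pattern; it uses the Index property of the first match that Select-String reports for each line.

```powershell
# Hypothetical pattern and sample lines for illustration
$pattern = 'ERR'
$lines = @(
    '0123456789ERR--ABCDEfoo'
    'no match here'
    'xxERR--FGHIJ'
)

$extracted = $lines | Select-String -Pattern $pattern | ForEach-Object {
    # Start 5 characters after the position where the pattern was found
    $start = $_.Matches[0].Index + 5
    # Skip lines too short to hold the 5-character substring
    if ($start + 5 -le $_.Line.Length) {
        $_.Line.Substring($start, 5)
    }
}

$extracted   # ABCDE, FGHIJ
```

Index is zero-based, like Substring, so no conversion is needed between the two.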

Passing string included : signs to -Replace Variable in powershell script

$FilePath = 'Z:\next\ResourcesConfiguration.config'
$oldString = 'Z:\next\Core\Resources\'
$NewString = 'G:\PublishDir\next\Core\Resources\'
Any idea how to replace a string that has a : sign in it? I want to change the path in a config file. Simple code is not working for this. I tried the following:
(Get-Content $original_file) | Foreach-Object {
$_ -replace $oldString, $NewString
} | Set-Content $destination_file
The -replace operator takes a regular expression pattern, and '\' has a special meaning in regex: it's the escape character. You need to double each backslash or, better, use the Escape method:
$_ -replace [regex]::escape($oldString), $NewString
Alternatively, you can use the String.Replace method, which takes a literal string and doesn't need any special care:
$_.Replace($oldString,$NewString)
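Both options can be compared on a single line. In this sketch the two path variables come from the question, while the config line itself is a made-up example.

```powershell
$oldString = 'Z:\next\Core\Resources\'
$newString = 'G:\PublishDir\next\Core\Resources\'

# Hypothetical line from the config file
$line = '<add key="path" value="Z:\next\Core\Resources\strings.resx" />'

# Regex-based: escape the search text so backslashes are taken literally
$viaRegex = $line -replace [regex]::Escape($oldString), $newString

# Literal: String.Replace does no pattern interpretation at all
$viaMethod = $line.Replace($oldString, $newString)

$viaRegex   # <add key="path" value="G:\PublishDir\next\Core\Resources\strings.resx" />
$viaMethod  # same result
```

One difference to keep in mind: -replace is case-insensitive by default, while the Replace method shown here is case-sensitive.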
Try this,
$oldString = [REGEX]::ESCAPE('Z:\next\Core\Resources\')
You need to escape the pattern to search for.
This works:
$Source = 'Z:\Next\ResourceConfiguration.config'
$Dest = 'G:\PublishDir\next\ResourceConfiguration.config'
$RegEx = "Z:\\next\\Core\\Resources"
$Replace = 'G:\PublishDir\next\Core\Resources'
(Get-Content $Source) | Foreach-Object { $_ -replace $RegEx,$Replace } | Set-Content $Dest
The reason that your attempt wasn't working is that -replace expects its first parameter to be a regular expression. Simply put, you needed to escape the backslashes in the directory path, which is done by adding an additional backslash (\\). It expects the second parameter to be a string, so no changes are needed there.
