Select multiple substrings from a multiline string - string

I'm attempting to create a PowerShell script to pick particular lines from large log files. Using Select-String, I've gotten the data down to just the rows I need in a multiline string. Now I'd like to further massage it to return only the ID numbers from those rows, in a single, comma-delimited string.
Current code:
if (Select-String $SearchFile -Pattern $SearchString -Quiet) {
Write-Host "Error message found"
$body += Select-String $SearchFile -Pattern $SearchString -Context 1,0 |
foreach {$_.Context.DisplayPreContext} | Out-String
Send-MailMessage (email_info_here) -Body $body
} else {
Write-Host "No errors found"
}
Currently returning the following string:
INFO | Creating Batch for 197988 | 03/24/2016 02:10 AM
INFO | Creating Batch for 202414 | 03/24/2016 02:10 AM
INFO | Creating Batch for 173447 | 03/24/2016 02:10 AM
Would like to get the output to the format:
197988, 202414, 173447

If $body contains those lines, you just need to split each line and index into the column that holds the ID.
$body | ForEach-Object {$psitem.Split()[5]}
197988
202414
173447
In this sample, we call ForEach-Object to run a small script block on each line. Inside it, we call the line's .Split() method, which splits on whitespace when called without arguments. Then we index into the result with [5]; indexes are zero-based, so this is the sixth token, which holds the ID.
Assuming you want to save the lines back into $body again, just add $body = to the front of line 1.
Edit: Multi-line-string vs. Array
In the original post, the $body variable was created with Out-String as the last command in the pipeline. This would make it a single multi-line string. Leaving out the | Out-String part would make $body an array of strings. The latter (an array) is easier to work with and is what the answer above assumes, since it is easy to loop through each line in the array with foreach.
Converting between the two is done like this:
$string = @"
INFO | Creating Batch for 197988 | 03/24/2016 02:10 AM
INFO | Creating Batch for 202414 | 03/24/2016 02:10 AM
INFO | Creating Batch for 173447 | 03/24/2016 02:10 AM
"@
$array = @(
"INFO | Creating Batch for 197988 | 03/24/2016 02:10 AM"
"INFO | Creating Batch for 202414 | 03/24/2016 02:10 AM"
"INFO | Creating Batch for 173447 | 03/24/2016 02:10 AM"
)
$array_from_string = $string -split("`n")
$string_from_array = $array | Out-String
In order for the answer to work you need to make sure that $body is an array, else you will only get one ID number:
$string | Foreach-Object {$psitem.Split()[5]}
197988
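Putting the pieces above together to get the single comma-delimited string asked for in the question, here is a minimal sketch. It assumes the data arrives as one multi-line string (as Out-String would produce); the sample text and the variable names $lines and $ids are only for illustration.

```powershell
# Sample multi-line string standing in for the Out-String result
$string = @"
INFO | Creating Batch for 197988 | 03/24/2016 02:10 AM
INFO | Creating Batch for 202414 | 03/24/2016 02:10 AM
INFO | Creating Batch for 173447 | 03/24/2016 02:10 AM
"@

# Split into lines (tolerating CRLF or LF) and drop empty lines,
# such as the trailing one that Out-String appends
$lines = $string -split '\r?\n' | Where-Object { $_.Trim() }

# Take the sixth whitespace-separated token of each line and join the results
$ids = ($lines | ForEach-Object { $_.Split()[5] }) -join ', '
$ids   # 197988, 202414, 173447
```

If $body is already an array of lines, the -split step can be skipped and the last two statements applied to it directly.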

Replace Out-String with a Where-Object filter that matches the number part of each result line, extract the number submatch, and join the result:
$body += (Select-String $SearchFile -Pattern $SearchString -Context 1,0 |
ForEach-Object { $_.Context.DisplayPreContext } |
Where-Object { $_ -match 'for (\d+) \|' } |
ForEach-Object { $matches[1] }) -join ', '
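As a variation on the same idea, the pattern can also be applied to a multi-line string in one go with the .NET [regex]::Matches() method, which returns every match rather than just the first. This is only a sketch with sample data; $text and $ids are illustrative names.

```powershell
# Sample multi-line string (what Out-String would have produced)
$text = @"
INFO | Creating Batch for 197988 | 03/24/2016 02:10 AM
INFO | Creating Batch for 202414 | 03/24/2016 02:10 AM
INFO | Creating Batch for 173447 | 03/24/2016 02:10 AM
"@

# Find every 'for <digits> |' occurrence and keep only capture group 1
$ids = [regex]::Matches($text, 'for (\d+) \|') |
    ForEach-Object { $_.Groups[1].Value }

$ids -join ', '   # 197988, 202414, 173447
```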

This might be a dirty way to do it, but it works:
#This can also be your variable
$log = gc "C:\[log path here]"
#Remove beginning of string up to ID
$log = $log -replace '(.*?)for ' , ""
#Select first 6 characters since all IDs shown are 6 characters
$logIDs = @()
foreach($line in $log){
$logIDs += $line.substring(0,6)
}
### At this point $logIDs contains all IDs, now we just need to comma separate them ###
$count = 1
foreach($ID in $logIDs){
if($count -eq $logIDs.count){
$result += $ID
}
else{
$result += $ID+", "
$count++
}
}
#Here are your results separated by commas
$result
Hope this helps, let me know if you need any type of variation.

Related

Write a script to get 10 longest words chart and put them in separate file

I need to sort the words in a text file and output them to a file
Function AnalyseTo-Doc{
param ([Parameter(Mandatory=$true)][string]$Pad )
$Lines = Select-String -Path $Pad -Pattern '\b[A-Za-zA-Яа-я]{2,}\b' -AllMatches
$Words = ForEach($Line in $Lines){
ForEach($Match in $Line.Matches){
[PSCustomObject]@{
LineNumber = $Line.LineNumber
Word = $Match.Value
}
}
}
$Words | Group-Object Word | ForEach-Object {
[PSCustomObject]@{
Count= $_.Count
Word = $_.Name
Longest= $_.Lenght
}
}
| Sort-Object -Property Count | Select-Object -Last 10
}
AnalyseTo-Doc 1.txt
#Get-Content 1.txt | Sort-Bubble -Verbose | Write-Host Sorted Array: | Select-Object -Last 10 | Out-File .\dz11-11.txt
It doesn't work.
Sort by the Longest property (consider renaming it to Length), which is intended to contain the word length but must be redefined as $_.Group[0].Word.Length (tip of the hat to Daniel):
$Words | Group-Object Word | ForEach-Object {
[PSCustomObject]@{
Count= $_.Count
Word = $_.Name
Longest = $_.Group[0].Word.Length
}
} |
Sort-Object -Descending -Property Longest |
Select-Object -First 10
Note that, for conceptual clarity, I've used -Descending to sort by longest words first, which then requires -First 10 instead of -Last 10 to get the top 10.
As for what you tried:
Sorting by the Count property sorts by frequency of occurrence instead, i.e. by how often each word appears in the input file, due to use of Group-Object.
Longest= $_.Length (note that your code had a typo there) accesses the length property of each group object, which is an instance of Microsoft.PowerShell.Commands.GroupInfo, not that of the word being grouped by.
(Such a GroupInfo instance has no type-native .Length property, but PowerShell automatically provides one as an intrinsic member, in the interest of unified handling of collections and scalars. Since a group object itself is considered a scalar (single object), .Length returns 1. PowerShell also provides .Count with the same value, unless the property is present type-natively, which is indeed the case here: a GroupInfo object's type-native .Count property returns the count of elements in the group.)
The [pscustomobject] instances wrapping the word at hand are stored in the .Group property, and since they're all the same here, .Group[0].Word.Length can be used to return the length of the word at hand.
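To make the difference between the three properties concrete, here is a small sketch on made-up input (the word list and the variable names are hypothetical):

```powershell
# Hypothetical word list wrapped in objects, as in the function above
$words = 'apple', 'fig', 'apple', 'banana' |
    ForEach-Object { [pscustomobject]@{ Word = $_ } }

# Pick the group for the word 'apple'
$group = $words | Group-Object Word | Where-Object Name -eq 'apple'

$group.Count                  # 2: type-native count of elements in the group
$group.Length                 # 1: intrinsic member, the group itself is a scalar
$group.Group[0].Word.Length   # 5: length of the word being grouped
```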
Function AnalyseTo-Doc{
param ([Parameter(Mandatory=$true)][string]$Pad )
$Lines = Select-String -Path $Pad -Pattern '\b[A-Za-zA-Яа-я]{2,}\b' -AllMatches
$Words = ForEach($Line in $Lines){
ForEach($Match in $Line.Matches){
[PSCustomObject]@{
LineNumber = $Line.LineNumber
Word = $Match.Value
}
}
}
$Words | Group-Object Word | ForEach-Object {
[PSCustomObject]@{
#Count= $_.Count
Word = $_.Name
Longest = $_.Group[0].Word.Length
}
} |
Sort-Object -Descending -Property Longest | Select-Object -First 10 | Out-File .\dz11-11.txt
}
AnalyseTo-Doc 1.txt

List down column headers and get the maximum length of string per column

I'm looking for a translation of my Excel formula into a script in PowerShell, VBScript or Excel VBA. I'm trying to get the list of column headers and the maximum length of the strings under each one.
Normally, I open the .txt file manually in Excel, where I can get the header names. Next, I create an array formula, for example =MAX(LEN(A1:A100000)). This gets the maximum length of the strings in the column. I then use the same formula for the other columns.
Right now I can't do this since the files have grown to 1 GB in size and I can't open them anymore; my desktop crashes. It may also be because they have more than 1 million rows, which Excel can't handle. My friend suggested PowerShell, but I have limited knowledge there. I don't know if it can be done in VBScript or Excel VBA.
Thanks in advance for your help.
Below code works for .csv files but does not with .txt delimited files -
$fileName = "C:\Desktop\EFile.csv"
<#
Sample format of c:\temp\data.csv
"id","name","grade","address"
"1","John","Grade-9","test1"
"2","Ben","Grade-9","test12222"
"3","Cathy","Grade-9","test134343"
#>
$colCount = (Import-Csv $fileName | Get-Member | Where-Object {$_.MemberType -eq 'NoteProperty'} | Measure-Object).Count
$csv = Import-Csv $fileName
$csvHeaders = ($csv | Get-Member -MemberType NoteProperty).name
$dict = @{}
foreach($header in $csvHeaders) {
$dict.Add($header,0)
}
foreach($row in $csv)
{
foreach($header in $csvHeaders)
{
if($dict[$header] -le ($row.$header).Length)
{
$dict[$header] =($row.$header).Length
}
}
}
$dict.Keys | % { "key = $_ , Column Length = " + $dict.Item($_) }
This is how I get my data.
$data = @"
"id","name","grade","address"
"1","John","Grade-9","test1"
"2","Ben","Grade-9","test12222"
"3","Cathy","Grade-9","test134343"
"@
$csv = ConvertFrom-Csv -Delimiter ',' $data
But you should get your data like this
$fileName = "C:\Desktop\EFile.csv"
$csv = Import-Csv -Path $fileName
And then
# Extract the header names
$headers = $csv | Get-Member -MemberType NoteProperty | Select-Object -ExpandProperty Name
# Capture output in $result variable
$result = foreach($header in $headers) {
# Select all items in $header column, find the longest, and select the item for output
$maximum = $csv | Select-Object -ExpandProperty $header | Measure-Object -Maximum | Select-Object -ExpandProperty Maximum
# Generate new object holding the information.
# This will end up in $results
[pscustomobject]@{
Header = $header
Max = $maximum.Length
String = $maximum
}
}
# Simple output
$result | Format-Table
This is what I get:
Header Max String
------ --- ------
address 10 test134343
grade 7 Grade-9
id 1 3
name 4 John
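One caveat with the output above: Measure-Object -Maximum compares the strings themselves (by sort order), not their lengths, which is why the name column reports John (4) even though Cathy has 5 characters. A minimal sketch that measures the lengths instead, using the same sample data; the MaxLength property name is my own choice:

```powershell
$data = @"
"id","name","grade","address"
"1","John","Grade-9","test1"
"2","Ben","Grade-9","test12222"
"3","Cathy","Grade-9","test134343"
"@
$csv = $data | ConvertFrom-Csv -Delimiter ','

# Header names in column order, taken from the first row's properties
$headers = $csv[0].PSObject.Properties.Name

$result = foreach ($header in $headers) {
    # Measure the length of every value in the column, not the value itself
    $max = $csv | ForEach-Object { $_.$header.Length } | Measure-Object -Maximum
    [pscustomobject]@{
        Header    = $header
        MaxLength = [int]$max.Maximum
    }
}

$result | Format-Table
```

With this change the name column reports 5 rather than 4.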
Alternatively, if you have memory issues dealing with large files, you may have to get a bit more dirty with the .NET framework. This snippet processes one csv line at a time, instead of reading the entire file into memory.
$fileName = "$env:TEMP\test.csv"
$delimiter = ','
# Open a StreamReader
$reader = [System.IO.File]::OpenText($fileName)
# Read the headers and turn it into an array, and trim away any quotes
$headers = $reader.ReadLine() -split $delimiter | % { $_.Trim('"''') }
# Prepare a hashtable for the results
$result = @{}
# So long as there's more data, keep running
while(-not $reader.EndOfStream) {
# Read a single line and process it as csv
$csv = $reader.ReadLine() | ConvertFrom-Csv -Header $headers -Delimiter $delimiter
# Determine if the item in the result hashtable is smaller than the current, using the header as a key
foreach($header in $headers) {
$item = $csv | Select-Object -ExpandProperty $header
if($result[$header].Maximum -lt $item.Length) {
$result[$header] = [pscustomobject]@{
Header = $header
Maximum = $item.Length
String = $item
}
}
}
}
# Clean up our spent resource
$reader.Close()
# Simple output
$result.Values | Format-Table

how to format data acquired using powershell import-csv

I imported a csv file using import-csv. I only wanted the account number from the list.
I used
$Numbers = Import-Csv csv.csv |Select-Object -ExpandProperty 'Account Number' |Where-Object {$_ -ne "0"} | Out-Null
but I want to enclose each account number in single quotes (') and separate the account numbers with commas. Ideally the list should look like: 'account1','account2',...,'accountlast'. I could not manipulate the $Numbers variable like an array.
Some string manipulation and a -join should be able to get this for you. Although an array which is returned from Import-Csv csv.csv | Select-Object -ExpandProperty 'Account Number' | Where-Object {$_ -ne "0"} would be considered more versatile!
$numbers = "'{0}'" -f ((Import-Csv csv.csv | Select-Object -ExpandProperty 'Account Number' | Where-Object {$_ -ne "0"}) -join "','")
Join the account numbers with ',' and then we use the format operator to enclose that into the outer single quotes.
You are very close. You can import the csv, select the object, then use a foreach to format each object, and finally write to the host with one line instead of breaks for each.
Import-Csv csv.csv | Select-Object -ExpandProperty 'Account Number' | Where-Object {$_ -ne "0"} | ForEach{"'" + $_ + "', "} | Write-Host -NoNewLine
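Another way to combine the two ideas, sketched here with made-up account numbers instead of the CSV file, is to quote each value first and then join, which avoids a trailing separator:

```powershell
# Hypothetical account numbers standing in for the 'Account Number' column
$accounts = '1001', '1002', '1003'

# Wrap each value in single quotes, then join with commas
$list = ($accounts | ForEach-Object { "'$_'" }) -join ','
$list   # '1001','1002','1003'
```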

Adding data to variable results in a singleline string

I have a PowerShell PSCustomObject $result which I am filtering with multiple Where-Object statements.
$SatServers = $Global:result | Where-Object {$_ -like '*sat?2:00*' -and `
$_.MaintenanceWindow -notmatch 'all.da.servers' -and `
$_.Server -match "^IT"} | % {"{0}" -f $_.Server}
$SatServers += $Global:result | Where-Object {$_ -like '*sat?18:00*' -and `
$_.MaintenanceWindow -notmatch 'all.da.servers' -and `
$_.Server -match "^IT"} | % {"{0}" -f $_.Server}
$SatServers | Out-File d:\path\file.txt
If I output to the console or if I pipe to Out-File it looks great but when I send the output to a variable as seen above, I get output on a single line.
Is there something I'm missing in order to get a variable with a multiple line result? -Thanks!
You need to initialize $SatServers as an array first. Your first call makes $SatServers a string object, and when you append (+=) a string to a string, it is simply added to the end of the existing string.
String:
$string = "Hello"
$string += "Frode"
$string.GetType() | ft -AutoSize
IsPublic IsSerial Name BaseType
-------- -------- ---- --------
True True String System.Object
$string
HelloFrode
Array:
$string = @()
$string += "Hello"
$string += "Frode"
$string.GetType() | ft -AutoSize
IsPublic IsSerial Name BaseType
-------- -------- ---- --------
True True Object[] System.Array
$string
Hello
Frode
You could also have added a NewLine at the end of the string by doing the following change in your Foreach-Object-scriptblock (but personally I prefer the array-solution).
% { ("{0}" -f $_.Server) + [environment]::NewLine }
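A related option, shown here only as a sketch with hypothetical server names, is to wrap the first pipeline in the array subexpression operator @(), so the variable is an array even when the pipeline returns a single item:

```powershell
# Without @(): a single pipeline result is a plain string, so += concatenates
$plain = 'IT01' | ForEach-Object { $_ }
$plain += 'IT02'
$plain          # IT01IT02

# With @(): the result is always an array, so += appends a new element
$list = @('IT01' | ForEach-Object { $_ })
$list += 'IT02'
$list.Count     # 2
```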

How to append strings to other strings in a data set?

I want to append several strings in a data set with custom strings.
Example
Content of Dataset:
Test1
Test2
Test3
Result after appending:
Test1.com
Test2.com
Test3.com
Would I have to use regex to parse to the end of each Test[n] to be able to append it with a custom string (.com)? Has anyone got an example that describes exactly how to do it?
I am reading from a SQL-Table and writing values into a DataSet which is exported to CSV the following way:
$DataSet.Tables[0] | ConvertTO-Csv -Delimiter ',' -NotypeInformation |`% { $_ -replace '"','' } | out-file $outfile -Encoding "unicode"
The DataSet consists of strings such as:
Banana01
Banana02
Apple01
Cherry01
Cherry02
Cherry03
The thing I want to do is append .com to only Cherry01, Cherry02, and Cherry03, and after appending .com, export it as a CSV file.
There are many ways. Here are a few:
# Using string concatenation
'Test1','Test2','Test3' | Foreach-Object{ $_ + '.com' }
# Using string expansion
'Test1','Test2','Test3' | Foreach-Object{ "$_.com" }
# Using string format
'Test1','Test2','Test3' | Foreach-Object{ "{0}{1}" -f $_,'.com' }
You could use something like this:
Example 1
$t = "test"
$t = $t + ".com"
Example 2
$test = @("test1","test2")
$test | ForEach-Object {
$t = $_ + ".com"
Write-Host $t}
With your added code I did this. I don't have a database to test it on, so I made the data set manually, so you might just have to change the $DataSet[0] in my code to $DataSet.Tables[0].
$DataSet[0] | ConvertTO-Csv -Delimiter ',' -NotypeInformation | Foreach-Object{$T=$_
IF($T -match "(Cherry\d\d)"){$T = $T -replace "(Cherry\d\d)(.+)",'$1.com$2'};$T } | out-file $outfile -Encoding "unicode"
$array | %{if($_ -match "^Cherry\d\d"){$_ += ".com"};$_}
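For plain string values, the same selective append can be sketched with -replace alone, since the operator works element by element on an array and leaves non-matching items untouched. The sample values come from the question; $names and $renamed are illustrative names, and real DataSet rows would need the pattern applied to the relevant column instead.

```powershell
# Sample values from the question
$names = 'Banana01', 'Banana02', 'Apple01', 'Cherry01', 'Cherry02', 'Cherry03'

# Append .com only to items that are exactly 'Cherry' plus two digits
$renamed = $names -replace '^(Cherry\d\d)$', '$1.com'
$renamed
```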

Resources