Using StreamReader in PowerShell - Illegal characters in path

I am attempting to use Powershell to loop through a directory and then use Streamreader to return (on the console) any lines that were found that contain the keyword "FRD-009".
I've read just about every post on Stack Overflow pertaining to Streamreader and I haven't been able to make it work.
The error message I'm receiving is:
New-Object : Exception calling ".ctor" with "1" argument(s): "Illegal characters in path."
However, I imagine even if I was able to get that resolved, my script still wouldn't work the way I intended.
Here's my code:
$fullname = @("C:\FRD\Test\*.txt")
process ($file = New-Object System.IO.StreamReader $fullname) {
    Foreach($file in $fullname) {
        If($line.contains("FRD-009")) {
            $file.ReadLine()
            If($line -eq $null) {
                $file.close()
            }
        }
    }
}
Would someone please get me started on how to loop through a directory of files, searching for a specific keyword, then close that file, and loop to the next file (the order isn't important) in the directory? I have quite a few files and they all have 10,000+ lines so Streamreader seemed the way to go.
Thank you for your time.
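The error most likely comes from handing the wildcard path straight to the StreamReader constructor: it expects one concrete file path and does not expand "*". A minimal sketch of the loop described in the question follows, assuming the files are plain text; the function name Find-KeywordLine and its parameter names are hypothetical and not part of the original post.

```powershell
# Sketch only: Find-KeywordLine is a hypothetical helper name.
function Find-KeywordLine {
    param(
        [string] $Folder,              # e.g. 'C:\FRD\Test'
        [string] $Keyword,             # e.g. 'FRD-009'
        [string] $Filter = '*.txt'
    )

    # Get-ChildItem expands the wildcard, so StreamReader only ever
    # receives one concrete file path at a time.
    foreach ($file in Get-ChildItem -Path $Folder -Filter $Filter -File) {
        $reader = New-Object System.IO.StreamReader -ArgumentList $file.FullName
        try {
            # ReadLine() returns $null at end of file.
            while ($null -ne ($line = $reader.ReadLine())) {
                if ($line.Contains($Keyword)) {
                    '{0}: {1}' -f $file.Name, $line
                }
            }
        }
        finally {
            # Close the file even if reading fails part-way.
            $reader.Close()
        }
    }
}
```

Calling Find-KeywordLine -Folder 'C:\FRD\Test' -Keyword 'FRD-009' writes each matching line to the console, prefixed with the file name. If StreamReader is not a hard requirement, the built-in Select-String -Path 'C:\FRD\Test\*.txt' -Pattern 'FRD-009' -SimpleMatch performs the same kind of search.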

Related

How do I use .Contains() on the string value returned from Get-Content?

I am using powershell script to check and see which java file in the directory is the main java file by checking the contents of each file and seeing if it contains "main". The problem is, when the script hits the Contains() method line, it returns false, even when I can see, when debugging, that the contents of my $javafilecontents clearly has the phrase "main".
What am I missing, or how do I get Contains to return true for this case?
Just to test thoroughly, I also tried other small key words that should be in the file such as $javafilecontents.Contains("import"), which is literally the very first word in the file, but it still returns false
I found a similar situation here:
Why is string.contains() returning false?
I assume this might have something to do with the string being too long. In their advice, they say to add the @ symbol in front of the long string. I haven't tested this yet because I'm not sure how I would do that, since my long string is a variable set from Get-Content.
Code:
foreach ($userjavafile in $userjavafiles)
{
    $javafilepath = "" + $destination + "\" + $userjavafile
    $javafilecontents = Get-Content -Path $javafilepath
    $mybool = ($javafilecontents.Contains("main"))
    if ($mybool)
    {
        $mainjavafile = $userjavafile
    }
    $spaceseparatedjavafiles += "" + $userjavafile + " "
}
$destination is the path to the files that I created earlier in the code.
$javafilecontents has the contents of a .java file which includes the line "public static void main(String[] args){"
These two are tested to be correct since I can see in the debugger that the contents are correctly placed into the variable.
$mybool stays false, but the way I understand Contains(), it should be true.
Get-Content -Path $javafilepath produces an array of strings where each line of the file is an array element. You will have issues using string method Contains() as PowerShell is actually using the Contains() method of the Array class. For the Array class Contains() method to return true, you would have to match an entire line.
Get-Content -Path $javafilepath -Raw will produce better results because it produces a single string. Then the string class Contains() method will allow for a substring match. Keep in mind the string class method is case-sensitive.
$javafilecontents = Get-Content -Path $javafilepath -Raw
$mybool = $javafilecontents.Contains("main")
The other option is to continue without the -Raw switch and loop through each line. Then you can use the string class Contains() against each line, which will be a single string. The where() method can filter an array based on a condition. Then you can cast the success of that filter to [bool]. A match will yield True and no match will yield False.
$javafilecontents = Get-Content -Path $javafilepath
$mybool = [bool]($javafilecontents.where{$_.Contains("main")})
The same approach as the above method can be used with the case-insensitive -match operator, which produces a cleaner solution. -cmatch is the case-sensitive version.
$javafilecontents = Get-Content -Path $javafilepath
$mybool = [bool]($javafilecontents -match 'main')
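The difference between the array and string behavior described above can be checked without any file. This is a small sketch with made-up sample lines standing in for the output of Get-Content; the variable names are illustrative only.

```powershell
# Two sample lines standing in for what Get-Content returns (an array of strings).
$lines = @('import java.util.*;', 'public static void main(String[] args){')

# On the array, Contains() looks for an element equal to 'main'.
$asArray = $lines.Contains('main')

# On a single string, Contains() does a case-sensitive substring search.
$asString = ($lines -join "`n").Contains('main')

# -match filters the array to matching lines; [bool] reports whether any matched.
$perLine = [bool]($lines -match 'main')
```

Here $asArray is False because no whole line equals "main", while $asString and $perLine are both True.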

Split string in PowerShell by pattern

I have a fairly long string in PowerShell that I need to split. Each section begins with a date in format mm/dd/yyyy hh:mm:ss AM. Essentially what I am trying to do is get the most recent message in the string. I don't need to keep the date/time part as I already have that elsewhere.
This is what the string looks like:
10/20/2018 1:22:33 AM
Some message the first one in the string
It can be several lines long
With multiple line breaks
But this is still the first message in the string
10/21/2018 4:55:11 PM
This would be second message
Same type of stuff
But its a different message
I know how to split a string on specific characters, but I don't know how on a pattern like date/time.
Note:
The solution below assumes that the sections are not necessarily chronologically ordered, so that you must inspect all time stamps to determine the most recent one.
If, by contrast, you can assume that the last message is the most recent one, use LotPings' much simpler answer.
If you don't know ahead of time what section has the most recent time stamp, a line-by-line approach is probably best:
$dtMostRecent = [datetime] 0
# Split the long input string ($longString) into lines and iterate over them.
# If input comes from a file, replace
#   $longString -split '\r?\n'
# with
#   Get-Content file.txt
# If the file is large, replace the whole command with
#   Get-Content file.txt | ForEach-Object { ... }
# and replace $line with $_ in the script block (loop body).
foreach ($line in $longString -split '\r?\n') {
    # See if the line at hand contains (only) a date.
    if ($dt = try { [datetime] $line } catch {}) {
        # See if the date at hand is the most recent so far.
        $isMostRecent = $dt -ge $dtMostRecent
        if ($isMostRecent) {
            # Save this time stamp as the most recent one and initialize the
            # array to collect the following lines in (the message).
            $dtMostRecent = $dt
            $msgMostRecentLines = @()
        }
    } elseif ($isMostRecent) {
        # Collect the lines of the message associated with the most recent date.
        $msgMostRecentLines += $line
    }
}
# Convert the message lines back into a single, multi-line string.
# $msgMostRecent now contains the multi-line message associated with
# the most recent time stamp.
$msgMostRecent = $msgMostRecentLines -join "`n"
Note how try { [datetime] $line } catch {} is used to try to convert a line to a [datetime] instance and fail silently, if it can't, in which case $dt is assigned $null, which in a Boolean context is interpreted as $False.
This technique works irrespective of the culture currently in effect, because PowerShell's casts always use the invariant culture when casting from strings, and the dates in the input are in one of the formats the invariant culture understands.
By contrast, the -as operator, whose use would be more convenient here ($dt = $line -as [datetime]), unexpectedly is culture-sensitive, as Esperento57 points out.
This surprising behavior is discussed in this GitHub issue.
Provided the [datetime] sections are ascending, it should be sufficient to split on them with a regex and get the last one:
((Get-Content .\test.txt -Raw) -split "\d+/\d+/\d{4} \d+:\d+:\d+ [AP]M`r?`n")[-1]
Output based on your sample string stored in file test.txt
This would be second message
Same type of stuff
But its a different message
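The regex-split approach above can also be tried on an in-memory string. The following is a sketch under the same assumption that the last section is the most recent one; the shortened sample text and the variable names are illustrative, and the pattern is anchored to the start of a line with (?m)^ as an extra precaution that is not in the original answer.

```powershell
# Shortened sample input, based on the string in the question.
$longString = @'
10/20/2018 1:22:33 AM
Some message the first one in the string
It can be several lines long
10/21/2018 4:55:11 PM
This would be second message
Same type of stuff
'@

# A time stamp on its own line, followed by a CRLF or LF line break.
$pattern = '(?m)^\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2}:\d{2} [AP]M\r?\n'

# -split returns an empty first element (the text before the first time stamp),
# so empty pieces are filtered out.
$messages = @($longString -split $pattern | Where-Object { $_ })

# The last piece is the message that follows the last time stamp.
$lastMessage = $messages[-1].TrimEnd()
```

$messages holds one multi-line string per section, so earlier messages remain available through $messages[0], $messages[1], and so on.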
You can split it by the timestamp pattern like this (\r?\n lets the pattern match both Windows and Unix line endings):
$arr = $str -split "[0-9]{1,2}/[0-9]{1,2}/[0-9]{1,4} [0-9]{1,2}:[0-9]{1,2}:[0-9]{1,2} [AaPp]M\r?\n"
To my knowledge you can't use any of the static String methods like Split() for this. I tried to find a regular expression that would handle the entire thing, but wasn't able to come up with anything that would quite break it up properly.
So, you'll need to go line by line, testing to see if that line is a date, then concatenate the lines in between, like the following:
$fileContent = Get-Content "inputFile.txt"
$messages = @()
$currentMessage = [string]::Empty
foreach($line in $fileContent)
{
    if ([Regex]::IsMatch($line, "\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2}:\d{2} (A|P)M"))
    {
        # The current line is a date, the current message is complete
        # Add the current message to the output, and clear out the old message
        # from your temporary storage variable $currentMessage
        if (-not [string]::IsNullOrEmpty($currentMessage))
        {
            $messages += $currentMessage
            $currentMessage = [string]::Empty
        }
    }
    else
    {
        # Add this line to the message you're building.
        # Include a new line character, as it was stripped out with Get-Content
        $currentMessage += "$line`n"
    }
}
# Add the last message to the output
$messages += $currentMessage
# Do something with the message
Write-Output $messages
As the key to all of this is recognizing that a given line is a date and therefore the start of a message, let's look a bit more at the regex. "\d" will match any decimal character 0-9, and the curly braces immediately following indicate the number of decimal characters that need to match. So, "\d{1,2}" means "look for one or two decimal characters" or in this case the month of the year. We then look for a "/", 1 or 2 more decimal characters - "\d{1,2}", another "/" and then exactly 4 decimal characters - "\d{4}". The time is more of the same, with ":" in between the decimal characters instead of "/". At the end, there will either be "AM" or "PM" so we look for either an "A" or a "P" followed by an "M", which as a regular expression is "(A|P)M".
Combine all of that, and you get "\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2}:\d{2} (A|P)M" to determine if you have a date on that line. I believe it would also be possible to use [DateTime]::Parse() to determine if the line is a date, but then you wouldn't get to have fun with regexes and would need a try-catch. For more info on regexes in PowerShell (which are just .NET regexes), see the .NET Regex Quick Reference.
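As a variation on the [DateTime]::Parse() idea, .NET's DateTime.TryParseExact reports failure through its return value, so no try-catch is needed. The sketch below assumes every time stamp follows the M/d/yyyy h:mm:ss tt layout shown in the question; the function name Test-TimestampLine is hypothetical.

```powershell
# Sketch only: Test-TimestampLine is a hypothetical helper name.
function Test-TimestampLine {
    param([string] $Line)

    # Assumed layout, e.g. '10/20/2018 1:22:33 AM'.
    $format  = 'M/d/yyyy h:mm:ss tt'
    $culture = [cultureinfo]::InvariantCulture
    $styles  = [System.Globalization.DateTimeStyles]::None

    # TryParseExact fills $parsed on success and returns $true or $false.
    $parsed = [datetime]::MinValue
    [datetime]::TryParseExact($Line.Trim(), $format, $culture, $styles, [ref] $parsed)
}
```

In the loop above, the [Regex]::IsMatch(...) test could then be swapped for Test-TimestampLine $line. Unlike the unanchored regex, this only accepts lines that consist of a time stamp and nothing else.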

Extension of Excel files remains when doing an import loop in Stata

I don't understand why when running the following simple piece of code my files get saved as xxx.xls.dta instead of xxx.dta and how to fix that.
clear all
cd "C:\Users\User\Documents\PhD\Research\Data\acqua_costiera"
global RawAcqua "C:\Users\User\Documents\PhD\Research\Data\acqua_costiera"
local acquacost : dir "C:\Users\User\Documents\PhD\Research\Data\acqua_costiera" files "*.xls", respectcase
foreach f of local acquacost {
import excel using "$RawAcqua\\`f'", cellrange(B4:N43310) firstrow clear
save "`f'.dta", replace
}
I have tried looking at other similar pieces of code online, but I don't really get what my mistake is.
I'm looping over all of the files in that directory, but the files are named "acqua_costiera_`year'" so I guess some other loop might work too.
Your problem can be easily illustrated using the following toy example:
local acquacost one.xls two.xls three.xls
foreach f of local acquacost {
display "`f'.dta"
}
one.xls.dta
two.xls.dta
three.xls.dta
You need to tell Stata to only keep the filename, not the extension:
foreach f of local acquacost {
display "`= substr("`f'", 1, strpos("`f'", ".") - 1)'.dta"
}
Here I use the strpos() function to get the position of the first period and then use this as a reference point to extract the relevant portion of the string using the substr() function.
For saving the file with the proper name just use save, instead of display.

type mismatch when running Access.Application.DoCmd through Powershell

I'm having an issue with my code. I am hoping to use PowerShell to open an Access file, then export a table to an Excel file. So far, this is my code.
$MsAccess = New-object -com Access.Application
$MsAccess.OpenCurrentDatabase('<Filepath>',$false)
$MsAccess.Application.DoCmd.OpenTable("<TableName>")
$MsAccess.Application.DoCmd.OutputTo('acOutputTable, "<TableName>" , acFormatXLS , "OutputName.xls", true')
$MsAccess.CloseCurrentDatabase()
$MsAccess.Quit()
It will always error at:
$MsAccess.Application.DoCmd.OutputTo('acOutputTable, "TableName" , acFormatXLS , "OutputName.xls", true')
with an error stating:
$MsAccess.Application.DoCmd.OutputTo('acOutputTable, "TableName" , acFormatXLS , "OutputName.xls", true')
Exception calling "OutputTo" with "1" argument(s): "Type mismatch. (Exception
from HRESULT: 0x80020005 (DISP_E_TYPEMISMATCH))"
I have been attempting to export a table to Excel for some time, but I can't seem to find much documentation about using Excel through PowerShell. Would anyone have any suggestions on how to resolve this issue?
Note: placeholders such as TableName, Filepath, etc. are not actual names, just replacements.
Try this:
$MsAccess.Application.DoCmd.OutputTo(0, "<TableName>" , 0, "OutputName.xls", $true)
I ended up using the DoCmd.TransferSpreadsheet command to resolve this issue. I guess PowerShell doesn't like OutputTo.

Read values from excel and replace them in another file using Powershell

I need to find a way to read values from an Excel file and then replace all the corresponding values in another file accordingly. Basically, I found some discrepancies in one of the automated tasks we run, and I need to convert some values within the file before I send it to the automated task. I have an Excel file that lists the "wrong" values and their corresponding "correct" values, and I need to know how PowerShell can help me with this.
$docID = $args[0]
$docid
#Read Z ticker file
$Zfile = 'I:\IS\Rishabh\Z tickers Active.xls'
# Find the .rps file imported automatically from schwab trust
$RPSFile= 'L:\Trading\Schwab Trust\Import\CS<%dmmdd-01yy>.RPS'
While (Get-Content $ZFile)
{
$_-cmatch 'A$','B$'| Set-Variable X-ticker
# End Loop
}
(Get-Content $RPSfile) | ForEach-Object { $_-replace '%, ' ,'X-ticker'
#End Loop
}
Set-Content $RPSFile
You don't need to use Powershell. Excel itself has built in mechanisms for doing what you want. For example you could use the LOOKUP function in Excel.
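If the replacement does have to happen in PowerShell, one possible sketch is below. It assumes the list of values has first been saved from Excel as a CSV file with two columns named Wrong and Correct; those column names, the function name Update-TextFromMap, and its parameters are all hypothetical and not taken from the question.

```powershell
# Sketch only: Update-TextFromMap and the Wrong/Correct column names are assumptions.
function Update-TextFromMap {
    param(
        [string] $MapPath,      # CSV exported from the Excel sheet (columns: Wrong, Correct)
        [string] $TargetPath    # text file whose values should be corrected in place
    )

    # Read the whole target file as one string.
    $text = Get-Content -Path $TargetPath -Raw

    foreach ($row in Import-Csv -Path $MapPath) {
        # String.Replace is a literal replacement, so no regex escaping is needed.
        $text = $text.Replace($row.Wrong, $row.Correct)
    }

    # Write the corrected text back without appending an extra line break.
    Set-Content -Path $TargetPath -Value $text -NoNewline
}
```

Replacements are applied one row after another, so a "correct" value that also appears as a later "wrong" value would be replaced again. The -NoNewline switch requires PowerShell 5.0 or later.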
