variable and array based on line of text - string

The file I am trying to read looks like the example below:
variable, arrElement1, arrElement2, arrElement3, arrElement[n]...
variable, arrElement1, arrElement2, arrElement3, arrElement[n]...
.
.
.
variable, arrElement1, arrElement2, arrElement3, arrElement[n]...
What I am trying to achieve is to read this file line by line, assign "variable" to a single variable and assign the arrElement values to an array, something like this:
Pseudocode:
foreach (line in text file)
$variable=variable
$array = "arrElement1", "arrElement2", "arrElement3", ....
foreach( $element in $array) {
'do some stuff'
}
Thank you in advance.

Something like this should work:
Get-Content 'C:\your.txt' | % {
$arr = $_ -split '\s*,\s*'
New-Variable -Name $arr[0] -Value $arr[1..$arr.Length]
}
Edit: According to your pseudo code you meant to keep the first field in one variable and the rest of the fields as an array in the second variable. That's even simpler to achieve:
Get-Content 'C:\your.txt' | % {
$var, $arr = $_ -split '\s*,\s*'
$arr | % {
# do stuff
}
}
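As a minimal, self-contained sketch of the multiple-assignment split shown above: the sample lines and the property names Variable and Elements are made up for illustration and are not part of the original answer.

```powershell
# Hypothetical sample data standing in for the lines of the text file
$lines = @(
    'name1, a, b, c'
    'name2, x, y'
)

$result = foreach ($line in $lines) {
    # The first field goes to $var; all remaining fields go to $arr
    $var, $arr = $line -split '\s*,\s*'
    [pscustomobject]@{
        Variable = $var
        Elements = @($arr)   # @() keeps this an array even for a single remaining field
    }
}

$result | ForEach-Object { '{0}: {1}' -f $_.Variable, ($_.Elements -join ' | ') }
```

Collecting the pairs as objects is only one option; inside the loop you can just as well work with $var and $arr directly.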

Related

How to get a hashtable in PowerShell from a multiline string in which keys and values are on different lines?

I have a string of the following format (the number of lines in this string may vary):
$content = @"
key1
value1
key2
value2
key3
value3
"@
I want to put this data in a hashtable.
(In my case, the data in the $content variable is received in the body of the HTTP response that Invoke-WebRequest returns when talking to LiveJournal through its Client/Server Protocol. But I am also interested in an answer for the general case.)
I tried to use the cmdlet ConvertFrom-StringData, but it doesn't work for this case:
PS C:\> ConvertFrom-StringData -StringData $content -Delimiter "`n"
ConvertFrom-StringData: Data line 'key1' is not in 'name=value' format.
I wrote the following function:
function toHash($str) {
$arr = $str -split '\r?\n'
$hash = @{}
for ($i = 0; $i -le ($arr.Length - 1); $i += 2) {
$hash[$arr[$i]] = $arr[$i + 1]
}
return $hash
}
This function works well:
PS C:\> toHash($content)
Name Value
---- -----
key3 value3
key2 value2
key1 value1
My question is: is it possible to do the same thing, but shorter or more elegant? Preferably in one-liner (see the definition of this term in the book 'PowerShell 101'). Maybe there is a convenient regular expression for this case?
As commented by @Santiago Squarzon:
The code you already have looks elegant to me; shorter -ne more elegant.
As for "Preferably in one-liner", what exactly is the definition of a one-liner?
A single line, meaning a text string with no linefeeds?
Or a single statement, meaning a text string with no linefeeds and no semicolons?
Keep in mind that there are several ways to cheat on this, like assigning a variable inside a condition (which is hard to read).
Anyways, a few side notes:
The snippet you show has a pitfall if the input has an odd number of lines and Set-StrictMode -Version Latest is enabled:
Set-StrictMode -Version Latest
toHash "key1`nvalue1`nkey2"
OperationStopped:
Line |
5 | $hash[$arr[$i]] = $arr[$i + 1]
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| Index was outside the bounds of the array.
Name Value
---- -----
key1 value1
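One way to sidestep the odd-line pitfall shown above is to stop the loop before a trailing key that has no value. This is only a sketch: the function name ConvertTo-PairHash is hypothetical, and silently dropping the unpaired key is an assumption about the desired behavior.

```powershell
Set-StrictMode -Version Latest

# Hypothetical helper name; not part of the question or the answer
function ConvertTo-PairHash {
    param([string]$Text)

    $lines = @($Text -split '\r?\n')
    $hash = @{}
    # Only iterate while a value line exists after the key line,
    # so a trailing unpaired key never triggers an out-of-bounds index
    for ($i = 0; $i + 1 -lt $lines.Count; $i += 2) {
        $hash[$lines[$i]] = $lines[$i + 1]
    }
    $hash
}

$h = ConvertTo-PairHash "key1`nvalue1`nkey2"
$h
```

If a dangling key should be reported rather than ignored, a check on whether the line count is odd could be added before the loop.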
The variable name $content suggests that you are reading the data from a file, possibly with Get-Content. If that is indeed the case, you might consider streaming the input (which conserves memory):
$Content -split '\r?\n' | # Get-Content .\Data.txt
Foreach-Object -Begin {
$Hash = @{}
$Key = $Null
} -Process {
if (!$Key) {
$Key = $_
}
else {
$Hash[$Key] = $_
$Key = $Null
}
} -End {
$Hash
}
And if you use an [ordered] dictionary, you might even put this in a single statement like:
$Content -split '\r?\n' |Foreach-Object { $h = [Ordered]@{} } { if (!$h.count -or $h[-1]) { $h[$_] = $Null } else { $h[$h.Count - 1] = $_ } } { $h }
(Note that, as with the function proposed in the question, I do not take into account that there might be empty lines in the input data.)
See also PowerShell issue: #13817 Enhance hash table syntax
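Regarding the question about a regular expression: one possible sketch joins each key/value pair of lines into name=value form and then hands the result to ConvertFrom-StringData. It assumes that keys contain no "=" and that values contain no backslashes, because ConvertFrom-StringData interprets backslashes as escape sequences.

```powershell
# Sample input equivalent to the here-string in the question
$content = "key1`nvalue1`nkey2`nvalue2`nkey3`nvalue3"

# Turn every "key<newline>value" pair into "key=value";
# the newline between pairs is left in place as the line separator
$hash = $content -replace '([^\r\n]+)\r?\n([^\r\n]+)', '$1=$2' | ConvertFrom-StringData

$hash
```

Whether this reads better than the explicit loop is a matter of taste; it is shorter, but the restrictions on the data are easy to overlook.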

Manipulate strings in a txt file with Powershell - Search for words in each line containing "." and save them in new txt file

I have a text file with different entries. Using PowerShell, I want to extract from each line the word that contains a dot.
$file = "C:\Users\test_folder\test.txt"
Get-Content $file
Output:
Compass Zype.Compass 1.1.0 thisisaword
Pomodoro Logger zxch3n.PomodoroLogger 0.6.3 thisisaword
......
......
......
Bla Word Program.Name 1.1.1 this is another entry
As you can see, in all lines, the "second" "word" contains a dot, like "Program.Name".
I want to create a new file, which contains just those words, each line one word.
So my file should look something like:
Zype.Compass
zxch3n.PomodoroLogger
Program.Name
What I have tried so far:
Clear-Host
$folder = "C:\Users\test_folder"
$file = "C:\Users\test_folder\test.txt"
$content_txtfile = Get-Content $file
foreach ($line in $content_textfile)
{
if ($line -like "*.*"){
$line | Out-File "$folder\test_filtered.txt"
}
}
But my output is not what I want.
I hope you get what my problem is.
Thanks in advance! :)
Here is a solution using Select-String to find substrings by RegEx pattern:
(Select-String -Path $file -Pattern '\w+\.\w+').Matches.Value |
Set-Content "$folder\test_filtered.txt"
You can find an explanation and the ability to experiment with the RegEx pattern at RegEx101.
Note that while the RegEx101 demo also shows matches for the version numbers, Select-String gives you only the first match per line (unless argument -AllMatches is passed).
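The difference between the default behavior and -AllMatches can be checked without a file by piping strings to Select-String. This is a small sketch using two of the sample lines from the question; the variable names are illustrative.

```powershell
# Two of the sample lines from the question
$lines = @(
    'Compass Zype.Compass 1.1.0 thisisaword'
    'Pomodoro Logger zxch3n.PomodoroLogger 0.6.3 thisisaword'
)

# Default: only the first match on each line is returned
$first = ($lines | Select-String -Pattern '\w+\.\w+').Matches.Value

# With -AllMatches, fragments of the version numbers are returned as well
$all = ($lines | Select-String -Pattern '\w+\.\w+' -AllMatches).Matches.Value

$first
```

Relying on the first match works here only because the dotted name appears before the version number on every line.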
This looks like fixed-width fields, and if so you can reduce it to this:
Get-Content $file | # Read the file
%{ $_.Substring(29,36).Trim()} | # Extract the column
?{ $_.Contains(".") } | # Filter for values with "."
Set-Content "$folder\test_filtered.txt" # Write result
Get-content is slow and -like is sometimes slower than -match. I prefer -match but some prefer -like.
$filename = "c:\path\to\file.txt"
$output = "c:\path\to\output.txt"
foreach ($line in [System.IO.File]::ReadLines($filename)) {
if ($line -match "\.") {
$line | out-file $output -append
}
}
Otherwise, for a shorter option, maybe:
$filename = "c:\path\to\file.txt"
$output = "c:\path\to\output.txt"
Get-Content $filename | where { $_ -match "\." } | Out-File $output
To restrict the match to the first column, either name the column (which you don't do here) or use a different search pattern:
"\." matches a period anywhere in the line.
"^\." matches only when the first character of the line is a period.
"^[^\t]*\.[^\t]*" matches, from the start of the line, any number of non-tab characters, then a period, then any number of non-tab characters. Use this if the period always appears before the first tab.

How can I modify this PowerShell script to continue looking for one string after another?

I want this PowerShell script to search for occurrences of multiple strings, one after the other, and to append the results to a .txt file.
Currently I am specifying the string that I want to look for, waiting for the script to finish looking for that string and transferring the results into a spreadsheet. This is taking a lot of time as I have to keep specifying the string I want to look for, especially since there are well over 100 that I need to look for.
#ERROR REPORTING ALL
Set-StrictMode -Version latest
$path = "C:\Users\username\Documents\FileName"
$files = Get-Childitem $path -Include *.docx,*.doc,*.ppt, *.xls,
*.xlsx, *.pptx, *.eap -Recurse | Where-Object { !($_.psiscontainer) }
$output = "C:\Users\username\Documents\FileName\wordfiletry.txt"
$application = New-Object -comobject word.application
$application.visible = $False
$findtext = "First_String"
Function getStringMatch
{
# Loop through all *.doc files in the $path directory
Foreach ($file In $files)
{
$document = $application.documents.open($file.FullName,$false,$true)
$range = $document.content
$wordFound = $range.find.execute($findText)
if($wordFound)
{
"$file.fullname has found the string called $findText and it is $wordfound" | Out-File $output -Append
}
}
$document.close()
$application.quit()
}
getStringMatch
This script will look for 'First_String' successfully, I was hoping to be able to specify 'Second_String', 'Third_String' etc rather than replace First_String every time.
As an alternative to the suggestion from @Mathias, you could use a regex to query the document text instead.
Read the content of the document as a string with $text = $document.content.text and then use Select-String $findtext -AllMatches to evaluate the matches, where $findtext is the string representation of a regular expression.
Example:
# pipe delimited string as a regular expression
$findtext = "First_String|Second_String|Third_String"
Function getStringMatch
{
    # Loop through all matching files in the $path directory
    Foreach ($file In $files)
    {
        $document = $application.documents.open($file.FullName,$false,$true)
        $text = $document.content.text
        $result = $text | Select-String $findtext -AllMatches
        if($result)
        {
            # $($file.FullName) is needed so the property is expanded inside the string
            "$($file.FullName) has found the strings called $($result.Matches.Value) at indexes $($result.Matches.Index)" | Out-File $output -Append
        }
        # Close each document before opening the next one
        $document.close()
    }
    $application.quit()
}
Note that if you're trying to find strings that contain reserved regex characters, you'll need to escape them first.
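A hedged sketch of that escaping step: [regex]::Escape can be applied to each search term before the terms are joined with "|". The search terms and the sample text below are made up for illustration.

```powershell
# Hypothetical search terms, some containing regex metacharacters
$terms = 'First_String', 'Cost (USD)', 'a.b'

# Escape every term, then join the terms into one alternation pattern
$findtext = ($terms | ForEach-Object { [regex]::Escape($_) }) -join '|'

# Made-up text standing in for $document.content.text
$text = 'Total Cost (USD) for axb and a.b'

$result = $text | Select-String -Pattern $findtext -AllMatches
$found = $result.Matches.Value
$found
```

Without the escaping, "a.b" would also match "axb", and the parentheses in "Cost (USD)" would be treated as a capture group rather than as literal characters.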

Variable name is content of another variable

I have a list of variables:
$desa = "filtering regex for desa"
$cdo = "different regex for cdo"
etc.
Now, I have a loop:
Foreach ($profilename in ("desa", "cdo")) {
# filter out data from $profilename file where regex is contained in
# variable named after the content of $profilename
}
So, in other words, I need to use a string contained in one of the variables at the top, and the name of that variable is the exact content of the $profilename variable.
Can PowerShell do this?
It might be easier to use a hash table of regexes than separate variables for each one:
$filters = @{
    desa = "filtering regex for desa"
    cdo = "different regex for cdo"
}
Foreach ($profilename in "desa", "cdo")
{
    (Get-content <profilename file>) -match $filters[$profilename]
}
Just name the keys after your profile names.
As @mjolinor said, hashtables are a better approach for this. However, if for some reason you must expand a "constructed" variable, you can do it by using the $ExecutionContext automatic variable:
PS C:\> $a = 'foo'
PS C:\> $b = 'a'
PS C:\> $c = "`$$b"
PS C:\> $c
$a
PS C:\> $ExecutionContext.InvokeCommand.ExpandString($c)
foo
Applied to your code that might look like this:
$desa = "filtering regex for desa"
$cdo = "different regex for cdo"
foreach ($profilename in 'desa','cdo') {
$pattern = $ExecutionContext.InvokeCommand.ExpandString("`$$profilename")
$something | ? { $_ -match $pattern } | ...
}
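Another option, not mentioned in the answers above, is the built-in Get-Variable cmdlet, which looks up a variable by the name held in a string. A minimal sketch using the variables from the question:

```powershell
$desa = 'filtering regex for desa'
$cdo  = 'different regex for cdo'

$patterns = foreach ($profilename in 'desa', 'cdo') {
    # -ValueOnly returns the variable's value instead of a PSVariable object
    Get-Variable -Name $profilename -ValueOnly
}

$patterns
```

This avoids building and expanding a string, but the hashtable approach is still easier to maintain once there are more than a few profiles.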

Optimizing simple search script in PowerShell

I need to create a script to search through just below a million files of text, code, etc. to find matches and then output all hits on a particular string pattern to a CSV file.
So far I have this:
$location = 'C:\Work*'
$arr = "foo", "bar" #Where "foo" and "bar" are string patterns I want to search for (separately)
for($i=0;$i -lt $arr.length; $i++) {
Get-ChildItem $location -recurse | select-string -pattern $($arr[$i]) | select-object Path | Export-Csv "C:\Work\Results\$($arr[$i]).txt"
}
This returns to me a CSV file named "foo.txt" with a list of all files with the word "foo" in it, and a file named "bar.txt" with a list of all files containing the word "bar".
Is there any way anyone can think of to optimize this script to make it work faster? Or ideas on how to make an entirely different, but equivalent script that just works faster?
All input appreciated!
If your files are not huge and can be read into memory, then this version should run considerably faster (my quick and dirty local test seems to confirm that):
$location = 'C:\ROM'
$arr = "Roman", "Kuzmin"
# remove output files
foreach($test in $arr) {
Remove-Item ".\$test.txt" -ErrorAction 0 -Confirm
}
Get-ChildItem $location -Recurse | .{process{ if (!$_.PSIsContainer) {
# read all text once
$content = [System.IO.File]::ReadAllText($_.FullName)
# test patterns and output paths once
foreach($test in $arr) {
if ($content -match $test) {
$_.FullName >> ".\$test.txt"
}
}
}}}
Notes: 1) the paths and patterns in the example differ from yours; 2) the output files are plain text rather than CSV. CSV adds little if you are only interested in paths; a plain text file with one path per line will do.
Let's suppose that 1) the files are not too big and you can load them into memory, and 2) you really just want the path of each matching file (not the matching line etc.).
I tried to read each file only once and then iterate through the regexes. There is some gain (it is faster than the original solution), but the final result will depend on other factors such as file sizes and the number of files.
Removing 'ignorecase' also makes it a little faster.
$res = @{}
$arr | % { $res[$_] = @() }
Get-ChildItem $location -recurse |
    ? { !$_.PsIsContainer } |
    % { $file = $_
        $text = [Io.File]::ReadAllText($file.FullName)
        $arr |
            % { $regex = $_
                if ([Regex]::IsMatch($text, $regex, 'ignorecase')) {
                    # Append to the list of hits instead of overwriting it
                    $res[$regex] += $file.FullName
                }
            }
    }
$res.GetEnumerator() | % {
    # Wrap each path in an object so Export-Csv writes a Path column
    $_.Value | % { [pscustomobject]@{ Path = $_ } } | Export-Csv "d:\temp\so-res$($_.Key).txt"
}
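The idea shared by both answers (read each file once, then test every pattern against the in-memory text) can be tried in isolation. The sketch below builds a throwaway folder under the temp directory; the folder, file names and file contents are made up for illustration, and it assumes PowerShell 5 or later.

```powershell
# Create a throwaway folder with two small sample files (hypothetical data)
$root = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $root | Out-Null
Set-Content -Path (Join-Path $root 'a.txt') -Value 'foo only'
Set-Content -Path (Join-Path $root 'b.txt') -Value 'foo and bar'

$patterns = 'foo', 'bar'

# One list of matching paths per pattern
$hits = @{}
foreach ($p in $patterns) {
    $hits[$p] = [System.Collections.Generic.List[string]]::new()
}

Get-ChildItem -Path $root -Recurse -File | ForEach-Object {
    # Read the file once, then test every pattern against the same text
    $text = [System.IO.File]::ReadAllText($_.FullName)
    foreach ($p in $patterns) {
        if ($text -match $p) { $hits[$p].Add($_.FullName) }
    }
}

# Clean up the sample folder
Remove-Item -Path $root -Recurse -Force
```

Using a generic list avoids the array copying that += causes on large result sets, which may matter when close to a million files are scanned.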

Resources