Filter items that are older than a specific date - azure

I have a portion of a script that currently returns items that are up to a specific number of days old. I would like it instead to go back that many days and then get anything older than that date. How should I modify this to achieve that result?
If ($null -notlike $UpdatedSinceDays) {
$filterDate = ("(LastUpdatedDateTime gt {0})" -f (Get-Date (get-date).AddDays($UpdatedSinceDays) -UFormat %y-%m-%dT00:00:00z))
If ($null -eq $filterbuilder) {
$filterbuilder = $filterDate
}
Else {
# Rest of filter statement
}
}
$filterbuilder gets fed into $ParamCollection.Filter to add several filters to a command.

The Get-ChildItem cmdlet in PowerShell gets files from the specified directory; with the -Recurse parameter it also gets files from all subfolders.
The Where-Object cmdlet then filters for files whose CreationTime is older than 15 days.
Example script to find all files older than 15 days in the C:\temp directory:
# Search Path
$Folder = "C:\temp\"
# Search using gci cmdlet
Get-ChildItem -Path $folder -Recurse | Where-Object { $_.CreationTime -lt (Get-Date).AddDays(-15) }
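Applied to the filter string in the question, the same idea is to subtract the days and compare with a less-than operator. The following is only a minimal sketch: it assumes the target service accepts an OData-style lt operator alongside the gt already used, and the function name Get-OlderThanFilter and its -Now parameter are made up for illustration.

```powershell
function Get-OlderThanFilter {
    param(
        [Parameter(Mandatory)] [int]$Days,
        [datetime]$Now = (Get-Date)   # hypothetical parameter, injectable so the result can be checked
    )
    # Go back N days regardless of the sign the caller passed in.
    $back = [math]::Abs($Days)
    $cutoff = $Now.Date.AddDays(-$back)
    # Four-digit year; invariant culture avoids locale-specific calendars.
    $stamp = $cutoff.ToString('yyyy-MM-dd', [cultureinfo]::InvariantCulture)
    # 'lt' (assumed to be supported) selects items last updated before the cutoff.
    '(LastUpdatedDateTime lt {0}T00:00:00Z)' -f $stamp
}
```

The returned string could then be assigned to $filterDate in place of the gt expression.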

Powershell | How can I use Multi Threading for my File Deleter Powershell script?

So I've written a Script to delete files in a specific folder after 5 days. I'm currently implementing this in a directory with hundreds of thousands of files and this is taking a lot of time.
This is currently my code:
#Variables
$path = "G:\AdeptiaSuite\AdeptiaSuite-6.9\AdeptiaServer\ServerKernel\web\repository\equalit\PFRepository"
$age = (Get-Date).AddDays(-5) # Defines the 'x days old' (today's date minus x days)
# Get all the files in the folder and subfolders | foreach file
Get-ChildItem $path -Recurse -File | foreach{
# if creationtime is 'le' (less or equal) than $age
if ($_.CreationTime -le $age){
Write-Output "Older than $age days - $($_.name)"
Remove-Item $_.fullname -Force -Verbose # remove the item
}
else{
Write-Output "Less than $age days old - $($_.name)"
}
}
I've searched around the internet for some time now to find out how to use Runspaces, however I find it very confusing and I'm not sure how to implement it with this script. Could anyone please give me an example of how to use Runspaces for this code?
Thank you very much!
EDIT:
I've found this post: https://adamtheautomator.com/powershell-multithreading/
And ended up changing my script to this:
$Scriptblock = {
# Variables
$path = "G:\AdeptiaSuite\AdeptiaSuite-6.9\AdeptiaServer\ServerKernel\web\repository\equalit\PFRepository"
$age = (Get-Date).AddDays(-5) # Defines the 'x days old' (today's date minus x days)
# Get all the files in the folder and subfolders | foreach file
Get-ChildItem $path -Recurse -File | foreach{
# if creationtime is 'le' (less or equal) than $age
if ($_.CreationTime -le $age){
Write-Output "Older than $age days - $($_.name)"
Remove-Item $_.fullname -Force -Verbose # remove the item
}
else{
Write-Output "Less than $age days old - $($_.name)"
}
}
}
$MaxThreads = 5
$RunspacePool = [runspacefactory]::CreateRunspacePool(1, $MaxThreads)
$RunspacePool.Open()
$Jobs = @()
1..10 | Foreach-Object {
$PowerShell = [powershell]::Create()
$PowerShell.RunspacePool = $RunspacePool
$PowerShell.AddScript($ScriptBlock).AddArgument($_)
$Jobs += $PowerShell.BeginInvoke()
}
while ($Jobs.IsCompleted -contains $false) {
Start-Sleep 1
}
However, I'm not sure if this works correctly now. I don't get any errors, but the terminal doesn't show anything, so I'm not sure whether it works or just doesn't do anything.
I'd love any feedback on this!
The easiest answer is: get PowerShell v7.2.5 (look in the assets for PowerShell-7.2.5-win-x64.zip), download and extract it. It's a no-install PowerShell 7 which has easy multithreading and lets you change foreach { to foreach -parallel {. The executable is pwsh.exe.
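As a rough illustration of that PowerShell 7 route, the sketch below filters by age first and then deletes in parallel. The function name Remove-OldFileParallel and its parameters are invented for this example, and it assumes PowerShell 7 or later, since ForEach-Object -Parallel does not exist in Windows PowerShell 5.1.

```powershell
# Assumes PowerShell 7+; the function and parameter names are hypothetical.
function Remove-OldFileParallel {
    param(
        [Parameter(Mandatory)] [string]$Path,
        [Parameter(Mandatory)] [datetime]$Cutoff,
        [int]$ThrottleLimit = 5
    )
    Get-ChildItem -LiteralPath $Path -Recurse -File |
        Where-Object { $_.CreationTime -le $Cutoff } |
        ForEach-Object -Parallel {
            # Each file is deleted in one of up to $ThrottleLimit runspaces.
            Remove-Item -LiteralPath $_.FullName -Force
            $_.FullName   # emit the path so the caller can log it
        } -ThrottleLimit $ThrottleLimit
}
```

It would be called with something like Remove-OldFileParallel -Path $path -Cutoff (Get-Date).AddDays(-5); the enumeration itself still runs on a single thread.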
But, if it's severely overloading the server, running it several times will only make things worse, right? And I think the Get-ChildItem will be the slowest part, putting the most load on the server, and so doing the delete in parallel probably won't help.
I would first try changing the script to this shape:
$path = "G:\AdeptiaSuite\AdeptiaSuite-6.9\AdeptiaServer\ServerKernel\web\repository\equalit\PFRepository"
$age = (Get-Date).AddDays(-5)
$logOldFiles = [System.IO.StreamWriter]::new('c:\temp\log-oldfiles.txt')
$logNewFiles = [System.IO.StreamWriter]::new('c:\temp\log-newfiles.txt')
Get-ChildItem $path -Recurse -File | foreach {
if ($_.CreationTime -le $age){
$logOldFiles.WriteLine("Older than $age days - $($_.name)")
$_ # send file down pipeline to remove-item
}
else{
$logNewFiles.WriteLine("Less than $age days old - $($_.name)")
}
} | Remove-Item -Force
$logOldFiles.Close()
$logNewFiles.Close()
So it pipelines into remove-item and doesn't send hundreds of thousands of text lines to the console (also a slow thing to do).
If that doesn't help, I would switch to robocopy /L and maybe look at robocopy /L /MINAGE... to do the file listing, then process that to do the removal.
(I also removed the comments which just repeat the lines of code # removed comments which repeat what the code says.
The code tells you what the code says # read the code to see what the code does. Comments should tell you why the code does things, like who wrote the script and what business case was it solving, what is the PFRepository, why is there a 5 day cutoff, or whatever.)
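For completeness, if runspaces are still wanted on Windows PowerShell 5.1: the edited script in the question shows nothing because the output of each runspace is never collected with EndInvoke, and every one of the ten runspaces repeats the same full scan. A minimal sketch of one way to restructure it follows. It enumerates once, hands batches of paths to the pool, and collects the output; the function name, the -BatchSize parameter and the batching scheme are assumptions of this example, not part of the original script.

```powershell
# Hypothetical helper: scan once, delete in batches across a runspace pool.
function Remove-OldFileWithRunspaces {
    param(
        [Parameter(Mandatory)] [string]$Path,
        [Parameter(Mandatory)] [datetime]$Cutoff,
        [int]$MaxThreads = 5,
        [int]$BatchSize = 1000
    )
    # Enumerate a single time; only the deletes are spread across runspaces.
    $old = @(Get-ChildItem -LiteralPath $Path -Recurse -File |
        Where-Object { $_.CreationTime -le $Cutoff } |
        ForEach-Object { $_.FullName })

    $pool = [runspacefactory]::CreateRunspacePool(1, $MaxThreads)
    $pool.Open()
    $jobs = @()
    for ($i = 0; $i -lt $old.Count; $i += $BatchSize) {
        $last = [Math]::Min($i + $BatchSize, $old.Count) - 1
        $batch = $old[$i..$last]
        $ps = [powershell]::Create()
        $ps.RunspacePool = $pool
        [void]$ps.AddScript({
            param($files)
            foreach ($f in $files) {
                Remove-Item -LiteralPath $f -Force
                $f   # report the removed path
            }
        }).AddArgument($batch)
        $jobs += [pscustomobject]@{ Shell = $ps; Handle = $ps.BeginInvoke() }
    }
    foreach ($job in $jobs) {
        # EndInvoke waits for the batch and returns what it wrote to the pipeline.
        $job.Shell.EndInvoke($job.Handle)
        $job.Shell.Dispose()
    }
    $pool.Close()
}
```

Whether this is any faster than the single pipeline above depends on where the time goes; if the directory scan dominates, it probably will not help.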

Excluding lines, which are not containing one or multiple strings from text file

I have multiple server log files. In total they contain around 500.000 lines of log text. I only want to keep the lines that contain "Downloaded" and "Log". Lines I want to exclude are focussing on error logs and basic system operations like "client startup", "client restart" and so on.
An example of the lines we are looking for is this one:
[22:29:05]: Downloaded 39 /SYSTEM/SAP logs from System-4, customer (000;838) from 21:28:51,705 to 21:29:04,671
The lines that are to be kept should be complemented by the date string, which is part of the log-file name. ($date)
Further, as the received logs are rather unstructured, the filtered files should be transformed into one csv-file (columns: timestamp, log downloads, system directory, system type, customer, start time, end time, date [to be added to every line from file name]). The replace operation of turning spaces into commas is just a first try to bring some structure to the data. This file is supposed to be loaded into a python dashboard program.
At the moment it takes 2,5 mins to preprocess 3 Txt-Files, while the target is 5-10 seconds maximum, if even possible.
Thank you very much for your support, as I've been struggling with this since Monday last week. Maybe PowerShell is not the best way to go? I'm open to any help!
At the moment I'm running this powershell script:
$files = Get-ChildItem "C:\Users\AnonUser\RestLogs\*" -Include *.log
New-Item C:\Users\AnonUser\RestLogs\CleanedLogs.txt -ItemType file
foreach ($f in $files){
$date = $f.BaseName.Substring(22,8)
(Get-Content $f) | Where-Object { ($_ -match 'Downloaded' -and $_ -match 'SAP')} | ForEach-Object {$_ -replace " ", ","}{$_+ ','+ $date} | Add-Content CleanedLogs.txt
}
This is about the fastest I could manage. I didn't test using -split vs -replace or special .NET methods:
$files = Get-ChildItem "C:\Users\AnonUser\RestLogs\*" -Include *.log
New-Item C:\Users\AnonUser\RestLogs\CleanedLogs.txt -ItemType file
foreach ($f in $files) {
$date = $f.BaseName.Substring(22,8)
(((Get-Content $f) -match "Downloaded.*?SAP") -replace " ",",") -replace '$',",$date" | Add-Content CleanedLogs.txt
}
In general, speed is gained by removing loops and Where-Object "filtering."
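The .NET route mentioned above was not tested in the answer. As a sketch of what it might look like, the function below streams a file with [System.IO.File]::ReadLines and emits the converted lines; the name Convert-LogLine and its parameters are made up for this example, and any speed gain would need to be measured on the real logs.

```powershell
# Hypothetical helper: filter and convert one log file line by line.
function Convert-LogLine {
    param(
        [Parameter(Mandatory)] [string]$LogPath,   # use a full path; .NET ignores the PowerShell location
        [Parameter(Mandatory)] [string]$Date
    )
    # ReadLines streams the file instead of loading it all at once.
    foreach ($line in [System.IO.File]::ReadLines($LogPath)) {
        if ($line -match 'Downloaded.*?SAP') {
            # Same transformation as in the question: spaces to commas, date appended.
            ($line -replace ' ', ',') + ',' + $Date
        }
    }
}

# Usage sketch: collect everything first, then write the output file once.
# $out = foreach ($f in $files) { Convert-LogLine -LogPath $f.FullName -Date $f.BaseName.Substring(22,8) }
# $out | Set-Content C:\Users\AnonUser\RestLogs\CleanedLogs.txt
```

Writing once at the end avoids reopening the output file for every input file, which Add-Content inside the loop does.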

Move files that contain a string to a subfolder with the same name as the original (PowerShell)

I'm using PowerShell and I have been struggling with this issue for two days.
In the directory C:\dir_1 I have many subfolders (sub_1, sub_2, ..., sub_n). Each of them contains several text files. For each subfolder i=1,2,...,n, I want to move the text files that contain the string "My-String" to the directory C:\dir_2\sub_i.
For example, if the file X in the path C:\dir_1\sub_5 contains the string "My-String", I want to move it to the location C:\dir_2\sub_5. The destination folder already exists.
I tried several modifications of the following code, but it does not work:
Get-ChildItem "C:\dir_1" | Where-Object {$_.PSIsContainer -eq $True} | Foreach-Object {Get-ChildItem "C:\dir_1\$_" | Select-String -pattern "My-String" | group path | select name | %{Move-Item $_.name "C:\dir_2\$_"}}
So, basically, what I tried to do is: foreach subfolder in dir_1, take the files that contain the string and move them to the subfolder in dir_2 with the same name. I tried several small modifications of that code, but I cannot get around my mistakes. The main error is "move-item: The given path format is not supported"... any help?
I feel like I could do better but this is my first approach
$dir1 = "C:\temp\data\folder1"
$dir2 = "C:\temp\data\folder2"
$results = Get-ChildItem $dir1 -recurse | Select-String -Pattern "asdf"
$results | ForEach-Object{
$parentFolder = ($_.Path -split "\\")[-2]
Move-Item -Path $_.Path -Destination ([io.path]::combine($dir2,$parentFolder))
}
Select-String can take file paths for its pipeline input. We feed it all the files that are under $dir1 using -recurse to get all of its children in sub folders. $results would contain an array of match objects. One of the properties is the path of the matched file.
With all of those $results we then go through each and extract the parent folder from the path. Then we combine that folder with the path $dir2 in order to move the file to its destination.
There are several assumptions that we are making here. Some we could account for if need be. I will mention the one I know could be an issue first.
Your folders should not have any other subfolders under "sub_1, sub_2, ..., sub_n", else files will be moved to the wrong place. This can be addressed with a little more string manipulation. Using -Recurse to keep the code terse is what created this caveat.
Here is a one liner that does what you want too:
Get-ChildItem "C:\dir_1" | Where-Object {$_.PSIsContainer -eq $True} | ForEach-Object {$SubDirName = $_.Name;ForEach ($File in $(Get-ChildItem $_.FullName)){If ($File.Name -like "*My-String*"){Move-Item $File.FullName "C:\dir_2\$SubDirName"}}}
And if you'd like to see it broken out like Matt's answer:
$ParentDir = Get-ChildItem "C:\dir_1" | Where-Object {$_.PSIsContainer -eq $True}
ForEach ($SubDir in $ParentDir){
$SubDirName = $SubDir.Name
ForEach ($File in $(Get-ChildItem $SubDir.FullName)){
If ($File.Name -like "*My-String*"){
Move-Item $File.FullName "C:\dir_2\$SubDirName"
}
}
}
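Note that the -like test above matches the file name, whereas the question asks for files whose contents include the string. A minimal sketch of a content-based variant follows, using Select-String with -Quiet as the test. The function name Move-FileContaining and its parameters are invented for illustration, and it assumes, as the question states, that the destination subfolders already exist.

```powershell
# Hypothetical helper: move files whose contents include $Text,
# keeping the name of the subfolder they came from.
function Move-FileContaining {
    param(
        [Parameter(Mandatory)] [string]$SourceRoot,
        [Parameter(Mandatory)] [string]$DestRoot,
        [Parameter(Mandatory)] [string]$Text
    )
    Get-ChildItem -LiteralPath $SourceRoot -Directory | ForEach-Object {
        # Destination is the subfolder of the same name under $DestRoot.
        $target = Join-Path $DestRoot $_.Name
        Get-ChildItem -LiteralPath $_.FullName -File |
            Where-Object {
                # -SimpleMatch treats $Text literally; -Quiet yields $true on the first hit.
                Select-String -LiteralPath $_.FullName -Pattern $Text -SimpleMatch -Quiet
            } |
            Move-Item -Destination $target
    }
}
```

For the layout in the question this would be called as Move-FileContaining -SourceRoot C:\dir_1 -DestRoot C:\dir_2 -Text "My-String".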

Powershell: Searching Content of files and write results to text file

I'm new to PowerShell so I don't know where to start. I want a script that searches in all (pdf, Word, Excel, PowerPoint, ...) file content for a specific string combination.
I tried this script but it doesn't work:
function WordSearch ($sample, $staining, $sampleID, $patientID, $folder)
{
$objConnection = New-Object -com ADODB.Connection
$objRecordSet = New-Object -com ADODB.Recordset
$objConnection.Open("Provider=Search.CollatorDSO;Extended Properties='Application=Windows';")
$objRecordSet.Open("SELECT System.ItemPathDisplay FROM SYSTEMINDEX WHERE ((Contains(Contents,'$sample')) or (Contains(Contents,'$sampleID') and Contains(Contents,'$staining')) or (Contains(Contents,'$staining') and Contains(Contents,'$patientID'))) AND System.ItemPathDisplay LIKE '$folder\%'", $objConnection)
if ($objRecordSet.EOF -eq $false) {$objRecordSet.MoveFirst() }
while ($objRecordset.EOF -ne $true) {
$objRecordset.Fields.Item("System.ItemPathDisplay").Value
$objRecordset.MoveNext()
}
}
Can someone help me?
You should try this, but first make sure you're in the folder you want to start searching down. (If you're trying to search your whole computer, start in C:\, but I imagine the script will take a decent amount of time to run.)
$Paths = @()
$Paths = gci . *.* -rec | where { ! $_.PSIsContainer } |? {($_.Extension -eq ".doc") -or ($_.Extension -eq ".ppt") -or ($_.Extension -eq ".pdf") -or ($_.Extension -eq ".xls")} | resolve-path
This will retrieve all the file paths of those file types. If you have Microsoft Office 2007 or above you may want to add searches for ".xlsx" or ".docx" or ".pptx".
Then you can begin looking through those files for your "specific string combination":
$array = @()
foreach($path in $Paths)
{$array += Select-String -Path $Path -Pattern "Search String"}
This will give you all the lines and paths where that string exists in those files. The actual line output you get may be a little distorted, though, because Office documents and PDFs are stored in binary or compressed formats rather than plain text, so some matches may be missed entirely. Use $array | get-member -MemberType Property to find what items you can index to, and the Select-Object cmdlet to pull those items out.
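The question title also asks for the results to be written to a text file, which the steps above stop short of. A minimal sketch that combines the two steps and writes one path:line entry per match is shown below. The function name Search-FileText, its parameters and the output format are assumptions for illustration, and the same caveat applies: a plain-text search is not reliable for binary document formats.

```powershell
# Hypothetical helper: search files with the given extensions and
# write "path:lineNumber" for every match to a text file.
function Search-FileText {
    param(
        [Parameter(Mandatory)] [string]$Root,
        [Parameter(Mandatory)] [string]$Text,
        [Parameter(Mandatory)] [string]$OutFile,   # should live outside $Root
        [string[]]$Extensions = @('.doc', '.docx', '.xls', '.xlsx', '.ppt', '.pptx', '.pdf')
    )
    Get-ChildItem -LiteralPath $Root -Recurse -File |
        Where-Object { $Extensions -contains $_.Extension } |
        Select-String -Pattern $Text -SimpleMatch |
        ForEach-Object { '{0}:{1}' -f $_.Path, $_.LineNumber } |
        Set-Content -LiteralPath $OutFile
}
```

Passing -Extensions '.txt', '.log' is a quick way to check the function on files where a text search is meaningful.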

PowerShell FINDSTR equivalent?

What's the DOS FINDSTR equivalent for PowerShell? I need to search a bunch of log files for "ERROR".
Here's the quick answer
Get-ChildItem -Recurse -Include *.log | select-string ERROR
I found it here, which has a great in-depth answer!
For example, find all instances of "#include" in the c files in this directory and all sub-directories.
gci -r -i *.c | select-string "#include"
gci is an alias for get-childitem
Just to expand on Monroecheeseman's answer. gci is an alias for Get-ChildItem (which is the equivalent to dir or ls), the -r switch does a recursive search and -i means include.
Piping the result of that query to select-string has it read each file and look for lines matching a regular expression (the provided one in this case is ERROR, but it can be any .NET regular expression).
The result will be a collection of match objects, showing the matching line, the file, and other related information.
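To make those match objects a little more concrete, here is a small sketch that groups them by file and reports how many lines matched and on which line numbers. Path and LineNumber are properties of the objects Select-String returns; the function name Get-MatchSummary and the shape of its output are invented for this example.

```powershell
# Hypothetical helper: summarise Select-String matches per log file.
function Get-MatchSummary {
    param(
        [Parameter(Mandatory)] [string]$Root,
        [string]$Pattern = 'ERROR'
    )
    Get-ChildItem -Path $Root -Recurse -Include *.log |
        Select-String -Pattern $Pattern |
        Group-Object -Property Path |
        ForEach-Object {
            [pscustomobject]@{
                File  = $_.Name                 # the path the matches were grouped on
                Count = $_.Count                # number of matching lines
                Lines = $_.Group.LineNumber     # line numbers of the matches
            }
        }
}
```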
if ($entry.EntryType -eq "Error")
Being Object Oriented, you want to test the property in question with one of the standard comparison operators you can find here.
I have a PS script watching logs remotely for me right now - some simple modification should make it work for you.
edit: I suppose I should also add that there is a cmdlet built for this already if you don't want to unroll the way I did. Check out:
man Get-EventLog
Get-EventLog -newest 5 -logname System -EntryType Error
On a related note, here's a search that will list all the files containing a particular regex search or string. It could use some improvement so feel free to work on it. Also if someone wanted to encapsulate it in a function that would be welcome.
I'm new here so if this should go in its own topic just let me know. I figured I'd put it here since this looks mostly related.
# Search in Files Script
# ---- Set these before you begin ----
$FolderToSearch="C:\" # UNC paths are ok, but remember you're mass reading file contents over the network
$Search="Looking For This" # accepts regex format
$IncludeSubfolders=$True #BUG: if this is set $False then $FileIncludeFilter must be "*" or you will always get 0 results
$AllMatches=$False
$FileIncludeFilter="*".split(",") # Restricting to specific file types is faster than excluding everything else
$FileExcludeFilter="*.exe,*.dll,*.wav,*.mp3,*.gif,*.jpg,*.png,*.ghs,*.rar,*.iso,*.zip,*.vmdk,*.dat,*.pst,*.gho".split(",")
# ---- Initialize ----
if ($AllMatches -eq $True) {$SelectParam=@{AllMatches=$True}}
else {$SelectParam=@{List=$True}}
if ($IncludeSubfolders -eq $True) {$RecurseParam=@{Recurse=$True}}
else {$RecurseParam=@{Recurse=$False}}
# ---- Build File List ----
#$Files=Get-Content -Path="$env:userprofile\Desktop\FileList.txt" # For searching a manual list of files
Write-Host "Building file list..." -NoNewline
$Files=Get-ChildItem -Include $FileIncludeFilter -Exclude $FileExcludeFilter -Path $FolderToSearch -ErrorAction silentlycontinue @RecurseParam|Where-Object{-not $_.psIsContainer} # @RecurseParam is basically -Recurse=[$True|$False]
#$Files=$Files|Out-GridView -PassThru -Title 'Select the Files to Search' # Manually choose files to search, requires powershell 3.0
Write-Host "Done"
# ---- Begin Search ----
Write-Host "Searching Files..."
$Files|
Select-String $Search @SelectParam| # The @ instead of $ lets me pass the hashtable as a list of parameters. @SelectParam is either -List or -AllMatches
Tee-Object -Variable Results|
Select-Object Path
Write-Host "Search Complete"
#$Results|Group-Object path|ForEach-Object{$path=$_.name; $matches=$_.group|%{[string]::join("`t", $_.Matches)}; "$path`t$matches"} # Show results including the matches separated by tabs (useful if using regex search)
<# Other Stuff
#-- Saving and restoring results
$Results|Export-Csv "$env:appdata\SearchResults.txt" # $env:appdata can be replaced with any UNC path, this just seemed like a logical place to default to
$Results=Import-Csv "$env:appdata\SearchResults.txt"
#-- alternate search patterns
$Search="(\d[-|]{0,}){15,19}" #Rough CC Match
#>
This is not the best way to do this:
gci <the_directory_path> -filter *.csv | where { $_.OpenText().ReadToEnd().Contains("|") -eq $true }
This helped me find all csv files which had the | character in them.
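A possibly gentler variant of the same check is to let Select-String do the reading. The sketch below is an illustration only; the function name Get-FileContaining and its defaults are made up. It uses -SimpleMatch so that | is treated as a literal character rather than regex alternation, and -List so that each file is reported once, at its first match.

```powershell
# Hypothetical helper: list files in a directory that contain a literal string.
function Get-FileContaining {
    param(
        [Parameter(Mandatory)] [string]$Directory,
        [string]$Filter = '*.csv',
        [string]$Text = '|'
    )
    Get-ChildItem -LiteralPath $Directory -Filter $Filter -File |
        Select-String -Pattern $Text -SimpleMatch -List |
        Select-Object -ExpandProperty Path
}
```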
PowerShell has basically precluded the need for findstr.exe as the previous answers demonstrate. Any of these answers should work fine.
However, if you actually need to use findstr.exe (as was my case) here is a PowerShell wrapper for it:
Use the -Verbose option to output the findstr command line.
function Find-String
{
[CmdletBinding(DefaultParameterSetName='Path')]
param
(
[Parameter(Mandatory=$true, Position=0)]
[string]
$Pattern,
[Parameter(ParameterSetName='Path', Mandatory=$false, Position=1, ValueFromPipeline=$true)]
[string[]]
$Path,
[Parameter(ParameterSetName='LiteralPath', Mandatory=$true, ValueFromPipelineByPropertyName=$true)]
[Alias('PSPath')]
[string[]]
$LiteralPath,
[Parameter(Mandatory=$false)]
[switch]
$IgnoreCase,
[Parameter(Mandatory=$false)]
[switch]
$UseLiteral,
[Parameter(Mandatory=$false)]
[switch]
$Recurse,
[Parameter(Mandatory=$false)]
[switch]
$Force,
[Parameter(Mandatory=$false)]
[switch]
$AsCustomObject
)
begin
{
$value = $Pattern.Replace('\', '\\\\').Replace('"', '\"')
$findStrArgs = @(
'/N'
'/O'
@('/R', '/L')[[bool]$UseLiteral]
"/c:$value"
)
if ($IgnoreCase)
{
$findStrArgs += '/I'
}
function GetCmdLine([array]$argList)
{
($argList | foreach { @($_, "`"$_`"")[($_.Trim() -match '\s')] }) -join ' '
}
}
process
{
$PSBoundParameters[$PSCmdlet.ParameterSetName] | foreach {
try
{
$_ | Get-ChildItem -Recurse:$Recurse -Force:$Force -ErrorAction Stop | foreach {
try
{
$file = $_
$argList = $findStrArgs + $file.FullName
Write-Verbose "findstr.exe $(GetCmdLine $argList)"
findstr.exe $argList | foreach {
if (-not $AsCustomObject)
{
return "${file}:$_"
}
$split = $_.Split(':', 3)
[pscustomobject] @{
File = $file
Line = $split[0]
Column = $split[1]
Value = $split[2]
}
}
}
catch
{
Write-Error -ErrorRecord $_
}
}
}
catch
{
Write-Error -ErrorRecord $_
}
}
}
}
FYI:
If you update to PowerShell version 7 you can use grep...
I know egrep is in PowerShell on Azure CLI...
But Select-String (SS) is there!
An old article here: https://devblogs.microsoft.com/powershell/select-string-and-grep/
