Removing log files except for the most recent 5 using Excel VBA - excel

My Excel VBA worksheet creates logs in a directory. Currently, the logs keep building up as I do not remove them.
However, now I would like to only keep the most recent 5. My logs are created with filenames as below:
<worksheet_name>_YYYYMMDD_HH_MM_SS.log
My current method of doing this job is to throw these logs into an array, sort the array, and keep only the first 5.
My question is this: does anyone have a better method of keeping only the 5 most recent log files?

That sounds like a workable solution. Use the FileSystemObject library to gather all the log files, then loop through them.
One option: you could try deleting based on Date Created or Date Modified, i.e. if the file was created over x days ago, delete it.
Also, I don't know how important these files are, but you may want to just move them to a folder called Archive instead of outright deleting them.
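As a rough illustration of the age-based archive idea, here is a minimal sketch. The function name ArchiveOldLogs, the "Archive" subfolder name and the use of Date Modified are my assumptions, not anything from the question, and it will raise an error if a file of the same name already sits in the archive.

```vba
' Moves *.log files last modified more than maxAgeDays ago from logFolder
' into an "Archive" subfolder (hypothetical name) and returns how many were moved.
Public Function ArchiveOldLogs(ByVal logFolder As String, ByVal maxAgeDays As Long) As Long
    Dim fso As Object, f As Object
    Dim archiveFolder As String
    Dim toMove As New Collection
    Dim i As Long

    Set fso = CreateObject("Scripting.FileSystemObject")
    archiveFolder = fso.BuildPath(logFolder, "Archive")
    If Not fso.FolderExists(archiveFolder) Then fso.CreateFolder archiveFolder

    ' Collect the paths first and move afterwards, so the Files
    ' collection is not modified while it is being enumerated.
    For Each f In fso.GetFolder(logFolder).Files
        If LCase$(fso.GetExtensionName(f.Name)) = "log" Then
            If DateDiff("d", f.DateLastModified, Now) > maxAgeDays Then toMove.Add f.Path
        End If
    Next f

    For i = 1 To toMove.Count
        ' A destination ending in "\" is treated as a folder by MoveFile.
        fso.MoveFile toMove(i), archiveFolder & "\"
    Next i

    ArchiveOldLogs = toMove.Count
End Function
```

Calling ArchiveOldLogs "C:\Temp\Logs", 30 would archive logs untouched for more than 30 days. Note that this keeps files by age, not by count, so it answers a slightly different question than "keep the latest 5".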

One system we used a while ago was to keep e.g. 5 log files with a "gap". So you would create the first 5 log files:
Files: 1,2,3,4,5
Then, on the 6th day, your gap is at 6, so create 6 and delete 1
Files: ,2,3,4,5,6
The gap is now at 1. So for the next day, create 1, and delete 2
Files: 1, ,3,4,5,6
The gap is now at 2. So for the next day, create 2, and delete 3
Files: 1,2, ,4,5,6
etc etc
i.e. "Find the Gap" *, fill it with the new file, then delete the one after it.
Just an idea.
* (Yes, this is a bad joke referring to the London Underground.)
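The gap scheme can be sketched in a few lines. This assumes numbered slots named <baseName>_<slot>.log (a hypothetical naming scheme, since the original logs use timestamps) and keepCount + 1 slots in total, so that one slot is always empty. Unlike the description above, the sketch deletes the slot after the gap just before the caller writes the new file, which leaves the same end state.

```vba
' Slot whose file is removed once the gap is filled:
' the slot after the gap, wrapping back to slot 1.
Public Function SlotToDelete(ByVal gap As Long, ByVal slotCount As Long) As Long
    SlotToDelete = (gap Mod slotCount) + 1
End Function

' Hypothetical naming scheme: <folderPath>\<baseName>_<slot>.log
Private Function SlotPath(ByVal folderPath As String, ByVal baseName As String, _
                          ByVal slot As Long) As String
    SlotPath = folderPath & "\" & baseName & "_" & slot & ".log"
End Function

' Finds the gap, deletes the file in the slot after it, and returns
' the path that the caller should write the new log to.
Public Function NextLogPath(ByVal folderPath As String, ByVal baseName As String, _
                            ByVal keepCount As Long) As String
    Dim slotCount As Long, gap As Long, i As Long
    Dim victim As String

    slotCount = keepCount + 1
    For i = 1 To slotCount
        If Dir(SlotPath(folderPath, baseName, i)) = "" Then gap = i: Exit For
    Next i
    If gap = 0 Then gap = 1 ' no empty slot (unexpected): reuse slot 1

    victim = SlotPath(folderPath, baseName, SlotToDelete(gap, slotCount))
    If Dir(victim) <> "" Then Kill victim

    NextLogPath = SlotPath(folderPath, baseName, gap)
End Function
```

With keepCount = 5 there are 6 slots, so a gap at 6 deletes slot 1, a gap at 1 deletes slot 2, and so on, matching the walk-through above.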

Even though this is an old question, I needed this exact solution, so I figured I would add it here. This code assumes that the file name ends in something that is sortable by string comparison, for example files of the format SomeName_YYYY-MM-DD. Twenty-four hour time stamps can be incorporated as well. This process does not rename any files, so any incremental numeric scheme will need to be carefully managed by other code (e.g. if you want to add _1, _2, etc. to the file names).
Note that this solution leverages collections which serve this purpose much better than an array.
Public Sub CleanBackups(filePathAndBaseName As String, fileExtension As String, maxCopiesToKeep As Integer)
    '
    ' Calling Example
    ' CleanBackups "C:\Temp\MyLog", ".txt", 5
    '
    ' The above example would keep only the 5 versions of the file pattern "C:\Temp\MyLog*.txt"
    ' that are "LARGEST" in terms of a string comparison.
    ' So if MyLog_1.txt thru MyLog_9.txt exist, it will delete MyLog_1.txt - MyLog_4.txt
    ' and leave MyLog_5.txt - MyLog_9.txt
    ' Highly recommend using pattern MyLog_{YYYY-MM-DD_HhNn}.txt
    Dim pathOnly As String
    Dim foundFileName As String
    Dim oldestFileIndex As Integer
    Dim iLoop As Integer
    Dim fileNameCollection As New Collection
    ' pathOnly keeps its trailing backslash, e.g. "C:\Temp\"
    pathOnly = Left(filePathAndBaseName, InStrRev(filePathAndBaseName, "\"))
    foundFileName = Dir(filePathAndBaseName & "*" & fileExtension, vbNormal)
    Do While foundFileName <> ""
        fileNameCollection.Add foundFileName
        foundFileName = Dir
    Loop
    Do While fileNameCollection.Count > maxCopiesToKeep
        ' Find oldest file, using only the name which assumes it ends with YYYY-MM-DD and optionally a 24-hour time stamp
        oldestFileIndex = 1
        For iLoop = 2 To fileNameCollection.Count
            If StrComp(fileNameCollection.Item(iLoop), fileNameCollection.Item(oldestFileIndex), vbTextCompare) < 0 Then
                oldestFileIndex = iLoop
            End If
        Next iLoop
        Kill pathOnly & fileNameCollection.Item(oldestFileIndex)
        fileNameCollection.Remove oldestFileIndex
    Loop
End Sub

Related

Create a list of files in current file for use later in code

I have a file that is referencing 3 files outside of it. The problem is the outside files will have their version number changed when there is a change.
The code is currently written to pull from C:documents/product_eval
01-Main_v1 is the main file that pulls
02-Address_v1
03_Products_v1
However, we now have a v2 for each of these files. I need a way to have file_03 always pull the 3rd file in the list of files, no matter what the version.
I feel like listing the code I tried will just confuse people, as it is so far off.
But here is the code it will be feeding into:
Sub LaunchBatching3_00(sourcePath As String, exportFolder As String, activeName As String)
Dim Adrs_PATH As String
Dim Prod_PATH As String
Adrs_PATH = Application.ActiveWorkbook.Path & "\02-Address_v1.xlms"
Prod_PATH = Application.ActiveWorkbook.Path & "\03-Products_v1.xlms"
Dim fso As Object, folder As Object, f As Object
Dim ppath As String, pos As Long, maxVer As Long
ppath = Application.ActiveWorkbook.Path 'assumption: the versioned files sit next to this workbook
Set fso = CreateObject("Scripting.FileSystemObject")
Set folder = fso.GetFolder(ppath) 'get the folder object from the folder path (variable ppath)
maxVer = 0
For Each f In folder.Files 'first pass: find the largest version number in the file names
    pos = InStr(1, f.Name, "_v")
    If pos > 0 Then
        'reads the single digit after "_v"; versions above 9 would need a longer parse
        If CLng(Val(Mid(f.Name, pos + 2, 1))) > maxVer Then maxVer = CLng(Val(Mid(f.Name, pos + 2, 1)))
    End If
Next f
For Each f In folder.Files 'second pass: act on the files that carry the largest version
    pos = InStr(1, f.Name, "_v")
    If pos > 0 Then
        If CLng(Val(Mid(f.Name, pos + 2, 1))) = maxVer Then
            '... fill in your actions, for example:
            Workbooks.Open Filename:=f.Path
        End If
    End If
Next f

Search txt and csv files for string using Excel VBA without opening files

I have a text file that is automatically generated by a machine. The machine writes the txt file in "chunks" (sorry, I don't know the exact terminology). I need to pull data from this txt file, but I need the file to be finished before pulling data from it. I found a way to verify that the machine has finished writing to the file. It is not as elegant as I had hoped, but it seems to do the trick: Excel VBA opens a command prompt, and the command prompt uses the Find command to look for the string "End of Report". This is one of the last lines of the txt file, so it is pretty safe to assume the file is finished once it is found. The code runs in a loop, every 10 seconds, until it finds the string or reaches 1000 tries.
The issue is that "result" returns some other characters besides just "End of Report". This is further complicated by the fact that I am also attempting to run this on some csv files, where "result" returns additional characters too, but different ones from those returned for the txt files. For example, the length of "result" comes back as 43 on one file and 48 on another. I think it is counting the file path + "End of Report" + a few more characters?
Anyway, I don't really need "result". I only need a True/False for whether Find found "End of Report" or not. How can I accomplish this? Is there a different, better way to do this? I am not familiar with command prompt programming.
Note: It is important that I search these files without opening them.
Sub test()
    Dim SearchStr As String
    Dim cmdLine As Object
    Dim result As String
    Dim FilePath As String
    FilePath = "D:\test2.txt"
    SearchStr = """End of Report"""
    Set cmdLine = CreateObject("WScript.Shell")
    result = cmdLine.Exec("%comspec% /C Find " & SearchStr & " " & Chr(34) & FilePath & Chr(34)).STDOut.ReadAll
    Debug.Print result
End Sub
I am not really an expert in command line, but what I would do is export the result of the FIND command to a file, like here
Then I would check in your VBA code how many rows are in the file (either clean the file before, or check the number of rows before the export is done).
If the number of rows meets the criteria (probably 2 or more rows instead of 1), then you can set the flag to True.
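Another option, if only a True/False is needed, is to ignore the output altogether and look at the exit code of find, which is documented as 0 when the string was found, 1 when it was not, and 2 on an error such as a missing file. The extra characters in "result" are most likely the header line with the file name that find prints before the matching lines. Below is a minimal sketch; the function name FileContainsText is mine, and it assumes the search text itself contains no double quotes.

```vba
' Returns True when searchText occurs in the file, judged only by the
' exit code of find (0 = found, 1 = not found, 2 = error).
Public Function FileContainsText(ByVal filePath As String, ByVal searchText As String) As Boolean
    Dim sh As Object
    Dim cmd As String
    Dim exitCode As Long

    Set sh = CreateObject("WScript.Shell")
    ' Output is redirected to NUL because only the exit code matters here.
    cmd = "%comspec% /C find """ & searchText & """ """ & filePath & """ > NUL"
    ' Window style 0 hides the console window; True makes Run wait for the
    ' command to finish and return its exit code.
    exitCode = sh.Run(cmd, 0, True)

    FileContainsText = (exitCode = 0)
End Function
```

Usage would be something like: If FileContainsText("D:\test2.txt", "End of Report") Then ... inside the existing retry loop.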

How to know numerically the last saved file name

I am creating and saving .ini files in Excel. The files need to follow a specific naming convention that increments by 1 each time a file is created. Not all the files will be created in Excel, some will be done in a separate program. Is there a way to read the saved files in their folder to know which number is next?
For example there are files named exampleFile1.ini through exampleFile63.ini. From Excel using VBA or other means can I see that exampleFile63.ini was the last file and name the next exampleFile64.ini?
Thank you. I'm very new, if that's not obvious.
This function will return the next available .INI file name:
Private Function GetNextFile() As String
    Dim i As Long
    Dim fileName As String
    For i = 1 To 1000
        ' Update Path and File Name prefix below to where your .INI files will be stored.
        fileName = "d:\path\exampleFile" & i & ".ini"
        If Dir(fileName) = "" Then
            GetNextFile = fileName
            Exit Function
        End If
    Next
End Function
Call it like this:
Dim NextIniFile As String
NextIniFile = GetNextFile()
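Note that the function above returns the first number that is not in use, so if exampleFile7.ini had been deleted it would hand back 7 rather than 64. If the next name should always be one past the highest existing number, a scan along these lines might do. This is only a sketch; the function name and parameters are made up for illustration.

```vba
' Scans folderPath for <prefix><number><extension> and returns the full path
' for the number one above the highest found (or 1 if none exist).
Public Function NextNumberedFile(ByVal folderPath As String, ByVal prefix As String, _
                                 ByVal extension As String) As String
    Dim found As String, middle As String
    Dim midLen As Long, n As Long, maxN As Long

    If Right$(folderPath, 1) <> "\" Then folderPath = folderPath & "\"

    found = Dir(folderPath & prefix & "*" & extension)
    Do While found <> ""
        ' Text between the prefix and the extension, e.g. "63" in exampleFile63.ini
        midLen = Len(found) - Len(prefix) - Len(extension)
        If midLen > 0 And midLen <= 9 Then
            middle = Mid$(found, Len(prefix) + 1, midLen)
            ' Only accept names where that text is made up of digits alone
            If Not middle Like "*[!0-9]*" Then
                n = CLng(middle)
                If n > maxN Then maxN = n
            End If
        End If
        found = Dir
    Loop

    NextNumberedFile = folderPath & prefix & (maxN + 1) & extension
End Function
```

For the example in the question, NextNumberedFile("d:\path", "exampleFile", ".ini") would return d:\path\exampleFile64.ini when exampleFile63.ini is the highest existing file.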

VBA Create folder and move file based on folder names in table

I have around 10,000 files all in one folder called "Z:\ContactLog\". The files are named "Contact_1.pdf", "Contact_2.pdf" etc. I also have an Access table with the file names listed in the first column and an associated group name in the second column. The group names are "Group1", "Group2" etc.
I need help to write the VBA code to create a sub-folder for each group using the group name as the folder name, (e.g. "Z:\ContactLog\Group1\") and then move the files into the folders according to the group names listed against the file names in the table.
My research so far has found code for moving files based on the file name, but not based on a table field entry. Any help to get started with writing the VBA would be greatly appreciated. I am using Access 2010, but will do this in Excel if needed. Thank you.
I hope it isn't considered bad form to answer your own question, but I have just thought of and tested an answer using a completely different approach.
To achieve the goal I did the following:
Export the access table to Excel, so column A has the file name and column B has the name of the desired destination folder.
In column C use the formula...
=CONCATENATE("xcopy Z:\ContactLog\",A1,".pdf Z:\ContactLog\",B1,"\ /C")
Copy the formula downwards for all 10,000 entries
Copy and paste column C into a batch file
Run the batch file
Manually delete the source files
I have tried this on a small sample of the entries and it works perfectly. Xcopy will create the folders that don't exist. The switch "/C" will allow the batch to continue if there is an error (e.g. if the file does not exist).
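For anyone who would rather skip the batch file, the same spreadsheet layout (file name without ".pdf" in column A, group name in column B) could also be processed directly in Excel VBA. This is a sketch only: the function name is hypothetical, it moves the files rather than copying them, and Name ... As will raise an error if the destination file already exists.

```vba
' Moves each file named in column A into the subfolder named in column B
' and returns the number of files moved.
Public Function MoveFilesByGroup(ByVal ws As Worksheet, ByVal baseFolder As String) As Long
    Dim r As Long, lastRow As Long, moved As Long
    Dim baseName As String, groupName As String
    Dim src As String, destFolder As String

    If Right$(baseFolder, 1) <> "\" Then baseFolder = baseFolder & "\"
    lastRow = ws.Cells(ws.Rows.Count, 1).End(xlUp).Row

    For r = 1 To lastRow
        baseName = Trim$(CStr(ws.Cells(r, 1).Value))
        groupName = Trim$(CStr(ws.Cells(r, 2).Value))
        ' Skip incomplete rows and missing files, much as xcopy /C carries on after an error.
        If Len(baseName) > 0 And Len(groupName) > 0 Then
            src = baseFolder & baseName & ".pdf"
            destFolder = baseFolder & groupName & "\"
            If Dir(src) <> "" Then
                If Dir(destFolder, vbDirectory) = "" Then MkDir destFolder
                Name src As destFolder & baseName & ".pdf"
                moved = moved + 1
            End If
        End If
    Next r

    MoveFilesByGroup = moved
End Function
```

Something like MoveFilesByGroup ActiveSheet, "Z:\ContactLog\" would then replace steps 2 to 6 above. As with the batch approach, try it on a copy of a small sample first.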
Looks like you're set, but I thought I would add an Access answer for the heck of it.
First back up the entire folder in question so you can revert in case something goes wrong. Next add a column in the file information table called FILE_MOVED so you can use it as a flag.
I've done this sort of thing a lot and have run into many issues like files moved, renamed, locked, etc. (If there's an error in the run, you'll end up with more errors on subsequent runs trying to move files that have already been moved.) Be sure to update the FILE_MOVED column to 0 or null if you have to revert to the original folder. So here's some code that should accomplish what you wanted:
Declare this in a Module:
Declare Function MoveFile Lib "kernel32" Alias "MoveFileA" (ByVal lpExistingFileName As String, ByVal lpNewFileName As String) As Long
Paste this into a Module:
Function OrganizeFiles() As Long
    On Error GoTo ErrHandler
    Dim rst As New ADODB.Recordset
    Dim strFolderFrom As String, strFolderTo As String
    Dim strPathFrom As String, strPathTo As String
    rst.CursorLocation = adUseClient
    rst.CursorType = adOpenForwardOnly
    rst.LockType = adLockOptimistic
    rst.Open "SELECT * FROM [YourTableName] WHERE nz(FILE_MOVED,0) = 0 ", CurrentProject.Connection
    strFolderFrom = "Z:\ContactLog\" 'the main folder will always be the same
    Do Until rst.EOF
        'destination folder
        strFolderTo = strFolderFrom & rst.Fields("[YourGroupCol]") & "\" 'destination folder can change
        'make sure the destination folder is there; if not, then create it
        If Dir(strFolderTo, vbDirectory) = "" Then MkDir strFolderTo
        'get the source file path
        strPathFrom = strFolderFrom & rst.Fields("[YourFileNameCol]")
        'get the destination file path
        strPathTo = strFolderTo & rst.Fields("[YourFileNameCol]")
        'MoveFile returns nonzero on success, so only flag rows whose file was actually moved
        If MoveFile(strPathFrom, strPathTo) <> 0 Then rst.Fields("FILE_MOVED") = 1
        rst.MoveNext
    Loop
    rst.Close
ErrHandler:
    'reached on both the normal path and after an error, so the recordset is always released
    Set rst = Nothing
    If Err.Number <> 0 Then
        MsgBox Err.Description, vbExclamation, "Error " & Err.Number
    End If
End Function
This task and my code are pretty basic, but this kind of thing can become complicated when working with multiple source and destination folders, or when changing file names in addition to moving them.

Extract tables from pdf (to excel), pref. w/ vba

I am trying to extract tables from pdf files with vba and export them to excel. If everything works out the way it should, it should all run automatically. The problem is that the tables are not standardized.
This is what I have so far.
VBA (Excel) runs XPDF, and converts all .pdf files found in current folder to a text file.
VBA (Excel) reads through each text file line by line.
And the code:
With New Scripting.FileSystemObject
    With .OpenTextFile(strFileName, 1, False, 0)
        If Not .AtEndOfStream Then .SkipLine
        Do Until .AtEndOfStream
            'do something with the next line, e.g. read it with .ReadLine
        Loop
    End With
End With
This all works great. But now I am getting to the issue of extracting the tables from the text files.
What I am trying to do is VBA to find a string e.g. "Year's Income", and then output the data, after it, into columns. (Until the table ends.)
The first part is not very difficult (find a certain string), but how would I go about the second part? The text file will look like this Pastebin. The problem is that the text is not standardized. For example, some tables have three year columns (2010 2011 2012) and some only two (or one), some tables have more spaces between the columns, and some do not include certain rows (such as Capital Asset, net).
I was thinking about doing something like this but not sure how to go about it in VBA.
Find user defined string. eg. "Table 1: Years' Return."
a. Next line find years; if there are two we will need three columns in output (titles +, 2x year), if there are three we will need four (titles +, 3x year).. etc
b. Create title column + column for each year.
When reaching end of line, go to next line
a. Read text -> output to column 1.
b. Recognize spaces (Are spaces > 3?) as start of column 2. Read numbers -> output to column 2.
c. (if column = 3) Recognize spaces as start of column 3. Read numbers -> output to column 3.
d. (if column = 4) Recognize spaces as start of column 4. Read numbers -> output to column 4.
Each line, loop 4.
Next line does not include any numbers - End table. (Probably the easiest is just a user defined number: after 15 characters no number? End table.)
I based my first version on Pdf to excel, but reading online people do not recommend OpenFile but rather FileSystemObject (even though it seems to be a lot slower).
Any pointers to get me started, mainly on step 2?
You have a number of ways to dissect a text file, and depending on how complex it is, you might lean one way or another. I started this and it got a bit out of hand... enjoy.
Based on the sample you've provided and the additional comments, I noted the following. Some of these may work well for simple files but can get unwieldy with bigger, more complex files. Furthermore, there may be slightly more efficient methods or tricks than what I have used here, but this will definitely get you going and achieve the desired outcome. Hopefully this makes sense in conjunction with the code provided:
You can use booleans to help you determine what 'section' of the text file you are in. I.e. use InStr on the current line to determine you are in a Table by looking for the text 'Table', and then once you know you are in the 'Table' section of the file, start looking for the 'Assets' section etc.
You can use a few methods to determine the number of years (or columns) you have. The Split function along with a loop will do the job.
If your files always have constant formatting, even only in certain parts, you can take advantage of this. For example, if you know the values on a line will always have a dollar sign in front of them, then you know this will define the column widths and you can use this on subsequent lines of text.
The following code will extract the Assets details from the text file; you can modify it to extract other sections. It should handle multiple rows. Hopefully I've commented it sufficiently. Have a look and I'll edit if needed to help out further.
Sub ReadInTextFile()
    Dim fs As Scripting.FileSystemObject, fsFile As Scripting.TextStream
    Dim sFileName As String, sLine As String, vYears As Variant
    Dim iNoColumns As Integer, ii As Integer, iCount As Integer
    Dim bIsTable As Boolean, bIsAssets As Boolean, bIsLiabilities As Boolean, bIsNetAssets As Boolean
    Set fs = CreateObject("Scripting.FileSystemObject")
    sFileName = "G:\Sample.txt"
    Set fsFile = fs.OpenTextFile(sFileName, 1, False)
    'Loop through the file as you've already done
    Do While Not fsFile.AtEndOfStream
        'Determine flag positions in text file
        sLine = fsFile.ReadLine
        Debug.Print VBA.Len(sLine)
        'Always skip empty lines (including single spaces)
        If VBA.Len(sLine) > 1 Then
            'We've found a new table so we can reset the booleans
            If VBA.InStr(1, sLine, "Table") > 0 Then
                bIsTable = True
                bIsAssets = False
                bIsNetAssets = False
                bIsLiabilities = False
                iNoColumns = 0
            End If
            'Perhaps you want to also have some sort of way to designate that a table has finished. Like so
            If VBA.InStr(1, sLine, "Some text that designates the end of the table") > 0 Then
                bIsTable = False
            End If
            'If we're in the table section then we want to read in the data
            If bIsTable Then
                'Check for your different sections. You could make this constant if your text file allowed it.
                If VBA.InStr(1, sLine, "Assets") > 0 And VBA.InStr(1, sLine, "Net") = 0 Then bIsAssets = True: bIsLiabilities = False: bIsNetAssets = False
                If VBA.InStr(1, sLine, "Liabilities") > 0 Then bIsAssets = False: bIsLiabilities = True: bIsNetAssets = False
                If VBA.InStr(1, sLine, "Net Assets") > 0 Then bIsAssets = False: bIsLiabilities = False: bIsNetAssets = True
                'If we haven't triggered any of these booleans then we're at the column headings
                If Not bIsAssets And Not bIsLiabilities And Not bIsNetAssets And VBA.InStr(1, sLine, "Table") = 0 Then
                    'Trim the current line to remove leading and trailing spaces then use the split function to determine the number of years
                    vYears = VBA.Split(VBA.Trim$(sLine), " ")
                    For ii = LBound(vYears) To UBound(vYears)
                        If VBA.Len(vYears(ii)) > 0 Then iNoColumns = iNoColumns + 1
                    Next ii
                    'Now we can redefine some variables to hold the information (you'll want to redim after you've collected the info)
                    ReDim sAssets(1 To iNoColumns + 1, 1 To 100) As String
                    ReDim iColumns(1 To iNoColumns) As Integer
                Else
                    If bIsAssets Then
                        'Skip the heading line
                        If Not VBA.Trim$(sLine) = "Assets" Then
                            'Increment the counter
                            iCount = iCount + 1
                            'If iCount reaches its limit you'll have to ReDim Preserve your sAssets array (I'll leave this to you)
                            If iCount > 99 Then
                                'You'll find other posts on stackoverflow to do this
                            End If
                            'This will happen on the first row, it'll happen every time you
                            'hit a $ sign but you could code to only do so the first time
                            If VBA.InStr(1, sLine, "$") > 0 Then
                                iColumns(1) = VBA.InStr(1, sLine, "$")
                                For ii = 2 To iNoColumns
                                    'We need to start at the next character across
                                    iColumns(ii) = VBA.InStr(iColumns(ii - 1) + 1, sLine, "$")
                                Next ii
                            End If
                            'The first part (the name) is simply up to the $ sign (trimmed of spaces)
                            sAssets(1, iCount) = VBA.Trim$(VBA.Mid$(sLine, 1, iColumns(1) - 1))
                            For ii = 2 To iNoColumns
                                'Then we can loop around for the rest: the value between the previous $ sign and this one
                                sAssets(ii, iCount) = VBA.Trim$(VBA.Mid$(sLine, iColumns(ii - 1) + 1, iColumns(ii) - iColumns(ii - 1) - 1))
                            Next ii
                            'Now do the last column
                            If VBA.Len(sLine) > iColumns(iNoColumns) Then
                                sAssets(iNoColumns + 1, iCount) = VBA.Trim$(VBA.Right$(sLine, VBA.Len(sLine) - iColumns(iNoColumns)))
                            End If
                        Else
                            'Reset the counter
                            iCount = 0
                        End If
                    End If
                End If
            End If
        End If
    Loop
    'Clean up
    fsFile.Close
    Set fsFile = Nothing
    Set fs = Nothing
End Sub
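For files without dollar signs to anchor on, the question's own idea of treating a run of several spaces as a column break can be sketched on its own. This is a rough helper, not part of the answer above; the name SplitOnGaps and the minGap parameter are made up for illustration, and the threshold would need tuning against the real files.

```vba
' Splits a line into fields wherever at least minGap consecutive spaces occur.
' Shorter runs of spaces (e.g. between words in a row title) stay inside the field.
Public Function SplitOnGaps(ByVal textLine As String, ByVal minGap As Long) As Collection
    Dim fields As New Collection
    Dim i As Long, runStart As Long, fieldStart As Long

    textLine = Trim$(textLine)
    fieldStart = 1
    i = 1
    Do While i <= Len(textLine)
        If Mid$(textLine, i, 1) = " " Then
            ' Measure the whole run of spaces starting here
            runStart = i
            Do While Mid$(textLine, i, 1) = " "
                i = i + 1
            Loop
            If i - runStart >= minGap Then
                fields.Add Mid$(textLine, fieldStart, runStart - fieldStart)
                fieldStart = i
            End If
        Else
            i = i + 1
        End If
    Loop
    ' Whatever is left after the last wide gap is the final field
    If fieldStart <= Len(textLine) Then fields.Add Mid$(textLine, fieldStart)

    Set SplitOnGaps = fields
End Function
```

Each item of the returned collection could then be written to its own column, with item 1 as the row title. A row with a missing value would simply produce fewer items, so the caller still has to decide which column the remaining values belong to.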
I cannot examine the sample data as the PasteBin has been removed. Based on what I can glean from the problem description, it seems to me that using Regular Expressions would make parsing the data much easier.
Add a reference to the Scripting Runtime scrrun.dll for the FileSystemObject.
Add a reference to the Microsoft VBScript Regular Expressions 5.5. library for the RegExp object.
Instantiate a RegEx object with
Dim objRE As New RegExp
Set the Pattern property to "(\b\d{4}\b){1,3}"
The above pattern should match on lines containing strings like:
2010
2010 2011
2010 2011 2012
The number of spaces between the year strings is irrelevant, as long as there is at least one (since we're not expecting to encounter strings like 201020112012 for example)
Set the Global property to True
The captured groups will be found in the individual Match objects from the MatchCollection returned by the Execute method of the RegEx object objRE. So declare the appropriate objects:
Dim objMatches As MatchCollection
Dim objMatch As Match
Dim intMatchCount As Long 'tells you how many year strings were found, if any
Dim i As Long
Assuming you've set up a FileSystemObject object and are scanning the text file, reading each line into a variable strLine
First test to see if the current line contains the pattern sought:
If objRE.Test(strLine) Then
    Set objMatches = objRE.Execute(strLine)
    intMatchCount = objMatches.Count
    For i = 0 To intMatchCount - 1
        'processing code such as writing the years as column headings in Excel
        Set objMatch = objMatches(i)
        'e.g. ActiveCell.Value = objMatch.Value
        'subsequent lines beneath the line containing the year strings should
        'have the amounts, which may be captured in a similar fashion using an
        'additional RegExp object and a Pattern such as "(\b\d+\b){1,3}" for
        'whole numbers or "(\b\d+\.\d+\b){1,3}" for floats. For currency, you
        'can use "(\$\d+\.\d{2}\b){1,3}"
    Next i
Else
    'skip over this line
End If
This is just a rough outline of how I would approach this challenge. I hope there is something in this code outline that will be of help to you.
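To make the outline concrete, here is a small self-contained sketch of the year extraction. It uses late binding via CreateObject("VBScript.RegExp") so no reference is needed, and the simpler pattern \b\d{4}\b, which gives the same matches when Global is True. The function name ExtractYears is only for illustration. Note that any standalone four-digit number, such as an amount of 1200, would also match, so it is best applied only to the heading line.

```vba
' Returns the four-digit strings found in a line, in order of appearance.
Public Function ExtractYears(ByVal textLine As String) As Collection
    Dim re As Object, matches As Object, m As Object
    Dim years As New Collection

    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "\b\d{4}\b"   ' a run of exactly four digits between word boundaries
    re.Global = True           ' find every match on the line, not just the first

    Set matches = re.Execute(textLine)
    For Each m In matches
        years.Add m.Value
    Next m

    Set ExtractYears = years
End Function
```

The Count of the returned collection gives the number of year columns, so the output would need Count + 1 columns including the title column.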
Another way to do this that I have had some success with is to use VBA to convert to a .doc or .docx file and then search for and pull tables from the Word file. They can be easily extracted into Excel sheets. The conversion seems to handle tables nicely. Note, however, that it works on a page by page basis, so tables extending over a page end up as separate tables in the Word doc.

Resources