Efficiently reading txt file (or other data sources) using VBA - excel

I have a (large) column of data stored in a txt file.
I need to copy the column vector in an Excel sheet. Here is my code:
Dim t As Single
t = Timer
Dim inputFile As String
Dim ReadData As Variant
inputFile = "C:\Temp\vector.txt"
Dim rowNum As Long
rowNum = 1
Dim dest As Range
Set dest = Sheet1.Cells(rowNum, 1)
Open inputFile For Input As #1
Do Until EOF(1)
Input #1, ReadData
If Not IsEmpty(ReadData) Then
dest.Cells = ReadData
rowNum = rowNum + 1
Set dest = Sheet1.Cells(rowNum, 1)
End If
Loop
Close #1 'close the opened file
Sheet1.[C2].Value = Timer - t
I wonder whether there is a more efficient/fast way to accomplish the same task.
To this aim, does it make sense to convert the txt file into another format (say .csv, .xlsx or any other file type) instead of reading lines from the .txt file?
Any help is highly appreciated.

Following this link I have tried different solutions.
The following code provides a much faster solution to the problem (importing in Excel a column of 500,000 random numbers) as compared to the code proposed in the initial question.
Dim t As Single
t = Timer
Dim inputFile As String
inputFile = "C:\Temp\vector.txt"
Dim wbI As Workbook, wbO As Workbook
Dim wsI As Worksheet
Set wbI = ThisWorkbook
Set wsI = wbI.Sheets("Sheet1")
Set wbO = Workbooks.Open(inputFile)
wbO.Sheets(1).Columns(1).Copy wsI.Columns(1)
wbO.Close SaveChanges:=False
Sheet1.[C2].Value = Timer - t
In particular, over 20 trials the average run time was 1.50 seconds, while with the first code it was 10.2 seconds.
Hope this helps!

If you want to use the first approach (which I'd recommend as it doesn't involve opening the files through Excel) then you can reduce the run time by batching the prints.
Also you may want to consider using scripting.filesystemobject rather than the older IO interface.
See example below (Note this code hasn't been tested)
Const path As String = ""
Const max_print_rows As Long = 10000
Dim print_start_cell As Range
Dim print_arr() As String
Dim i As Long, j As Long
' Early binding: requires a reference to Microsoft Scripting Runtime
Dim fso As Scripting.FileSystemObject
Dim in_file As Scripting.TextStream
Set print_start_cell = ThisWorkbook.Names("Start_Cell").RefersToRange
Set fso = New Scripting.FileSystemObject
Set in_file = fso.OpenTextFile(path, ForReading)
' The buffer must be two-dimensional (rows x 1 column) to be written to a column
ReDim print_arr(1 To max_print_rows, 1 To 1)
Do Until in_file.AtEndOfStream
    i = i + 1
    print_arr(i, 1) = in_file.ReadLine
    If i = max_print_rows Then
        ' Buffer is full: write it to the sheet in one assignment
        print_start_cell.Offset(j).Resize(max_print_rows).Value = print_arr
        j = j + i
        ReDim print_arr(1 To max_print_rows, 1 To 1)
        i = 0
    End If
Loop
' Write whatever is left in the partially filled buffer
If i > 0 Then print_start_cell.Offset(j).Resize(i).Value = print_arr
Erase print_arr
in_file.Close
Set in_file = Nothing
Set print_start_cell = Nothing
Set fso = Nothing
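Taking the batching idea to its limit, the whole file can be read in one call and written to the sheet in one assignment. The following is only a sketch under assumptions: the file holds one value per line, fits in memory, and has no more lines than a worksheet has rows. ImportColumnAtOnce and its parameters are names made up for this illustration, and the code has not been timed against the approaches above.

```vba
' Sketch: read the entire text file at once, then write it with a single assignment.
' ImportColumnAtOnce, filePath and dest are hypothetical names for illustration.
Sub ImportColumnAtOnce(ByVal filePath As String, ByVal dest As Range)
    Dim fso As Object, ts As Object
    Dim content As String
    Dim lines() As String
    Dim out() As Variant
    Dim n As Long, i As Long

    Set fso = CreateObject("Scripting.FileSystemObject")
    Set ts = fso.OpenTextFile(filePath, 1)      ' 1 = ForReading
    If ts.AtEndOfStream Then                    ' empty file: ReadAll would raise an error
        ts.Close
        Exit Sub
    End If
    content = ts.ReadAll
    ts.Close

    ' Normalise line endings so CRLF and LF files split the same way
    content = Replace(content, vbCrLf, vbLf)
    lines = Split(content, vbLf)                ' zero-based array

    n = UBound(lines) + 1
    If Len(lines(n - 1)) = 0 Then n = n - 1     ' drop the empty item after a trailing newline
    If n = 0 Then Exit Sub

    ' Range.Value expects a two-dimensional array: n rows by 1 column
    ReDim out(1 To n, 1 To 1)
    For i = 1 To n
        out(i, 1) = lines(i - 1)
    Next i

    dest.Resize(n, 1).Value = out               ' one write to the worksheet
End Sub
```

It could be called as ImportColumnAtOnce "C:\Temp\vector.txt", Sheet1.Range("A1"). The trade-off is memory: the file content, the split array and the output array all exist at the same time.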

Related

FreeFile Multiple CSVs Error 67 Too many files

Background
I need to open multiple CSVs in multiple folders, and I use FreeFile with Open ... For Input to do so. Let's say there are over 1,000 CSVs (Power Query is of no use here, since I only need the last row of data of each CSV, which I then analyze). I have seen that expanding the limit to 512 may work around the problem temporarily, but I do not think that addresses the root cause, so it is not a long-term solution.
Problem
Seems like even if I close the file, the memory is not properly cleared, hence I get an error 67 after some looping on files has been done.
Code
I created a function that retrieves the last line and call it from my main sub. I even attempted to loop until FreeFile returns 1 again (I added some Sleep as well), but no luck: at some point FreeFile returns 2.
Function Return_VarInCSVLine(ByRef NumLineToReturnTo As Long, ByRef TxtFilePathCSV As String, Optional ByRef IsLastLine As Boolean) As Variant
If NumLineToReturnTo = 0 Then NumLineToReturnTo = 1
'NumLineToReturnTo has to be at least 1 even if LastLine is set to true so no error is arised from IIF
Dim NumFileInMemory As Long
Dim ArrVarTxtLines() As Variant
Dim CounterArrTxtLines As Long
Dim TxtInLine As String
NumFileInMemory = FreeFile: CounterArrTxtLines = 1
Open TxtFilePathCSV For Input As #NumFileInMemory: DoEvents
Do While Not EOF(NumFileInMemory)
Line Input #NumFileInMemory, TxtInLine
ReDim Preserve ArrVarTxtLines(1 To CounterArrTxtLines)
ArrVarTxtLines(CounterArrTxtLines) = TxtInLine
CounterArrTxtLines = CounterArrTxtLines + 1
Loop
LoopUntilClosed:
Close #NumFileInMemory: Sleep (10): DoEvents
NumFileInMemory = FreeFile
If NumFileInMemory > 1 Then GoTo LoopUntilClosed
Return_VarInCSVLine = IIf(IsLastLine = True, ArrVarTxtLines(UBound(ArrVarTxtLines)), ArrVarTxtLines(NumLineToReturnTo))
End Function
Question
How can I avoid this error in this scenario? Or what are my alternatives? I used to do workbooks.Open but that is slower than just using FreeFile and then Open for input
You could try to use the FileSystemObject on a Windows PC
Function fsoReadLine(fileName As String, lineNo As Long, Optional lastLine As Boolean) As String
Dim fso As Object
Dim textFile As Object
Set fso = CreateObject("Scripting.FileSystemObject")
Set textFile = fso.OpenTextFile(fileName, 1)
Dim vDat As Variant
' Read the whole file and split it by lines
vDat = Split(textFile.ReadAll, vbCrLf)
Dim totalLines As Long
totalLines = UBound(vDat) + 1 ' zero based array!
If lastLine Then
fsoReadLine = vDat(totalLines - 1)
Else
If lineNo <= totalLines Then
fsoReadLine = vDat(lineNo - 1)
End If
End If
textFile.Close
End Function
And if you only need the last line you could shorten the code to
Function fsoLastLine(fileName As String) As String
Dim fso As Object
Dim textFile As Object
Set fso = CreateObject("Scripting.FileSystemObject")
Set textFile = fso.OpenTextFile(fileName, 1)
Dim vDat As Variant
' Read the whole file and split it by lines
vDat = Split(textFile.ReadAll, vbCrLf)
fsoLastLine = vDat(UBound(vDat))
textFile.Close
End Function
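One caveat with the two functions above: if the file ends with a line break, the last element returned by Split is an empty string, and a file with LF-only line endings will not be split by vbCrLf at all. A hedged variant that tolerates both cases could look like the sketch below. LastNonEmptyLine is a hypothetical name, and the code assumes the file is small enough to read into memory in one go.

```vba
' Sketch: return the last non-empty line of a text file.
' LastNonEmptyLine is a hypothetical helper name for illustration.
Function LastNonEmptyLine(ByVal fileName As String) As String
    Dim fso As Object, ts As Object
    Dim content As String
    Dim lines() As String
    Dim i As Long

    Set fso = CreateObject("Scripting.FileSystemObject")
    Set ts = fso.OpenTextFile(fileName, 1)      ' 1 = ForReading
    If Not ts.AtEndOfStream Then content = ts.ReadAll
    ts.Close                                    ' release the handle before any processing

    ' Normalise CRLF to LF so both kinds of file split the same way
    content = Replace(content, vbCrLf, vbLf)
    lines = Split(content, vbLf)

    ' Walk backwards past any trailing blank lines
    For i = UBound(lines) To 0 Step -1
        If Len(Trim$(lines(i))) > 0 Then
            LastNonEmptyLine = lines(i)
            Exit Function
        End If
    Next i
    ' Falls through with an empty string when the file has no non-empty line
End Function
```

Because the text stream is closed right after ReadAll, no file stays open while the lines are processed, and no FreeFile file numbers are used at all.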

Search multiple text files for specific lines of data and import into excel using VBA macros

I am very new to VBA and I'm looking to use it to automate some of my processes. I have looked around this website (and others) and although I find very similar queries, I can't seem to find one that fits my needs exactly.
So far the closest thing I've found to what I'm looking to do is this: Wanting to create a search field and button to trigger VBA script to run
I have a source folder with all my data. My data is stored in multiple text files. Here is an example of what the data in the files looks like:
10001,1,205955.00
10001,2,196954.00
10001,3,4.60
10001,4,92353.00
10001,5,85015.00
10001,6,255.90
10001,7,804.79
10001,8,205955.00
10001,9,32465.00
In each row, the first number is a geographic code, second number is a numeric code for a specific indicator (not important for what I'm trying to do), and the third number is the value I want to import into my spreadsheet. Each geographic code is associated with 2247 rows.
I want to use a search box control in Excel that I can type a specific geographic code into, click a button and then the macro would run, searching the files for that specific code and then importing all the values - in the order they are listed in the data file - into my desired range in the workbook.
So far I've gotten this code written. Again, forgive me if this is bad code... I tried to re-purpose the code from the other forum post I mentioned earlier.
I think I set up the import location correctly: I want it to import into column C, row 3 of the sheet that holds the search box and button. But I am unsure how to get the import itself to work. Thanks in advance to anyone who can help with this issue.
Sub SearchFolders()
Dim FSO As Object
Dim Folder As Object
Dim File As Object
Dim TS As Object
Dim SourceFolder As String
Dim Search As String
Dim LineNumber As Long
Dim DataSh As Range 'assigned a cell below, so it must be a Range, not a Worksheet
SourceFolder = "C:\Users\MarMar\Desktop\Data\Census2016\DataFiles\"
Search = TextBox1.Value
Set FSO = CreateObject("Scripting.FileSystemObject")
Set Folder = FSO.GetFolder(SourceFolder)
Set DataSh = ActiveSheet.Cells(3, 3)
For Each File In Folder.Files
Set TS = File.OpenAsTextStream()
LineNumber = 0
Do While Not TS.AtEndOfStream
LineNumber = LineNumber + 1
If InStr(TS.ReadLine, Search) Then
'Code to Import Values to DataSh ???
End If
Loop
TS.Close
Next File
End Sub
Maybe something like this:
Dim arr
For Each File In Folder.Files
Set TS = File.OpenAsTextStream()
LineNumber = 0
Do While Not TS.AtEndOfStream
arr = Split(TS.ReadLine, ",") 'split line to array
'check first element in array
If arr(0) = Search Then
datash.Resize(1, UBound(arr) + 1).Value = arr
Set datash = datash.Offset(1, 0)
End If
Loop
TS.Close
Next File
Final result that worked for me!
Sub SearchImportData1()
Dim FSO As Object
Dim SourceFolder As String
Dim Folder As Object
Dim Import As Range
Dim Search As String
Dim TextBox1 As TextBox
Dim File As Object
Dim TS As Object
Dim LineNumber As Integer
Dim Arr As Variant
SourceFolder = "C:\Users\MarMar\Desktop\Data\Census2016\DataFiles\"
Set FSO = CreateObject("Scripting.FileSystemObject")
Set Folder = FSO.GetFolder(SourceFolder)
Set Import = ActiveSheet.Cells(2, 3)
Search = ActiveSheet.TextBox1.Text
For Each File In Folder.Files
Set TS = File.OpenAsTextStream()
LineNumber = 0
Do While Not TS.AtEndOfStream
Arr = Split(TS.ReadLine, ",")
If Arr(0) = Search Then
Import.Resize(1, 1).Value = Arr(2)
Set Import = Import.Offset(1, 0)
End If
Loop
TS.Close
Next File
End Sub

For Each - Open specified workbooks in array

I have about 200 workbooks that I need to open in a directory containing thousands of files. The names of these files have been placed in Sheet4, range B3:B231.
Whenever I run the macro, however, the code only works on 4 of these files, and 1 of the 4 isn't even listed in the specified range.
Dim wb As Workbook, shtDest As Worksheet
Dim MyFile As String
Dim Filepath As String
Dim WoBo As Variant
Dim DirArray As Variant
DirArray = Sheets("Sheet4").Range("B3:B231").Value
Filepath = "C:\"
MyFile = Dir(Filepath)
Set shtDest = ThisWorkbook.Sheets("Sheet1")
For Each WoBo In DirArray
'code
Next WoBo
End Sub
I'm quite new to arrays and I don't really understand the For Each line, so I suspect the error to be there.
Try something like the following. A For Loop is faster than a For Each for arrays and using Transpose turns it into a one dimensional array to loop over.
Dim dirArray()
Dim i As Long
dirArray = Application.Transpose(ThisWorkbook.Worksheets("Sheet4").Range("B3:B231").Value)
For i = LBound(dirArray) To UBound(dirArray)
    If InStr(dirArray(i), "xls") > 0 Then '<== very basic check
        Workbooks.Open dirArray(i) '<== should be the full file path of the file
        ' Do stuff
    End If
Next i
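As the comment above says, Workbooks.Open needs a full path, so a bare file name from the sheet has to be joined with the folder first. Here is a minimal sketch of that step, assuming the cells in B3:B231 hold file names including the extension, contain no error values, and that the folder is the "C:\" from the question. OpenListedWorkbooks and FOLDER_PATH are hypothetical names.

```vba
' Sketch: open only the workbooks listed on Sheet4, skipping names that do not exist.
' OpenListedWorkbooks and FOLDER_PATH are hypothetical names for illustration.
Sub OpenListedWorkbooks()
    Const FOLDER_PATH As String = "C:\"         ' assumed folder; must end with a backslash
    Dim fileNames As Variant
    Dim i As Long
    Dim fullPath As String
    Dim wb As Workbook

    ' Reading a multi-cell range gives a 1-based two-dimensional array (rows x 1 column)
    fileNames = ThisWorkbook.Worksheets("Sheet4").Range("B3:B231").Value

    For i = LBound(fileNames, 1) To UBound(fileNames, 1)
        If Len(fileNames(i, 1)) > 0 Then        ' skip blank cells
            fullPath = FOLDER_PATH & fileNames(i, 1)
            If Len(Dir$(fullPath)) > 0 Then     ' Dir$ returns "" when the file is missing
                Set wb = Workbooks.Open(fullPath)
                ' ... work with wb here ...
                wb.Close SaveChanges:=False
            Else
                Debug.Print "Not found: " & fullPath
            End If
        End If
    Next i
End Sub
```

This version indexes the two-dimensional array directly instead of using Transpose, and prints the names it could not find to the Immediate window, which may help explain why only a few files were being processed.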

Loop through excel-spreadsheet-rows until empty using VBA-macro in powerpoint. For each row read values and write to 2-dim-array. No .select

I want to do a simple search and replace in PowerPoint.
I am trying to loop through an excel spreadsheet using a VBA-macro in powerpoint.
The spreadsheet has two columns and ~100 rows. I want the macro to loop through the rows until it reaches an empty cell.
For each row it should read the values of column 1 and column 2 and write them to a 2-dimensional array.
I had it running using various .select-statements but I didn't like it that way (is select buggy? Search and replace worked a few times, but after changing the spreadsheet too often the macro always crashed).
I am trying to use a more robust way with better performance.
Dim excelDataArray(120, 2) As String
Dim slidedeck As Presentation
Set slidedeck = ActivePresentation
Dim singleslide As Slide
Dim excelFile As Excel.Workbook
Set excelFile = Excel.Application.Workbooks.Open(spreadsheetFolder)
Dim excelSheet As Excel.Worksheet
Set excelSheet = excelFile.Worksheets("Sheet1")
'Loop through each row in Column A until empty row
Dim N As Integer
N = excelSheet.Cells(excelSheet.Rows.Count, "A").End(xlUp).Row
For i = 1 To N
excelDataArray(i, 0) = excelSheet.Cells(i, "A").Value
excelDataArray(i, 1) = excelSheet.Cells(i, "B").Value
Next
You can dump it directly to a variant array without loops.
I have tidied your variables for completeness.
Pls change the path to your xl file here, "C:\temp\test.xlsx"
Sub likethis()
Dim slidedeck As Presentation
Dim singleslide As Slide
Dim XLS As Excel.Application
Dim excelFile As Excel.Workbook
Dim excelSheet As Excel.Worksheet
Dim lngROw As Long
Dim X
Set slidedeck = ActivePresentation
Set XLS = New Excel.Application
Set excelFile = XLS.Workbooks.Open("C:\temp\test.xlsx")
Set excelSheet = excelFile.Worksheets("Sheet1")
lngROw = excelSheet.Cells(excelSheet.Rows.Count, "A").End(xlUp).Row
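'X becomes a 1-based two-dimensional Variant array; no ReDim is needed beforehand
X = excelSheet.Range("A1:B" & lngROw).Value
'Close the workbook and quit the Excel instance created above so it does not stay running
excelFile.Close SaveChanges:=False
XLS.Quit
End Sub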

I only want to import the last 3 days of text files into Excel

I have a folder with over 16,000 files and I've managed to find some code that won't break Excel when it searches all of the files. Now I need some code that will import the last 3 days worth of text files. Any help would be appreciated.
Current code:
Sub ReadFilesIntoActiveSheet()
Dim fso As FileSystemObject
Dim folder As folder
Dim file As file
Dim FileText As TextStream
Dim TextLine As String
Dim Items() As String
Dim i As Long
Dim cl As Range
Dim dateToCheck As Date
Dim daysBack As Integer
Dim filesChecked As Integer
filesChecked = 0
daysBack = 5
dateToCheck = DateAdd("d", -daysBack, Date)
Dim StrFile As String
StrFile = Dir("X:\TMS\TRUCK_OUT\")
Do While Len(StrFile) > 0
filesChecked = filesChecked + 1
StrFile = Dir
Loop
MsgBox filesChecked
Set FileText = Nothing
Set file = Nothing
Set folder = Nothing
Set fso = Nothing
End Sub
I told you the function to use and your response was to ask me to explain how to use it. The implication is that you know so little about Excel VBA that even knowing which function to use is not enough to add two extra statements to your code.
You must invest some time in learning the basics of Excel VBA. Trying to get a macro written for you without knowing the basics will probably fail. If you are successful, it will take a long time if you are only going to get a few lines per question.
Search the internet for "Excel VBA tutorial". There are many to choose from. Alternatively, visit a good bookshop or library and look for Excel VBA Primers. Again many to choose from. The time you invest in learning the basics will very quickly repay itself.
Issue 1
The first declaration is Dim fso As FileSystemObject. This will only compile if one of your references is for the Microsoft Scripting RunTime. You do not use fso. Are you planning to use it later? Why have you used Dir rather than the Files property of the folder object you have declared?
Issue 2
What do you mean by: "I've managed to find some code that won't break Excel when it searches all of the files." What code did you have which broke Excel?
Issue 3
Dim daysBack As Integer declares a 16-bit integer. Unless you have an old 16-bit computer, this will require extra processing. Dim i As Long declares a 32-bit integer and is the correct choice.
Issue 4
Dir returns a Variant. Dir$ returns a String and is faster.
None of the above issues are immediately important. I am trying to show some of the little things you do not know about Excel VBA and the traps waiting to catch you when you do not know the basics.
For the version of your macro below, I have:
Indented the code to make it easier to read.
Commented out statements you do not currently use.
Added two statements so only files with a Last Modified Date after dateToCheck are counted.
This will take you a little further.
Sub ReadFilesIntoActiveSheet()
'Dim fso As FileSystemObject
'Dim folder As folder
'Dim file As file
'Dim FileText As TextStream
'Dim TextLine As String
'Dim Items() As String
'Dim i As Long
'Dim cl As Range
Dim dateToCheck As Date
Dim daysBack As Integer
Dim filesChecked As Long
Dim StrFile As String
Dim PathRoot As String
filesChecked = 0
daysBack = 5
dateToCheck = DateAdd("d", -daysBack, Date)
PathRoot = "X:\TMS\TRUCK_OUT\"
StrFile = Dir$(PathRoot)
Do While Len(StrFile) > 0
If FileDateTime(PathRoot & StrFile) > dateToCheck Then
filesChecked = filesChecked + 1
End If
StrFile = Dir$
Loop
MsgBox filesChecked
'Set FileText = Nothing
'Set file = Nothing
'Set folder = Nothing
'Set fso = Nothing
End Sub
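Going one step beyond counting, the same Dir$ and FileDateTime loop can collect the paths of the recent files so that a later step can import them. This is only a sketch under assumptions: the files have a .txt extension, the folder path ends with a backslash, and "recent" means modified on or after midnight daysBack days ago. RecentFiles is a hypothetical name, not something from the original macro.

```vba
' Sketch: collect the full paths of text files modified within the last daysBack days.
' RecentFiles is a hypothetical helper name for illustration.
Function RecentFiles(ByVal folderPath As String, ByVal daysBack As Long) As Collection
    Dim result As New Collection
    Dim cutoff As Date
    Dim fileName As String

    cutoff = DateAdd("d", -daysBack, Date)      ' midnight, daysBack days ago
    fileName = Dir$(folderPath & "*.txt")       ' assumes folderPath ends with "\"
    Do While Len(fileName) > 0
        ' FileDateTime returns the date and time the file was created or last modified
        If FileDateTime(folderPath & fileName) >= cutoff Then
            result.Add folderPath & fileName
        End If
        fileName = Dir$                         ' next match, "" when there are no more
    Loop
    Set RecentFiles = result
End Function
```

A caller could then loop over the result, for example with For Each f In RecentFiles("X:\TMS\TRUCK_OUT\", 3), where f is a Variant, and open each path in turn.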

Resources