There are many hits on google about this but i'm wondering what is the best/fastest way to get some data out of a CSV file?
There are some that load the entire CSV file in excel, some load it in an array. I've seen some people like to do search for a specific word.
Basically I need to retrieve 4 values out of each present CSV file. (start/end time, equipment and substrate) Note that the equipment will repeat itself multiple times inside every file. The other 3 are unique.
Which method is best/fastest?
Here's a small example of the CSV file:
/port_name A
#data 01
#slot_no 2
#m_start 2020/03/26 19:15:27
#m_end 2020/03/26 19:23:21
#u_start ????/??/?? ??:??:??
#u_end ????/??/?? ??:??:??
$result 1 1 -4,-4 2548
<result_info> 1 : Kind :
&no_of_image 3
&i_name 01 S02.tif
~i_info Digital_Zoom 1.0
~i_info Equipment 4000 SERIAL NO. : 31
&i_name 02 S02.tif
~i_info Digital_Zoom 1.0
~i_info Equipment 4000 SERIAL NO. : 31
~CMS_substrate_id 2 "8939-02"
/end_of_file
A quick start of a macro might look like this:
Sub readCSVfile()
Dim textline As String
Dim Filename
Filename = "D:\TEMP\excel\61039635\CSVfile.txt"
Dim row As Integer
Cells(1, 1).Value = "m_start"
Cells(1, 2).Value = "m_end"
Cells(1, 3).Value = "Equipment"
Cells(1, 4).Value = "CMS_substrate_id"
row = 2
Open Filename For Input As #1
Do While Not EOF(1)
Line Input #1, textline
Select Case True
Case InStr(textline, "#m_start") > 0:
Cells(row, 1).Value = mysub(textline, "#m_start")
Case InStr(textline, "#m_end") > 0:
Cells(row, 2).Value = mysub(textline, "#m_end")
Case InStr(textline, "Equipment") > 0:
Cells(row, 3).Value = mysub(textline, "Equipment")
Case InStr(textline, "CMS_substrate_id") > 0:
Cells(row, 4).Value = mysub(textline, "CMS_substrate_id")
row = row + 1
End Select
Loop
Close (1)
End Sub
Function mysub(t As String, s As String) As String
mysub = Trim(Mid(t, InStr(t, s) + Len(s) + 1))
End Function
My answer is similar to #Luuk, but I'm not checking for "Equipment" as it appears in the sample data twice per record. Instead, I am checking for "&i_name 01" and then skipping down a few lines.
Sub sGetData()
On Error GoTo E_Handle
Dim strFile As String
Dim intFile As Integer
Dim strInput As String
Dim lngRow As Long
strFile = "J:\downloads\sample.txt"
intFile = FreeFile
Open strFile For Input As intFile
lngRow = 1
Do
Line Input #intFile, strInput
If InStr(strInput, "#m_start") > 0 Then
lngRow = lngRow + 1
ActiveSheet.Cells(lngRow, 1) = Mid(strInput, 12)
ElseIf InStr(strInput, "#m_end") > 0 Then
ActiveSheet.Cells(lngRow, 2) = Mid(strInput, 12)
ElseIf InStr(strInput, "&i_name 01") > 0 Then
Line Input #intFile, strInput
Line Input #intFile, strInput
ActiveSheet.Cells(lngRow, 3) = Mid(strInput, 41, 4)
ElseIf InStr(strInput, "~CMS_substrate_id") > 0 Then
ActiveSheet.Cells(lngRow, 4) = Mid(strInput, 24)
End If
Loop Until EOF(intFile)
sExit:
On Error Resume Next
Reset
Exit Sub
E_Handle:
MsgBox Err.Description & vbCrLf & vbCrLf & "sGetData", vbOKOnly + vbCritical, "Error: " & Err.Number
Resume sExit
End Sub
As this data file is probably not line terminated with the normal Carriage Return/Line Feed combination that VBA deals with, I've created a new sub that reads the data into an array, split on the end of line character being used (in this case Line Feed) before processing it.
Sub sGetData2()
On Error GoTo E_Handle
Dim strFile As String
Dim intFile As Integer
Dim strInput As String
Dim astrData() As String
Dim lngLoop1 As Long
Dim lngCount As Long
Dim lngRow As Long
strFile = "J:\downloads\sample1.txt"
intFile = FreeFile
Open strFile For Input As intFile
strInput = input(LOF(intFile), intFile)
astrData() = Split(strInput, vbLf)
lngCount = UBound(astrData)
lngRow = 1
For lngLoop1 = 3 To lngCount
If InStr(astrData(lngLoop1), "#m_start") > 0 Then
lngRow = lngRow + 1
ActiveSheet.Cells(lngRow, 1) = Mid(astrData(lngLoop1), 12)
ElseIf InStr(astrData(lngLoop1), "#m_end") > 0 Then
ActiveSheet.Cells(lngRow, 2) = Mid(astrData(lngLoop1), 12)
ElseIf InStr(astrData(lngLoop1), "&i_name 01") > 0 Then
lngLoop1 = lngLoop1 + 2
ActiveSheet.Cells(lngRow, 3) = Mid(astrData(lngLoop1), 41, 4)
ElseIf InStr(astrData(lngLoop1), "~CMS_substrate_id") > 0 Then
ActiveSheet.Cells(lngRow, 4) = Mid(astrData(lngLoop1), 24)
End If
Next lngLoop1
sExit:
On Error Resume Next
Reset
Exit Sub
E_Handle:
MsgBox Err.Description & vbCrLf & vbCrLf & "sGetData2", vbOKOnly + vbCritical, "Error: " & Err.Number
Resume sExit
End Sub
Regards,
Get String from your text file by adodb.stream object.
Extract what you are looking for from the imported string with regexp.
Put the second contents of the submatches of the extracted match collection into an array. Equipment items have two identical contents, so they are increased by two.
The data in the array is transferred to the sheet.
Sub Test()
Dim Ws As Worksheet
Dim Path As String
Dim s As String
Dim pattn(1 To 4) As String
'Dim Match(1 To 4) As MatchCollection
Dim Match(1 To 4) As Object
Dim vR() As Variant
Dim i As Long, n As Long, j As Integer, k As Long
Path = ThisWorkbook.Path & "\regextest.txt" '<~~ Your text file full Path
s = getString(Path) '<~~ get text form your text file
Set Ws = ActiveSheet
'** This is regular Expression
pattn(1) = "(m_start[ ]{1,})(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})"
pattn(2) = "(m_end[ ]{1,})(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})"
pattn(3) = "(~i_info Equipment[ ]{1,})(\d{1,})"
pattn(4) = "(~CMS_substrate_id[ ]{1,})(\d{1,}[ ]{1,}" & Chr(34) & "\d{1,}-\d{1,}" & Chr(34) & ")"
For i = 1 To 4
Set Match(i) = GetRegEx(s, pattn(i))
Next i
n = Match(1).Count
ReDim vR(1 To n, 1 To 4)
For i = 0 To n - 1
For j = 1 To 4
If j = 3 Then
vR(i + 1, j) = Match(j).Item(k).SubMatches(1)
k = k + 2
Else
vR(i + 1, j) = Match(j).Item(i).SubMatches(1)
End If
Next j
Next i
With Ws
.Cells.Clear
.Range("a1").Resize(1, 4) = Array("m_start", "m_end", "Equipment", "CMS_substrate_id")
.Range("a2").Resize(n, 4) = vR
.Range("a:b").NumberFormatLocal = "yyyy/mm/dd hh:mm:ss"
End With
End Sub
Function GetRegEx(StrInput As String, strPattern As String) As Object
'Dim RegEx As New RegExp
Dim RegEx As Object
'Set RegEx = New RegExp
Set RegEx = CreateObject("VBscript.RegExp")
With RegEx
.Global = True
.IgnoreCase = False
.MultiLine = True
.Pattern = strPattern
End With
If RegEx.Test(StrInput) Then
Set GetRegEx = RegEx.Execute(StrInput)
End If
End Function
Function getString(Path As String)
'Dim objStream As ADODB.Stream
Dim objStream As Object
'Set objStream = New ADODB.Stream
Set objStream = CreateObject("ADODB.Stream")
With objStream
.Charset = "Utf-8"
.Open
.LoadFromFile Path
getString = .ReadText
.Close
End With
End Function
Result image (3 types of data)
I am trying to copy a specific cell and 3 columns from multiple files into a single column on another spreadsheet.
The part called "import" simply allows to select multiple files. The part "Datacopy" should copy the desired values.
Sub import()
Dim oFileDialog As FileDialog
Set oFileDialog = Application.FileDialog(msoFileDialogFilePicker)
oFileDialog.AllowMultiSelect = True
oFileDialog.InitialFileName = "C:\Users\L18938\Desktop\New_folder" ' can set your default directory here
oFileDialog.Show
Dim iCount As Integer
For iCount = 1 To oFileDialog.SelectedItems.Count
Call Datacopy(oFileDialog.SelectedItems(iCount))
Next
End Sub
Public Function Datacopy(strPath As String)
Dim filePath As String
Dim FileNum As Integer
filePath = strPath
Dim startDate As String
If Range("A2").Value <> "" Then
Range("A1").End(xlDown).Offset(1, 0).Select
Else:
Range("A1:A" & Range("A" & Rows.Count).End(xlUp).Row).Offset(1, 0).Select
End If
currentRow = 0
rowNumber = 0
Open filePath For Input As #1
'EOF(1) checks for the end of a file
Do Until EOF(1)
If rowNumber = 0 Then
startDate = lineitems(2)
End If
If rowNumber > 18 And item <> "" Then
ActiveCell.Offset(currentRow, 0) = startDate
ActiveCell.Offset(currentRow, 1) = lineitems(0)
ActiveCell.Offset(currentRow, 2) = lineitems(1)
ActiveCell.Offset(currentRow, 3) = lineitems(2)
currentRow = currentRow + 1
End If
End If
Next item
rowNumber = rowNumber + 1
Loop
Close #1
End Function
When I run it I get the error "sub or function not defined".
The cells I am targeting are:
C1 -> is a date, different in each file, to be copied in column A
Columns A18:A, B18:B, C18:C -> are data to be copied in columns B, C, D respectively.
It is important to copy multiple files, as I have more than 180.
Your problem is "startDate = lineitems(2)". There's nothing in your code that assigns any kind of value to "lineitems".
I have the following excel spreadsheet, tracker.xls with 2 columns: category and count.
I want to loop through a subdirectory which contains pdf files which match the category column in my spreadsheet.
This is what I've done so far, but my code does not seem to be working:
Sub CV()
Function CVCount()
CategoryArray = Range("A2:A3")
CountArray = Range("B2:B3")
For i = 1 To UBound(CategoryArray)
For j = 1 To UBound(CategoryArray, 2)
'get name of category
Dim Category As String
Category = (myarray(i, j))
FolderPath = Category + "\"
path = FolderPath & "\*.pdf"
Filename = Dir(path)
For k = 1 To UBound(CountArray)
For l = 1 To UBound(CountArray, 2)
'get count
Do While Filename <> ""
count = count + 1
Filename = Dir()
Loop
'assign count to cell
Range("").Value = count
Next k
Next j
Next i
End Function
End Sub
I can't seem to figure out how to assign a count to a cell. Any ideas how to?
You're on the right track, but it's far simpler than that:
Option Explicit
Private Const baseFolder As String = "C:\Users\Hazel\Documents\Personal\Accounts\Childcare\"
Sub countFiles()
Dim path As String
Dim fileName As Variant
Dim count As Integer
Dim i As Integer
i = 1
Do While Range("A" & i + 1).Value <> ""
count = 0
i = i + 1
path = baseFolder & Range("A" & i).Value & "\"
fileName = Dir(path)
Do While fileName <> ""
If UCase$(Right(fileName, 3)) = "PDF" Then count = count + 1
fileName = Dir()
Loop
Range("B" & i).Value = count
Loop
End Sub
Just change the "baseFolder" constant to match your starting directory.
Here is my code:
Private Sub UserForm_Initialize()
Dim MyFolder As String
Dim MyFile As String
Dim j As Integer
Dim Funds(1000)
CD_Date = Cells(1, 4)
Range("A2").Select
i = 1
Do
Funds(i) = UCase(ActiveCell.Value)
i = i + 1
ActiveCell.Offset(1, 0).Select
Loop Until ActiveCell.Value = ""
MyFolder = "C:\windows\"
r = 0
For k = 1 To i - 1
MyFile = Dir$(MyFolder & "*" & Funds(k) & "*")
Do While MyFile <> ""
datka = FileDateTime(MyFolder & MyFile)
If Format(datka, "yymmdd") = Format(CD_Date, "yymmdd") Then
With UserForm1.ListBox1
.AddItem
.List(r, 0) = Funds(k)
.List(r, 1) = MyFile
r = r + 1
End With
End If
MyFile = Dir$
Loop
Next k
Range("A2").Select
End Sub
Code works perfectly, but it doesn't change dynamically when I move CD_Date for prior day or current - 2. It always add item for entered date and only for first run. When I change date in cell it always return me list from first initialize. It resets when I close file and open it again, each time for different date. Is it possible to modify my code that it will be filling dynamically after I change CD_Date or when someone add file to the folder?? I want to avoid closing and opening macro over and over just to get actual data :(
You could add in a loop to remove all items from the ListBox before the point in your code where items are added to effectively refresh it each time. To do this you are looking for something like:
With UserForm1.ListBox1
For i = 1 to .ListCount
.RemoveItem(0)
Next i
End With
In your code it seems like each time you open the form you only want one item in the ListBox, so although the loop here is unnecessary its a good practice to remember. Try placing this in your module just before you add the item:
Private Sub UserForm_Initialize()
Dim MyFolder As String
Dim MyFile As String
Dim j As Integer
Dim Funds(1000)
CD_Date = Cells(1, 4)
Range("A2").Select
i = 1
Do
Funds(i) = UCase(ActiveCell.Value)
i = i + 1
ActiveCell.Offset(1, 0).Select
Loop Until ActiveCell.Value = ""
MyFolder = "C:\windows\"
r = 0
For k = 1 To i - 1
MyFile = Dir$(MyFolder & "*" & Funds(k) & "*")
Do While MyFile <> ""
datka = FileDateTime(MyFolder & MyFile)
If Format(datka, "yymmdd") = Format(CD_Date, "yymmdd") Then
With UserForm1.ListBox1
For i = 1 to .ListCount 'Here
.RemoveItem(0)
Next i
.AddItem
.List(r, 0) = Funds(k)
.List(r, 1) = MyFile
r = r + 1
End With
End If
MyFile = Dir$
Loop
Next k
Range("A2").Select
End Sub
I have been working in this project step by step. I can't understand why it is not copying the row string values from the "SheetName" used as argument being passed into this function(SheetName). The function can read a file and create a second file with checkboxes based on the number of column titles found in the first file, but the column titles are not being copied into the second file as captions for the checkboxes. Any help is appreciated.
Function CallFunction(SheetName As Variant) As Long
Dim text As String
Dim titles(200) As String ' Dim titles(200) As String ' Array
Dim nTitles As Integer
Dim wks As Worksheet
Dim myCaption As String
Dim NewBook As Workbook
PathName = Range("F22").Value
Filename = Range("F23").Value
ControlFile = ActiveWorkbook.Name
Workbooks.Open Filename:=PathName & "\" & Filename
Set wks = ActiveWorkbook.Worksheets(SheetName)
For i = 1 To 199
If Trim(wks.Cells(4, i).Value) = "" Then
nTitles = i - 1
Exit For
End If
titles(i - 1) = wks.Cells(4, i).Value
Next
i = 1
Workbooks.Add
Set NewBook = ActiveWorkbook
NewBook.SaveAs fileExported
Workbooks.Open (fileExported)
For Each cell In Range(Sheets(SheetName).Cells(4, 1), Sheets(SheetName).Cells(4, 1 + nTitles))
myCaption = Sheets(SheetName).Cells(4, i).Value
With Sheets(SheetName).checkBoxes.Add(cell.Left, _
cell.Top, cell.Width, cell.Height)
.Interior.ColorIndex = 12
.Caption = myCaption
.Characters.text = myCaption
.Border.Weight = xlThin
.Name = myCaption
End With
i = i + 1
Next
End Function
I found the answer to my own question I just forgot to add the answer here. Ok, here it is
' Save all Jira column titles into jTitles
If sj = True Or ji = True Then
For j = 1 To 199
If Trim(wks1.Cells(4, j).Value) = "" Then
titlesj = j - 1
Exit For
End If
jTitles(j - 1) = wks1.Cells(4, j).Value
Next
j = 1
' Add column titles as checkboxes
For j = 0 To titlesj
Sheet1.ListBox1.AddItem jTitles(j)
Sheet1.ListBox3.AddItem jTitles(j)
Next
wb1.Close
End If