I have tab separated csv files which I want to transform to xlsx. So each csv should be transformed to a xlsx. Filename should be the same. However, the files are tab separated. For example, see this test file screenshot:
When I run my code (I created a subfolder xlsx before):
Sub all()
Dim sourcepath As String
Dim sDir As String
Dim newpath As String
sourcepath = "C:\Users\PC\Desktop\Test\"
newpath = sourcepath & "xlsx\"
'make sure subfolder xlsx was created before
sDir = Dir$(sourcepath & "*.csv", vbNormal)
Do Until Len(sDir) = 0
Workbooks.Open (sourcepath & sDir)
With ActiveWorkbook
.SaveAs Filename:=Replace(Left(.FullName, InStrRev(.FullName, ".")), sourcepath, newpath) & "xlsx", FileFormat:=xlOpenXMLWorkbook, CreateBackup:=False
.Close
End With
sDir = Dir$
Loop
End Sub
it does work, but when I look into the excel file:
I can see that the tab separator was not detected. I think my local settings are that the separator is a semi-colon and that's why it is not working. Now I wanted to set dataType to xlDelimited and tab to True, with changing the one line to:
Workbooks.Open (Spath & sDir), DataType:=xlDelimited, Tab:=True
I also tried
Workbooks.Open (Spath & sDir, DataType:=xlDelimited, Tab:=True)
or
Workbooks.Open FileName:=Spath & sDir, DataType:=xlDelimited, Tab:=True
But this leads to an error message. I then tried another approach, where I set the delimiter to Chr(9) (tab) and local to false:
Sub all()
Dim wb As Workbook
Dim strFile As String
Dim strDir As String
strDir = "C:\Users\PC\Desktop\Test\"
strFile = Dir(strDir & "*.csv")
Do While strFile <> ""
Set wb = Workbooks.Open(Filename:=strDir & strFile, Delimiter:=Chr(9), Local:=False)
With wb
.SaveAs Replace(wb.FullName, ".csv", ".xlsx"), 51
.Close True
End With
Set wb = Nothing
strFile = Dir
Loop
End Sub
It does not lead to an error. But when I open the file, it looks like:
So again same problem, the tab separator is not recognized. How can I fix this?
(I also tried it with local:True together with Delimiter:=Chr(9), but same problem and I also tried it with adding Format:=6)
I tried it this way with csv as I did not want to go the same way with txt file extension. Reason is that using csv easily allows special language characters like "ö" and "ü". So that is why I wanted to convert csv to xlsx and not use the workaround of using txt instead, as I then run into the problem that when I try to convert txt to xlsx certain special characters are not properly recognised and I hope to avoid this problem with using csv.
The csv (or actually these are tsv, because they have the tab as separator and not semi-colon) files have different columns. So could be one csv file has 5 columns, the other 6 and the datatypes vary too.
EDIT:
In repsonse to EEMs answer:
Check this Test.csv file, it looks like this:
Separated by tab. Not semi-colon.
When I run the code (plus adding .TextFileDecimalSeparator = "." to the code) and check the resulting xlsx file it looks like this:
Values in the second column (ColumnÄ), like 9987.5 are correctly transformed to 9987,5. But values in the last column (ColumnI) are wrongly transformed. This is my problem now. (I dont know why special character does not get transformed correctly, as in my original files this does work.)
As mentioned by #RonRosenfeld, files with .csv extension will be opened by excel as a text file with tab delimiters.
Also the following assumption is not accurate:
Option1 txt is not a way for me, as I face a new problem that UTF-8
special characters, like äö and so are not properly imported.
The handling of the characters or code page has nothing to do with the extension of the files, instead it's managed by the Origin parameter of the Workbooks.OpenText method or by the TextFilePlatform property of the QueryTable object.
Therefore unless the files are renamed with a extension different than csv the [Workbooks.OpenText method] will not be effective.
The solution proposed below, uses the QueryTable object and consist of two procedures:
Tab_Delimited_UTF8_Files_Save_As_Xlsx
Sets the source and target folder
Creates the xlsx folder if not present
Gets all csv files in the source folder
Open_Csv_As_Tab_Delimited_Then_Save_As_Xls
Process each csv files
Adds a workbook to hold the Query Table
Imports the csv file
Deletes the Query
Saves the File as `xlsx'
EDIT I These lines were added to ensure the conversion of the numeric data:
.TextFileDecimalSeparator = "."
.TextFileThousandsSeparator = ","
EDIT II A Few changes to rename the worksheet (marked as '# )
Tested with this csv file:
Generated this `xlsx' file:
Hopefully, it should be straightforward to add these procedure to you project, let me know of any problem or question you might have with the resources used.
Sub Tab_Delimited_UTF8_Files_Save_As_Xlsx()
Dim sFile As String
Dim sPathSrc As String, sPathTrg As String
Dim sFilenameSrc As String, sFilenameTrg As String
Dim bShts As Byte, exCalc As XlCalculation
Rem sPathSrc = "C:\Users\PC\Desktop\Test\"
sPathTrg = sPathSrc & "xlsx\"
Rem Excel Properties OFF
With Application
.EnableEvents = False
.DisplayAlerts = False
.ScreenUpdating = False
exCalc = .Calculation
.Calculation = xlCalculationManual
.CalculateBeforeSave = False
bShts = .SheetsInNewWorkbook
.SheetsInNewWorkbook = 1
End With
Rem Validate Target Folder
If Len(Dir$(sPathTrg, vbDirectory)) = 0 Then MkDir sPathTrg
Rem Process Csv Files
sFile = Dir$(sPathSrc & "*.csv")
Do Until Len(sFile) = 0
sFilenameSrc = sPathSrc & sFile
sFile = Left(sFile, -1 + InStrRev(sFile, ".csv")) '#
sFilenameTrg = sPathTrg & sFile & ".xlsx" '#
Call Open_Csv_As_Tab_Delimited_Then_Save_As_Xls(sFile, sFilenameSrc, sFilenameTrg) '#
sFile = Dir$
Loop
Rem Excel Properties OFF
With Application
.SheetsInNewWorkbook = bShts
.Calculation = exCalc
.CalculateBeforeSave = True
.ScreenUpdating = True
.DisplayAlerts = True
.EnableEvents = True
End With
End Sub
…
Sub Open_Csv_As_Tab_Delimited_Then_Save_As_Xls(sWsh As String, sFilenameSrc As String, sFilenameTrg As String) '#
Dim Wbk As Workbook
Rem Workbook - Add
Set Wbk = Workbooks.Add
With Wbk
With .Worksheets(1)
Rem QueryTable - Add
With .QueryTables.Add(Connection:="TEXT;" & sFilenameSrc, Destination:=.Cells(1))
Rem QueryTable - Properties
.SaveData = True
.TextFileParseType = xlDelimited
.TextFileDecimalSeparator = "."
.TextFileThousandsSeparator = ","
.TextFileConsecutiveDelimiter = False
.TextFileTabDelimiter = True
.TextFileTrailingMinusNumbers = True
.TextFilePlatform = 65001 'Unicode (UTF-8)
.Refresh BackgroundQuery:=False
Rem QueryTable - Delete
.Delete
End With
Rem Rename Worksheet '#
On Error Resume Next '# Ignore error in case the Filename is not valid as Sheetname
.Name = sWsh '#
On Error GoTo 0 '#
End With
Rem Workbook - Save & Close
.SaveAs Filename:=sFilenameTrg, FileFormat:=xlOpenXMLWorkbook
.Close
End With
End Sub
With a delimited file that has a csv extension, Excel's Open and vba Workbooks.Open and Workbooks.OpenText methods will always assume that the delimiter is a comma, no matter what you put into the argument.
You can change the file extension (eg to .txt), and then the .Open method should work.
You could read it into a TextStream object and parse it line by line in VBA
You can Import the file rather than Opening the file.
You could use Power Query to import it.
Or you could use a variation of the code below, which was just generated by the macro recorder, so you'll have to clean it up and adapt it a bit to your specifics.
With ActiveSheet.QueryTables.Add(Connection:= _
"TEXT;D:\Users\Ron\Desktop\myFile.csv", Destination:=Range("$A$12"))
.CommandType = 0
.Name = "weather_1"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.TextFilePromptOnRefresh = False
.TextFilePlatform = 437
.TextFileStartRow = 1
.TextFileParseType = xlDelimited
.TextFileTextQualifier = xlTextQualifierDoubleQuote
.TextFileConsecutiveDelimiter = False
.TextFileTabDelimiter = True
.TextFileSemicolonDelimiter = False
.TextFileCommaDelimiter = False
.TextFileSpaceDelimiter = False
.TextFileColumnDataTypes = Array(1, 1, 1, 1)
.TextFileTrailingMinusNumbers = True
.Refresh BackgroundQuery:=False
End With
This method uses File System Object and Text Stream to change from tab delimited to comma delimited. Select Microsoft Scripting Runtime inside of Tools/References to make available the library used. If semicolon delimited also does not open correctly, you can replace third parameter of REPLACE function in line fileContents = Replace(fileContents, vbTab, ";") with comma and try again. Note we are creating a .csv file with Set textStreamObj = fileSystemObj.CreateTextFile("filePath2.csv"), not overwriting our original.
Option Explicit
Sub changeDelimitedMarker()
Dim fileSystemObj As Scripting.FileSystemObject
Dim textStreamObj As Scripting.TextStream
Dim fileContents As String
Set fileSystemObj = New FileSystemObject
Set textStreamObj = fileSystemObj.OpenTextFile("filePath1.csv")
fileContents = textStreamObj.ReadAll
textStreamObj.Close
fileContents = Replace(fileContents, vbTab, ",")
Set textStreamObj = fileSystemObj.CreateTextFile("filePath2.csv")
textStreamObj.Write fileContents
textStreamObj.Close
End Sub
Using the ADODB.Stream object, you can create a user-defined function.
Sub all()
Dim sourcepath As String
Dim sDir As String
Dim newpath As String
Dim vResult As Variant
Dim Wb As Workbook
Dim Fn As String
sourcepath = "C:\Users\PC\Desktop\Test\"
newpath = sourcepath & "xlsx\"
'make sure subfolder xlsx was created before
sDir = Dir$(sourcepath & "*.csv", vbNormal)
Application.ScreenUpdating = False
Do Until Len(sDir) = 0
'Workbooks.Open (sourcepath & sDir)
'use adodb.stream
vResult = TransToTextWithCsvUTF_8(sourcepath & sDir)
Fn = Replace(sDir, ".csv", ".xlsx")
Set Wb = Workbooks.Add
With Wb
Range("a1").Resize(UBound(vResult, 1) + 1, UBound(vResult, 2) + 1) = vResult
.SaveAs Filename:=newpath & Fn, FileFormat:=xlOpenXMLWorkbook, CreateBackup:=False
.Close
End With
sDir = Dir$
Loop
Application.ScreenUpdating = True
End Sub
Function TransToTextWithCsvUTF_8(strFn As String) As Variant
Dim vR() As String
Dim i As Long, r As Long, j As Integer, c As Integer
Dim objStream As Object
Dim strRead As String
Dim vSplit, vRow
Dim s As String
Set objStream = CreateObject("ADODB.Stream")
With objStream
.Charset = "utf-8"
.Open
.LoadFromFile strFn
strRead = .ReadText
.Close
End With
vSplit = Split(strRead, vbCrLf)
r = UBound(vSplit)
c = UBound(Split(vSplit(0), vbTab, , vbTextCompare))
ReDim vR(0 To r, 0 To c)
For i = 0 To r
vRow = Split(vSplit(i), vbTab, , vbTextCompare)
If UBound(vRow) = c Then 'if it is empty line, skip it
For j = 0 To c
If IsNumeric(vRow(j)) Then
s = Format(vRow(j), "#,##0.000")
s = Replace(s, ".", "+")
s = Replace(s, ",", ".")
s = Replace(s, "+", ",")
vR(i, j) = s
Else
vR(i, j) = vRow(j)
End If
Next j
End If
Next i
TransToTextWithCsvUTF_8 = vR
Set objStream = Nothing
End Function
The Text to Columns should work for this:
Sub all()
Dim sourcepath As String
Dim sDir As String
Dim newpath As String
sourcepath = "C:\Users\snapier\Downloads\Test\"
newpath = sourcepath & "xlsx\"
'make sure subfolder xlsx was created before
sDir = Dir$(sourcepath & "*.csv", vbNormal)
Do Until Len(sDir) = 0
Workbooks.Open (sourcepath & sDir)
With ActiveWorkbook.Worksheets(1)
.UsedRange.TextToColumns Destination:=Range("A1"), DataType:=xlDelimited, _
TextQualifier:=xlDoubleQuote, ConsecutiveDelimiter:=True, Tab:=True, _
Semicolon:=False, Comma:=False, Space:=False, Other:=False
End With
With ActiveWorkbook
.SaveAs Filename:=Replace(Left(.FullName, InStrRev(.FullName, ".")), sourcepath, newpath) & "xlsx", FileFormat:=xlOpenXMLWorkbook, CreateBackup:=False
.Close
End With
sDir = Dir$
Loop
End Sub
Related
I have tried writing my own VBA code, everything and all files work great. However, I do have these 4 files that are converted to respective columns but in the file explorer, it's not shown as an excel workbook.
I've tried using fileformat 51, which gave me the same result, so I'm not sure where have gone wrong, please assist. Thanks in advance
Here is my code:
Sub CSVtoXlsx()
Do While fname <> ""
Dim wb As Workbook
Dim mydata As String, temparray() As String, strdata() As String
Dim i As Integer
Set wb = Workbooks.Add
csvfilepath = csvfolder & fname
Open csvfilepath For Binary As #1
mydata = LOF(1)
Get #1, , mydata
Close #1
strdata() = Split(mydata, vbCrLf)
temparray() = Split(strdata(0), "|")
ReDim arcol(0 To UBound(temparray))
For i = 0 To UBound(temparray)
arcol(i) = 2
Next i
With wb.Sheets(1).QueryTables.Add( _
Connection:="text;" & csvfilepath, Destination:=wb.Sheets(1).Range("a1"))
.Name = "formatted_data"
.FieldNames = True
.TextFilePlatform = 437
.TextFileStartRow = 1
.TextFileParseType= xlDelimited
.TextFileTextQualifier = xlTextQualifierNone
.TextFileOtherDelimiter = "|"
.TextFileColumnDataTypes = arcol
.Refresh BackgroundQuery:=False
End With
Application.DisplayAlerts = False
wb.SaveAs (xlsfolder & fname & ".xlsx"), xlOpenXMLWorkbook
Application.DisplayAlerts = True
wb.Close savechanges:=False
fname = Dir
In order to eliminate csv extension, please replace:
wb.SaveAs (xlsfolder & fname & ".xlsx"), xlOpenXMLWorkbook
with:
wb.SaveAs (xlsfolder & Split(fname, ".")(0) & ".xlsx"), xlOpenXMLWorkbook
I have txt files that are automatically exported to me from another system (I cannot change this system). When I try to convert these txt files to excel with the following code (I created a subfolder xlsx manually):
Sub all()
Dim sourcepath As String
Dim sDir As String
Dim newpath As String
sourcepath = "C:\Users\PC\Desktop\Test\"
newpath = sourcepath & "xlsx\"
'make sure subfolder xlsx was created before
sDir = Dir$(sourcepath & "*.txt", vbNormal)
Do Until Len(sDir) = 0
Workbooks.Open (sourcepath & sDir)
With ActiveWorkbook
.SaveAs Filename:=Replace(Left(.FullName, InStrRev(.FullName, ".")), sourcepath, newpath) & "xlsx", FileFormat:=xlOpenXMLWorkbook, CreateBackup:=False
.Close
End With
sDir = Dir$
Loop
End Sub
it does work, however certain special characters, like ä,ö and Ü and so, are not properly displayed. I.e. when I open the xlsx files later on, I can see that these have been replaced by something like ä and so. I could use a work around and now start to replace these afterwards, however I would like to improve my txt to xlsx code. According to this post or this one it should be possible using ADODB.Stream. However, I don't know how to implement this into my code (loop) to get it working here in my case? If there is another approach instead of ADOB.Stream I am also fine with that. It is not necessary for me to use ADOB.Stream.
Have you tried coercing the code page, using the Origin parameter? I don't know if you need a particular one, but the UTF-8 constant might be a starting point. I personally like this page as a reference source: https://learn.microsoft.com/en-us/windows/win32/intl/code-page-identifiers
So the solution might turn out to be as simple as this - it worked in my dummy tests:
Option Explicit
Private Const CP_UTF8 As Long = 65001
Public Sub RunMe()
Dim sDir As String, sourcePath As String, fileName As String
Dim fso As Object
sourcePath = "C:\anyoldpath\"
Set fso = CreateObject("Scripting.FileSystemObject")
sDir = Dir(sourcePath & "*.txt", vbNormal)
Do While Len(sDir) > 0
fileName = sourcePath & "xlsx\" & fso.GetBaseName(sDir) & ".xlsx"
Application.Workbooks.OpenText sourcePath & sDir, CP_UTF8
ActiveWorkbook.SaveAs fileName, xlOpenXMLWorkbook
ActiveWorkbook.Close False
sDir = Dir()
Loop
End Sub
Assuming that these txt files are tab delimited.
The handling of the characters or code page it's managed by the Origin parameter of the Workbooks.OpenText method or by the TextFilePlatform property of the QueryTable object.
These txt files should be opened with Workbooks.OpenText method, however in order to handle problem of the Decimal.Separator been different than then one in your system, I suggest to use the QueryTable method also applied to the tab separated files with a csv extension.
We just need to replace these lines:
sFile = Dir$(sPathSrc & "*.csv")
sFilenameTrg = sPathTrg & Left(sFile, InStrRev(sFile, ".csv")) & "xlsx"
With these:
sFile = Dir$(sPathSrc & "*.txt")
sFilenameTrg = sPathTrg & Left(sFile, InStrRev(sFile, ".txt")) & "xlsx"
No changes to Procedure `Open_Csv_As_Tab_Delimited_Then_Save_As_Xls, perhaps a change in the name to reflect its versatility.
Tested with this tst file:
Generated this `xlsx' file:
Hopefully, it should be straightforward to add these procedure to you project, let me know of any problem or question you might have with the resources used.
Sub Tab_Delimited_UTF8_Files_Save_As_Xlsx()
Dim sFilenameSrc As String, sFilenameTrg As String
Dim sPathSrc As String, sPathTrg As String
Dim sFile As String
Dim bShts As Byte, exCalc As XlCalculation
sPathSrc = "C:\Users\PC\Desktop\Test\"
sPathTrg = sPathSrc & "xlsx\"
Rem Excel Properties OFF
With Application
.EnableEvents = False
.DisplayAlerts = False
.ScreenUpdating = False
exCalc = .Calculation
.Calculation = xlCalculationManual
.CalculateBeforeSave = False
bShts = .SheetsInNewWorkbook
.SheetsInNewWorkbook = 1
End With
Rem Validate Target Folder
If Len(Dir$(sPathTrg, vbDirectory)) = 0 Then MkDir sPathTrg
Rem Process Csv Files
sFile = Dir$(sPathSrc & "*.txt")
Do Until Len(sFile) = 0
sFilenameSrc = sPathSrc & sFile
sFilenameTrg = sPathTrg & Left(sFile, InStrRev(sFile, ".txt")) & "xlsx"
Call Open_Csv_As_Tab_Delimited_Then_Save_As_Xls(sFilenameSrc, sFilenameTrg)
sFile = Dir$
Loop
Rem Excel Properties OFF
With Application
.SheetsInNewWorkbook = bShts
.Calculation = exCalc
.CalculateBeforeSave = True
.ScreenUpdating = True
.DisplayAlerts = True
.EnableEvents = True
End With
End Sub
…
Sub Open_Txt_As_Tab_Delimited_Then_Save_As_Xls(sFilenameSrc As String, sFilenameTrg As String)
Dim Wbk As Workbook
Rem Workbook - Add
Set Wbk = Workbooks.Add(Template:="Workbook")
With Wbk
Rem Txt File - Import
With .Worksheets(1)
Rem QueryTable - Add
With .QueryTables.Add(Connection:="TEXT;" & sFilenameSrc, Destination:=.Cells(1))
Rem QueryTable - Properties
.SaveData = True
.TextFileParseType = xlDelimited
.TextFileDecimalSeparator = "."
.TextFileThousandsSeparator = ","
.TextFileConsecutiveDelimiter = False
.TextFileTabDelimiter = True
.TextFileTrailingMinusNumbers = True
.TextFilePlatform = 65001 'Unicode (UTF-8)
.Refresh BackgroundQuery:=False
Rem QueryTable - Delete
.Delete
End With: End With
Rem Workbook - Save & Close
.SaveAs Filename:=sFilenameTrg, FileFormat:=xlOpenXMLWorkbook
.Close
End With
End Sub
VBA code not looping through the folder of .csv's
The code below is doing the function I need but is not looping and it would be good to add a line to delete the .csv's once copied
Option Explicit
Private Sub SaveAs_Files_in_Folder()
Dim CSVfolder As String, XLSfolder As String
Dim CSVfilename As String, XLSfilename As String
Dim template As String
Dim wb As Workbook
Dim wbm As Workbook 'The template I want the data pasted into
Dim n As Long
CSVfolder = "H:\Case Extracts\input" 'Folder I have the csv's go
XLSfolder = "H:\Case Extracts\output" 'Folder for the xlsx output
If Right(CSVfolder, 1) <> "\" Then CSVfolder = CSVfolder & "\"
If Right(XLSfolder, 1) <> "\" Then XLSfolder = XLSfolder & "\"
n = 0
CSVfilename = Dir(CSVfolder & "*.csv", vbNormal)
template = Dir("H:\Case Extracts\template.xlsx", vbNormal)
While Len(CSVfilename) <> 0
n = n + 1
Set wb = Workbooks.Open(CSVfolder & CSVfilename)
Range("A1:M400").Select
Selection.Copy
Set wbm = Workbooks.Open(template, , , , "Password") 'The template has a password
With wbm
Worksheets("Sheet2").Activate
Sheets("Sheet2").Cells.Select
Range("A1:M400").PasteSpecial
Worksheets("Sheet1").Activate
Sheets("Sheet1").Range("A1").Select
wbm.SaveAs Filename:=XLSfolder & CSVfilename & ".xlsx", FileFormat:=xlOpenXMLWorkbook
wbm.Close
End With
With wb
.Close False
End With
CSVfilename = Dir()
Wend
End Sub
The code works for the first .csv file I just can't get the loop to keep going through the files. It would also be good to add a line to delete the .csv's once they have been copied
Work with objects. You may want to see How to avoid using Select in Excel VBA. Declare objects for both the csv and template and work with them.
Your DIR is not working because of template = Dir("H:\Case Extracts\template.xlsx", vbNormal) which is right after CSVfilename = Dir(CSVfolder & "*.csv", vbNormal). It is getting reset. Reverse the position as shown below. Move it before the loop as #AhmedAU mentioned.
Copy the range only when you are ready to paste. Excel has an uncanny habit of clearing the clipboard. For example, I am pasting right after I cam copying the range.
Is this what you are trying? (Untested)
Option Explicit
Private Sub SaveAs_Files_in_Folder()
Dim CSVfolder As String, XLSfolder As String
Dim CSVfilename As String, XLSfilename As String
Dim wbTemplate As Workbook, wbCsv As Workbook
Dim wsTemplate As Worksheet, wsCsv As Worksheet
CSVfolder = "H:\Case Extracts\input" '<~~ Csv Folder
XLSfolder = "H:\Case Extracts\output" '<~~ For xlsx output
If Right(CSVfolder, 1) <> "\" Then CSVfolder = CSVfolder & "\"
If Right(XLSfolder, 1) <> "\" Then XLSfolder = XLSfolder & "\"
XLSfilename = Dir("H:\Case Extracts\template.xlsx", vbNormal)
CSVfilename = Dir(CSVfolder & "*.csv")
Do While Len(CSVfilename) > 0
'~~> Open Csv File
Set wbCsv = Workbooks.Open(CSVfolder & CSVfilename)
Set wsCsv = wbCsv.Sheets(1)
'~~> Open Template file
Set wbTemplate = Workbooks.Open(XLSfolder & XLSfilename, , , , "Password")
'~~> Change this to relevant sheet
Set wsTemplate = wbTemplate.Sheets("Sheet1")
'~~> Copy and paste
wsCsv.Range("A1:M400").Copy
wsTemplate.Range("A1").PasteSpecial xlPasteValues
'~~> Save file
wbTemplate.SaveAs Filename:=XLSfolder & CSVfilename & ".xlsx", _
FileFormat:=xlOpenXMLWorkbook
'~~> Close files
wbTemplate.Close (False)
wbCsv.Close (False)
'~~> Get next file
CSVfilename = Dir
Loop
'~~> Clear clipboard
Application.CutCopyMode = False
End Sub
I think must be something like this, adapted to very fast looping through huge of csvs files
reference “Microsoft Scripting Runtime” (Add using
Tools->References from the VB menu)
Sub SaveAs_Files_in_Folder()
Dim myDict As Dictionary, wb As Workbook, eachLineArr As Variant
Set myDict = CreateObject("Scripting.Dictionary")
CSVfolder = "H:\Case Extracts\input\"
XLSfolder = "H:\Case Extracts\output\"
Template = ThisWorkbook.path & "\template.xlsx"
fileMask = "*.csv"
csvSeparator = ";"
csvLineBreaks = vbLf ' or vbCrLf
With Application
.ScreenUpdating = False
.DisplayAlerts = False
.EnableEvents = False
.Calculation = xlManual
'.Visible = False ' uncomment to hide templates flashing
End With
LookupName = CSVfolder & fileMask
Results = CreateObject("WScript.Shell").Exec("CMD /C DIR """ & LookupName & Chr(34) & " /S /B /A:-D").StdOut.ReadAll
filesList = Split(Results, vbCrLf)
For fileNr = LBound(filesList) To UBound(filesList) - 1
csvLinesArr = Split(GetCsvFData(filesList(fileNr)), csvLineBreaks) ' read each csv to array
ArrSize = UBound(Split(csvLinesArr(lineNr), csvSeparator))
For lineNr = LBound(csvLinesArr) To UBound(csvLinesArr)
If csvLinesArr(lineNr) <> "" Then
eachLineArr = Split(csvLinesArr(lineNr), csvSeparator) ' read each line to array
ReDim Preserve eachLineArr(ArrSize) ' to set first line columns count to whoole array size
myDict.Add Dir(filesList(fileNr)) & lineNr, eachLineArr ' put all lines into dictionary object
End If
Next lineNr
Set wb = Workbooks.Open(Template, , , , "Password")
wb.Worksheets("Sheet1").[a1].Resize(myDict.Count, ArrSize) = TransposeArrays1D(myDict.Items)
Set fso = CreateObject("Scripting.FileSystemObject")
csvName = fso.GetBaseName(filesList(fileNr))
Set fso = nothing
wb.SaveAs FileName:=XLSfolder & csvName & ".xlsx"
wb.Close
Set wb = Nothing
Next fileNr
With Application
.ScreenUpdating = True
.DisplayAlerts = True
.EnableEvents = True
.Calculation = xlManual
.Visible = True
End With
End Sub
Function GetCsvFData(ByVal filePath As String) As Variant
Dim MyData As String, strData() As String
Open filePath For Binary As #1
MyData = Space$(LOF(1))
Get #1, , MyData
Close #1
GetCsvFData = MyData
End Function
Function TransposeArrays1D(ByVal arr As Variant) As Variant
Dim tempArray As Variant
ReDim tempArray(LBound(arr, 1) To UBound(arr, 1), LBound(arr(0)) To UBound(arr(0)))
For y = LBound(arr, 1) To UBound(arr, 1)
For x = LBound(arr(0)) To UBound(arr(0))
tempArray(y, x) = arr(y)(x)
Next x
Next y
TransposeArrays1D = tempArray
End Function
I have to import a number of text files into excel and add each text file to a new sheet. The number of lines on some files are in excess of 350,000. Loops take so long that it's not really user friendly. I've tried using this to read the data in quickly
Dim arrLines() As String
Dim lineValue As String
lineValue = ts.ReadAll
DoEvents
arrLines() = Split(lineValue, vbCrLf)
Dim Destination As Range
Set Destination = Worksheets(WorksheetName).Range("A2")
Set Destination = Destination.Resize(UBound(arrLines), 1)
Destination.Value = Application.Transpose(arrLines)
but this results in every value AFTER line 41243 simply having a value of "#N/A". I was thinking to use a Application.Index to split up the array into smaller arrays, but you need to give the index function an array of lines that you want to compose the new array, and that would mean creating a loop to run through the numbers 1-41000, then 41001-82000, etc. At the point i'm doing a loop to create the arrays it's not really faster. looping through the file line by line is similarly too slow. What's a good way of reading in a such a large number of lines without ending up with the missing values?
You could use and automate the 'Data' -> 'From Text/CSV' wizard of Excel.
Using the Macro recorder you end up with this, which should be a good start:
ActiveWorkbook.Queries.Add Name:="MyFile", Formula:="let" & Chr(13) & "" & Chr(10) & " Source = Table.FromColumns({Lines.FromBinary(File.Contents(""C:\Path\MyFile.txt""), null, null, 1252)})" & Chr(13) & "" & Chr(10) & "in" & Chr(13) & "" & Chr(10) & " Source"
ActiveWorkbook.Worksheets.Add
With ActiveSheet.ListObjects.Add(SourceType:=0, Source:="OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=""MyFile"";Extended Properties=""""", Destination:=Range("$A$1")).QueryTable
.CommandType = xlCmdSql
.CommandText = Array("SELECT * FROM [MyFile]")
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.PreserveColumnInfo = True
.ListObject.DisplayName = "MyFile"
.Refresh BackgroundQuery:=False
End With
Copy Text Files to Excel
Credits to simple-solution for suggesting (in the comments) to open the text files with Workbooks.Open.
The Code
Sub CopyTextFilesToExcel()
' Search Folder Path
Const cStrPath As String _
= "D:\Excel\MyDocuments\StackOverflow\"
Const cStrExt As String = "*.txt" ' File Extension
Const cFolderPicker As Boolean = False ' True to enable FolderPicker
Dim wb As Workbook ' Current File
Dim strPath As String ' Path of Search Folder (Incl. "\" at the end.)
Dim strFileName As String ' Current File Name
With Application
.ScreenUpdating = False
.DisplayAlerts = False
End With
On Error GoTo ProcedureExit
' Determine Search Path ("\" Issue)
If cFolderPicker Then
With Application.FileDialog(msoFileDialogFolderPicker)
If .Show = False Then Exit Sub
strPath = .SelectedItems(1) & "\"
End With
Else
If Right(cStrPath, 1) <> "\" Then
strPath = cStrPath & "\"
Else
strPath = cStrPath
End If
End If
' Determine first Current File Name.
strFileName = Dir(strPath & cStrExt)
With ThisWorkbook ' Target Workbook
' Loop through files in folder.
Do While strFileName <> ""
' Create a reference to the Current File.
Set wb = Workbooks.Open(cStrPath & strFileName)
' Copy first worksheet of Current File after the last sheet
' (.Sheets.Count) in Target Workbook.
wb.Worksheets(1).Copy After:=.Worksheets(.Sheets.Count)
' Close Current File without saving changes (False).
wb.Close False
' Find next File(name).
strFileName = Dir()
Loop
End With
MsgBox "All files copied!"
ProcedureExit:
With Application
.ScreenUpdating = True
.DisplayAlerts = True
End With
End Sub
Mathieu Guindon had EXACTLY the solution I was hoping for. Eliminating the transpose has solved the issue with the #N/A values. Thank you!
Edit:
The code just loops the arrayed data a second time into a two dimensional array and then posts it to the range without the transpose effect. It's a little slower than the old way (taking about two minutes or so longer), but it's still pretty fast and produces the results I want. Code is as follows:
lineValue = ts.ReadAll
DoEvents
arrLines() = Split(lineValue, vbCrLf)
Dim arrBetween() As Variant
ReDim arrBetween(UBound(arrLines), 0)
LoopLength = UBound(arrLines) - 1
For i = 0 To LoopLength
arrBetween(i, 0) = arrLines(i)
DoEvents
If i Mod 2500 = 0 Or i = LoopLength Then
Application.StatusBar = "Importing " & WorksheetName & " " & (i) & " ."
End If
Next i
Dim Destination As Range
Set Destination = Worksheets(WorksheetName).Range("A2:A" & UBound(arrLines))
Destination.Value = arrBetween
I'm using the following code to import all CSV files from D:\Report into Excel with each file on a new Sheet with the name of the file as the sheet name.
I'm looking to include some error control to allow the code to be run a second time if a file was not in the Report directory. The current problem is that the code will run again but bombs out as you cannot have the same name for two sheets and I dont want the same files imported again.
Sub ImportAllReportData()
'
' Import All Report Data
' All files in D:\Report will be imported and added to seperate sheets using the file names in UPPERCASE
'
Dim strPath As String
Dim strFile As String
'
strPath = "D:\New\"
strFile = Dir(strPath & "*.csv")
Do While strFile <> ""
With ActiveWorkbook.Worksheets.Add
With .QueryTables.Add(Connection:="TEXT;" & strPath & strFile, _
Destination:=.Range("A1"))
.Parent.Name = Replace(UCase(strFile), ".CSV", "")
.TextFileParseType = xlDelimited
.TextFileTextQualifier = xlTextQualifierDoubleQuote
.TextFileConsecutiveDelimiter = False
.TextFileTabDelimiter = False
.TextFileSemicolonDelimiter = False
.TextFileCommaDelimiter = True
.TextFileSpaceDelimiter = False
.TextFileColumnDataTypes = Array(1)
.TextFileTrailingMinusNumbers = True
.Refresh BackgroundQuery:=False
End With
End With
strFile = Dir
Loop
End Sub
Any help would be greatly appreciated
Use the following function to test if a WS already exists:
Function SheetExists(strShtName As String) As Boolean
Dim ws As Worksheet
SheetExists = False 'initialise
On Error Resume Next
Set ws = Sheets(strShtName)
If Not ws Is Nothing Then SheetExists = True
Set ws = Nothing 'release memory
On Error GoTo 0
End Function
Use it in your code like this:
....
strPath = "D:\New\"
strFile = Dir(strPath & "*.csv")
Do While strFile <> ""
If Not SheetExists(Replace(UCase(strFile), ".CSV", "")) Then
With ActiveWorkbook.Worksheets.Add
With .QueryTables.Add(Connection:="TEXT;" & strPath & strFile, _
.....
End If