I have pulled together some code which works, but I must confess to not understanding the nitty gritty of what its doing and I seem to open two lots of connections, which seems slow and messy, one to get the sheetname which I think I need for the SQL call, and the call itself.
I use it as function but have stripped out to a Sub to try and improve it. It gets used a lot pulling in utility data from up to 700 separate files as part of a process, and running across multiple clients. So if it can be streamlined it will cut time down massively.
The file format varies depending on the task:
Alarm Data 51 Columns Wide and either 7 lines or 700 sites *7 lines
Meter Data 50 or 99 Columns Wide with a blank column at 51, 15 lines, or up to 700 * 15 lines
I can't control the file formats/lengths and don't know the sheetname as it can vary by source
Any help tidying it up is massively appreciated.
FootSore
Edit: The files will only ever have one Sheet in them, but name unknown. I only need that sheet.
Function ReadExcelFile(ByRef InputFileArray() As Variant, InputFileName As String, InputFileLocation As String, HeaderYesNo As String)
'Reads Excel File and returns InputFileArray
Dim ReadFileArray() As Variant
Dim connectionString As String
Dim sql As String
Set FSO = CreateObject("scripting.filesystemobject")
If FSO.FileExists(InputFileLocation & InputFileName) = True Then
connectionString = _
"Provider=Microsoft.ACE.OLEDB.12.0;" & _
"Data Source=""" & InputFileLocation & InputFileName & """;" & _
"Extended Properties=""Excel 12.0;HDR=" & HeaderYesNo & ";IMEX=1"""
'This assumes the Excel file contains column headers -- HDR=Yes
'Routine to get unknown sheet name
Set conn = CreateObject("ADODB.Connection")
conn.connectionString = "Provider=Microsoft.ACE.OLEDB.12.0;" & _
"Data Source=""" & InputFileLocation & InputFileName & """;" & _
"Extended Properties=""Excel 12.0;HDR=Yes"""
conn.Open
Set bs = conn.OpenSchema(20) ' 20 = adSchemaTables
Do Until bs.EOF = True
'Debug.Print bs.Fields!Table_Name.Value
SheetName = bs.Fields!Table_Name.Value
bs.MoveNext
Loop
bs.Close: conn.Close
Set bs = Nothing
Set conn = Nothing
'Get the contents of the Excel via SQL saves opening file
sql = "SELECT * FROM [" + SheetName + "]" '
'Go to the VBE's Tools, References then locate and put a check beside 'Microsoft ActiveX Data Objects 6.1 Library' to include the library in your project.
Dim rs As New ADODB.Recordset
rs.Open sql, connectionString
ReadFileArray() = rs.GetRows 'Puts the data from the recordset into an array
rs.Close
Set rs = Nothing
'Debugging Tool
'Dim row As Variant, column As Variant
'For row = 0 To UBound(TotalFileArray, 2)
' For column = 0 To UBound(InputFileArray, 1)
' Debug.Print InputFileArray(column, row)
' Next
'Next
'Limitations mean the columns and rows are read in wrong order.
'Public Sub to transpose array
TransposeArray ReadFileArray, InputFileArray
Erase ReadFileArray
Else
End If
End Function
You can re-use the one connection and recordset. Note if your input file has multiple sheets and/or named ranges then this just picks the first one listed.
Also - you're not getting field headers in the returned array.
Sub Tester()
Dim arr
arr = ReadExcelFile("LookupTable.xlsx", "C:\Temp\", True)
If Not IsEmpty(arr) Then 'read any data?
Sheet1.Range("A1").Resize(UBound(arr, 1), UBound(arr, 2)).Value = arr
End If
End Sub
Function ReadExcelFile(InputFileName As String, InputFileLocation As String, _
HeaderYesNo As String) As Variant
Dim arr As Variant, SheetName As String
Dim sql As String, conn As Object, rs As Object
'ideally you do this check *before* calling the function though...
If Dir(InputFileLocation & InputFileName, vbNormal) = "" Then
MsgBox "File not found!"
Exit Function
End If
Set conn = CreateObject("ADODB.Connection")
conn.Open "Provider=Microsoft.ACE.OLEDB.12.0;" & _
"Data Source=""" & InputFileLocation & InputFileName & """;" & _
"Extended Properties=""Excel 12.0;HDR=" & HeaderYesNo & ";IMEX=1"""
Set rs = conn.OpenSchema(20) ' 20 = adSchemaTables, NOTE: also reads named ranges...
If Not rs.EOF Then SheetName = rs.Fields("Table_Name").Value 'Always only one sheet?
rs.Close
If Len(SheetName) > 0 Then 'got a sheet?
rs.Open "SELECT * FROM [" + SheetName + "]", conn 're-use connection
If Not rs.EOF Then ReadExcelFile = TransposeArray(rs.GetRows())
End If
End Function
Function TransposeArray(arr)
Dim arrout(), r As Long, c As Long
ReDim arrout(LBound(arr, 2) To UBound(arr, 2), LBound(arr, 1) To UBound(arr, 1))
For r = LBound(arr, 1) To UBound(arr, 1)
For c = LBound(arr, 2) To UBound(arr, 2)
arrout(c, r) = arr(r, c)
Next c
Next r
TransposeArray = arrout
End Function
Related
The routine below reads in a csv file using ADO. The csv file I am reading has 139,000 lines of data, with 136 columns. The routine is not working as intended. One of the columns has zero values for all rows except for 500 or so rows, where it takes a decimal value of, say, 0.05 or 0.03 etc. Because this method uses ADO, it determines the data type of the field using a setting in the registry, TypeGuessRows, where, based on a pre-specfied number of rows, it makes a guess at the data type of that column. So, for the column in the example, it is, I think, assuming an integer data type because the first couple of hundred values are all zero. The few values that are decimal and non-zero are then forced to fit the assumed data type and therefore also become zero. I cannot change the value of TypeGuessRows because, in the company I work for, I do not have permissions to change the registry. Of the 136 columns, there are many other columns with a similar problem.
Is there a way around this? I have seen a suggestion that I could use a dummy first row with the value that will imply the desired data type, but this is an overhead I would rather not incur.
Or do I simply need to use a method of data import that does not use ADO?
Sub GetDataTextFile1(strFilePath As String, strSheet As String, strRange As String, strField As String, strValue As String)
Dim strFolder As String, strFile As String, strSQL As String
Dim objConnection As ADODB.Connection
Dim objRecordSet As ADODB.Recordset
'If an error occurs then handle it
'On Error GoTo ErrorTrap
'Get the name of the file and the folder
strFile = Right(strFilePath, Len(strFilePath) - InStrRev(strFilePath, "\"))
strFolder = Left(strFilePath, Len(strFilePath) - Len(strFile) - 1)
Set objConnection = New ADODB.Connection
Set objRecordSet = New ADODB.Recordset
'Open Connection
objConnection.Open "Provider=Microsoft.ACE.OLEDB.12.0;" _
& "Data Source=" & strFolder & ";" _
& "Extended Properties=""text;HDR=YES;FMT=Delimited"""
'Generate SQL code to extract data from the file
If strField <> "" And strValue <> "" Then
strSQL = "SELECT * FROM [" & strFile & "] WHERE CSTR([" & strField & "]) IN ('" & strValue & "');"
Else
strSQL = "SELECT * FROM [" & strFile & "];"
End If
'Execute the SQL code
Set objRecordSet = objConnection.Execute(strSQL)
'Copy the data in to the relevant range in the spreadsheet
ThisWorkbook.Sheets(strSheet).Range(strRange).CopyFromRecordset objRecordSet
'Close the recordset and the connection to the database
objRecordSet.Close
objConnection.Close
Set objRecordSet = Nothing
Set objConnection = Nothing
ExitPoint:
Exit Sub
ErrorTrap:
Call ErrorHandler(Err.Number, Err.Description, "GetDataTextFile1")
End Sub
Replace the line
ThisWorkbook.Sheets(strSheet).Range(strRange).CopyFromRecordset objRecordSet
With the following
Dim r as range
Dim f as field
dim x as long
Set r = ThisWorkbook.Sheets(strSheet).Range(strRange)
Do while not objrecordset.eof
x = 0
For each f in objrecordset.fields
r.offset(0,x) = objrecordset(x)
x = x +1
next f
objrecordset.movenext
set r = r.offset(1,0)
loop
This will bring the data in bit by bit. If that isn't sufficient to avoid it guessing the datatype you can add a select case f.name routine to force the datatype of certain fields
You can import your CSV, and even multiple CSV files, using the script below.
Sub ReadFilesIntoActiveSheet()
Dim fso As FileSystemObject
Dim folder As folder
Dim file As file
Dim FileText As TextStream
Dim i As Long
Dim cl As Range
Set fso = New FileSystemObject
Set folder = fso.GetFolder("C:\Users\Excel\Desktop\test\")
Set cl = ActiveSheet.Cells(1, 1)
Application.ScreenUpdating = False
For Each file In folder.Files
Set FileText = file.OpenAsTextStream(ForReading)
cl.Value = file.Name
i = 1
Do While Not FileText.AtEndOfStream
cl.Offset(i, 0).Value = FileText.ReadLine
i = i + 1
Loop
FileText.Close
Set cl = cl.Offset(0, 1)
Next file
Application.ScreenUpdating = True
Set FileText = Nothing
Set file = Nothing
Set folder = Nothing
Set fso = Nothing
End Sub
The current codes I am working on requires me to rearrange the columns in VBA. It has to arranged according to the header, and the headers are "V-d(1)", "V-g(1)", "I-d(1)", "I-g(1)", and this set repeats for numbers 2, 3, etc etc. (e.g V-d(2), I-g(4)). These data are usually jumbled up and I have to arrange them in ascending numbers.
It does not matter if V-g, V-d, I-d or I-g comes first.
Dim num, numadj As Integer
Dim colu, coladj
Range("A1").Select
Do While Range("A1").Offset(0, i - 1).Value <> ""
colu = ActiveCell.Value
coladj = ActiveCell.Offset(0, 1).Value
num = Left(Right(colu.Text, 2), 1)
numadj = Left(Right(coladj.Text, 2), 1)
If num > numadj Then
colu.EntireColumn.Cut Destination:=Columns("Z:Z")
coladj.EntireColumn.Cut Destination:=colu
Columns("Z:Z").Select.Cut Destination:=coladj
i = i + 1
Else
i = i + 1
End If
Loop
I am very new to VBA so please forgive me for any dumb codes that I have created!!! Thank you in advance everyone!
Consider an SQL and RegEx solution to select columns in a specified arrangement. SQL works in Excel for PC which can access Windows' Jet/ACE SQL Engine to query its own workbook like a database table.
Due to the variable nature of sets ranging 3-10, consider finding the highest number set by extracting the numbers from column headers with RegEx using the defined function, FindHighestNumberSet. Then have RunSQL subroutine call the function to build SQL string dynamically.
Below assumes you have data currently in a tab named DATA with an empty tab named RESULTS which will output query results. Two ADO connection strings are available.
Function (iterating across column headers to extract highest number)
Function FindHighestNumberSet() As Integer
Dim lastcol As Integer, i As Integer
Dim num As Integer: num = 0
Dim regEx As Object
' CONFIGURE REGEX OBJECT
Set regEx = CreateObject("VBScript.RegExp")
With regEx
.Global = True
.MultiLine = True
.IgnoreCase = False
.Pattern = "[^0-9]"
End With
With Worksheets("DATA")
lastcol = .Cells(7, .Columns.Count).End(xlToLeft).Column
For i = 1 To lastcol
' EXTRACT NUMBERS FROM COLUMN HEADERS
num = Application.WorksheetFunction.Max(num, CInt(regEx.Replace(.Cells(1, i), "")))
Next i
End With
FindHighestNumberSet = num
End Function
Macro (main module looping through result of above function)
Sub RunSQL()
On Error GoTo ErrHandle
Dim conn As Object, rst As Object
Dim strConnection As String, strSQL As String
Dim i As Integer
Set conn = CreateObject("ADODB.Connection")
Set rst = CreateObject("ADODB.Recordset")
' DRIVER AND PROVIDER CONNECTION STRINGS
' strConnection = "DRIVER={Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)};" _
' & "DBQ=" & Activeworkbook.FullName & ";"
strConnection = "Provider=Microsoft.ACE.OLEDB.12.0;" _
& "Data Source='" & ActiveWorkbook.FullName & "';" _
& "Extended Properties=""Excel 8.0;HDR=YES;"";"
' FIRST THREE SETS
strSQL = " SELECT t.[V-d(1)], t.[I-d(1)], t.[I-g(1)]," _
& " t.[V-d(2)], t.[I-d(2)], t.[I-g(2)]," _
& " t.[V-d(3)], t.[I-d(3)], t.[I-g(3)]"
' VARIABLE 4+ SETS
For i = 4 To FindHighestNumberSet
strSQL = strSQL & ", t.[V-d(" & i & ")], t.[I-d(" & i & ")], t.[I-g(" & i & ")]"
Next i
' FROM CLAUSE
strSQL = strSQL & " FROM [DATA$] t"
' OPEN DB CONNECTION
conn.Open strConnection
rst.Open strSQL, conn
' COLUMN HEADERS
For i = 1 To rst.Fields.Count
Worksheets("RESULTS").Cells(1, i) = rst.Fields(i - 1).Name
Next i
' DATA ROWS
Worksheets("RESULTS").Range("A2").CopyFromRecordset rst
rst.Close: conn.Close
Set rst = Nothing: Set conn = Nothing
MsgBox "Successfully ran SQL query!", vbInformation
Exit Sub
ErrHandle:
Set rst = Nothing: Set conn = Nothing
MsgBox Err.Number & " = " & Err.Description, vbCritical
Exit Sub
End Sub
You can sort vertically by a helper row with something like this (tested):
Sub test() ': Cells.Delete: [b2:d8] = Split("V-d(10) V-d(2) V-d(1)") ' used for testing
Dim r As Range: Set r = ThisWorkbook.Worksheets("Sheet1").UsedRange ' specify the range to be sorted here
r.Rows(2).Insert xlShiftDown ' insert helper row to sort by. (used 2nd row instead 1st so that it is auto included in the range)
r.Rows(2).FormulaR1C1 = "=-RIGHT(R[-1]C,LEN(R[-1]C)-3)" ' to get the numbers from the column header cells above, so adjust if needed
r.Sort r.Rows(2) ' sort vertically by the helper row
r.Rows(2).Delete xlShiftUp ' delete the temp row
End Sub
I have many excel files in many folders and I need to delete the rows from all files where in column for ex. B are words from array:
For ex. my bad words list:
the sun, tree, big car, cup, ....
If A2 column is 'The Sun is the star at the center of the Solar System.' - this row has been deleted.
If in column is 'thesunis the..' - this row has been deleted. But is bad!
And my questions:
How to delete rows with exact words of array element?
How to count array elements?
How to escape single quote in array element (example in code below)
How to open all files from folder "C://folder" and after run code save all?
Here is my code:
Sub code()
Dim MyValue As String
Dim a As Integer
'------------------------------------------------------
ArrayValueToRemove = Array("the sun", "code 'in", "another")
Range("B:B").Select
'------------------------------------------------------
For Each cell In Selection
MyValue = CStr(cell.Value)
For a = 0 To 2
If InStr(1, LCase(MyValue), LCase(ArrayValueToRemove(a))) > 0 Then
cell.EntireRow.Delete
Exit For
End If
Next
Next cell
End Sub
Sub deleteBadWordRows()
Dim currentFile, currentSheet, badWords As Variant, lastRow, i As Integer, baseDirectory As String
'------------------------------------------------------
baseDirectory = "c:\folder\"
badWords = Array("the sun", "code 'in", "another")
'------------------------------------------------------
currentFile = Dir(baseDirectory)
While (currentFile <> "")
Workbooks.Open baseDirectory + currentFile
For Each currentSheet In Workbooks(currentFile).Worksheets
lastRow = currentSheet.Cells(currentSheet.Rows.Count, "B").End(xlUp).Row
For j = 1 To lastRow
For i = 0 To UBound(badWords)
If InStr(1, LCase(CStr(currentSheet.Cells(j, "B").Value)), LCase(badWords(i))) > 0 Then
currentSheet.Rows(j).Delete
j = j - 1
lastRow = lastRow - 1
Exit For
End If
Next
Next
Next
Workbooks(currentFile).Save
Workbooks(currentFile).Close
currentFile = Dir
Wend
End Sub
Consider an SQL solution to query your string searches using the LIKE operator with wildcard, %. Excel for PC can connect to the Jet/ACE SQL Engine (Window .dll files) and query workbooks. Here you avoid the nested looping except for iterating through workbooks.
Below assumes all worksheets are tabular in structure with column headers all beginning at A1. Query results are dumped to a new worksheet where you can delete current worksheet afterwards. Be sure to replace placeholders with actual names, CurrentWorksheet, ColumnA, NewWorksheet:
Sub DeleteSQL()
Dim conn As Object, rst As Object
Dim strConnection As String, strSQL As String
Dim i As Integer
Dim wb As Workbook
Dim dirpath As String: dirpath = "C:\\Folder"
Dim xlfile As Variant: xlfile = Dir(dirpath & "\*.xls*")
Do While (xlfile <> "")
Set wb = Workbooks.Open(dirpath & "\" & xlfile)
Set conn = CreateObject("ADODB.Connection")
Set rst = CreateObject("ADODB.Recordset")
' WORKBOOK CONNECTION
strConnection = "Provider=Microsoft.ACE.OLEDB.12.0;" _
& "Data Source='" & dirpath & "\" & xlfile & "';" _
& "Extended Properties=""Excel 8.0;HDR=YES;"";"
' OPEN DB CONNECTION
conn.Open strConnection
' OPEN RECORDSET
strSQL = " SELECT * FROM [CurrentWorksheet$]" _
& " WHERE [ColumnA] LIKE ""%the sun%"" OR [ColumnA]" _
& " LIKE ""%code 'in%"" OR [ColumnA] LIKE ""%another%"""
rst.Open strSQL, conn
wb.Sheets.Add(After:=wb.Sheets(wb.Sheets.Count).Name = "NewWorkSheet"
' RESULTSET COLUMNS
For i = 1 To rst.Fields.Count
wb.Worksheets("NewWorkSheet").Cells(1, i) = rst.Fields(i - 1).Name
Next i
' RESULTSET DATA ROWS
wb.Worksheets("NewWorkSheet").Range("A2").CopyFromRecordset rst
wb.Close True
rst.Close: conn.Close
Set rst = Nothing: Set conn = Nothing
xlfile = Dir
Loop
End Sub
I am trying to use ACE sql to return array from recordset. The function works well with table ranges.
Problem: the function returns the right number of records if the query returns 2 or more records. However if only one record is found, all rows are filled repetitively with this one row. This is wrong but I cannot find the reason why.
In addition, I wish my function would return column names from Recordset. I have no idea how to glue it together with the array returned from recordset.
Here is the code, credits are due to the author of another solution that I am trying to adapt to my needs: Performing SQL queries on an Excel Table within a Workbook with VBA Macro
Function SQL(dataRange As Range, CritA As String) As Variant
Application.Volatile
Dim cn As ADODB.Connection
Dim rs As ADODB.Recordset
Dim currAddress As String
currAddress = ActiveSheet.Name & "$" & dataRange.Address(False, False)
strFile = ThisWorkbook.FullName
strCon = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & strFile _
& ";Extended Properties=""Excel 12.0;HDR=Yes;IMEX=1"";"
Set cn = CreateObject("ADODB.Connection")
Set rs = CreateObject("ADODB.Recordset")
cn.Open strCon
strSQL = "SELECT * FROM [" & currAddress & "]" & _
"WHERE [A] = '" & CritA & "' " & _
"ORDER BY 1 ASC"
rs.Open strSQL, cn
'SQL = rs.GetString
SQL = Application.Transpose(rs.GetRows)
Set rs = Nothing
Set cn = Nothing
End Function
If you want the field names, you will have to loop through the array from the recordset too:
With rs
If Not .EOF Then
vData = .getrows()
ReDim vDataOut(LBound(vData, 2) To UBound(vData, 2) + 1, LBound(vData, 1) To UBound(vData, 1))
' get headers
For i = 1 To .Fields.Count
vDataOut(0, i - 1) = .Fields(i - 1).Name
Next i
' Copy data
For x = LBound(vData, 2) To UBound(vData, 2)
For y = LBound(vData, 1) To UBound(vData, 1)
vDataOut(x + 1, y) = vData(y, x)
Next y
Next x
End If
End With
for example.
I have an Excel worksheet with around 100 cols. Does anyone know of an easy way to write the contents of each column to a csv or txt file?
I don't have Excel in front of me, but I think this code is approximately what you need, give or take some syntax errors. It should write each column into a separate file, with each cell on a different row. It will work for arbitrary column heights, though the number of columns is in a variable (for now).
dim fso as FileSystemObject
dim ts as TextStream
dim i as Integer
dim myCell as Range
set fso = FileSystemObject
for i = 0 to TotalColumnNumber
' last argument, True, says to create the text file if it doesnt exist, which is
' good for us in this case
Set ts = fso.OpenTextFile("column_" & i, ForWriting, True)
' set mycell to the first cell in the ith column
set myCell = SheetName.cells(1,i)
' continue looping down the column until you reach a blank cell
' writing each cell value as you go
do until mycell.value = ""
ts.writeline mycell.value
set myCell = myCell.offset(1,0)
loop
ts.close
next
set ts = nothing
set fso = nothing
Let me know if that helps or not, I can take another look later if you would like
Perhaps
Dim cn As Object
Dim rs As Object
Dim strFile As String
Dim strCon As String
Dim strSQL As String
Dim i As Integer
''This is not the best way to refer to the workbook
''you want, but it is very conveient for notes
''It is probably best to use the name of the workbook.
strFile = ActiveWorkbook.FullName
''Note that if HDR=No, F1,F2 etc are used for column names,
''if HDR=Yes, the names in the first row of the range
''can be used.
''This is the Jet 4 connection string, you can get more
''here : http://www.connectionstrings.com/excel
strCon = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & strFile _
& ";Extended Properties=""Excel 8.0;HDR=Yes;IMEX=1"";"
''Late binding, so no reference is needed
Set cn = CreateObject("ADODB.Connection")
Set rs = CreateObject("ADODB.Recordset")
cn.Open strCon
''WHERE 1=1 = headers only, note hdr=yes above
strSQL = "SELECT * " _
& "FROM [Sheet1$] " _
& "WHERE 1=1"
''Open the recordset for more processing
''Cursor Type: 3, adOpenStatic
''Lock Type: 3, adLockOptimistic
''Not everything can be done with every cirsor type and
''lock type. See http://www.w3schools.com/ado/met_rs_open.asp
rs.Open strSQL, cn, 3, 3
''Output including nulls. Note that this will fail if the file
''exists.
For i = 0 To rs.Fields.Count - 1
strSQL = "SELECT [" & rs.Fields(i).Name & "] " _
& "INTO [Text;HDR=YES;FMT=Delimited;IMEX=2;DATABASE=C:\Docs\]." _
& rs.Fields(i).Name & ".CSV " _
& "FROM [Sheet1$] "
''To skip nulls and empty cells, add a WHERE statement
''& "WHERE Trim([" & rs.Fields(i).Name & "] & '')<>'' "
cn.Execute strSQL
Next
''Tidy up
rs.Close
Set rs = Nothing
cn.Close
Set cn = Nothing
A very quick line to get you started ...
for i = 1 to 100
open "file" & i & ".txt" as #1
for each c in columns(i).cells
print #1, c.value
next c
close #1
next i