I'm trying to help someone who has to go through 80k rows on excel between two sheets and identify differences and then load the changed records into a database.
The below code works but slows down significantly with bigger data set, at 10k rows it takes 00:02:22 but with 20k it takes 00:10:13, full 80k rows takes under 2 hours which is still a lot faster than someone doing it manually over a day but I hoping someone can tell me what can potentially be impacting the performance with a higher number of records and how I can solve it?
Sub Button1_Click()
'Option Explicit
Application.ScreenUpdating = False
Application.EnableEvents = False
Set Day1_Sheet = ThisWorkbook.Sheets("Day1")
Set Day2_Sheet = ThisWorkbook.Sheets("Day2")
Set VBA_Export = ThisWorkbook.Sheets("VBA_Export")
Dim Day1Code, Day2Code As String
Dim Day1CodeRow As Long, Day2CodeRow As Long, CurrentRow As Long, CurrentColumn As Long, AccountsN As Long, n As Long
Dim LastEmptyColumnResult As Long, LastEmptyRowResult As Long
Dim BolUpdated As Boolean
Dim cTime, eTime As Variant
Day1_Sheet_Rows = Day1_Sheet.Cells(Rows.Count, "B").End(xlUp).Row
Day2_Sheet_Rows = Day2_Sheet.Cells(Rows.Count, "B").End(xlUp).Row
LastEmptyColumnResult = 4
LastEmptyRowResult = 2
BolUpdated = False
VBA_Export.Range("A2:E10000").Clear
cTime = Now()
For Each c In Day1_Sheet.Range("B2:B" & Day1_Sheet_Rows)
BolUpdated = False
Day1Code = c
For Each e In Day2_Sheet.Range("B2:B" & Day2_Sheet_Rows)
If c = e Then
Day2Code = e
Day2CodeRow = e.Row
CurrentRow = c.Row
Exit For
End If
Next e
CurrentColumn = 3
While CurrentColumn <> 17
If Day1_Sheet.Cells(CurrentRow, CurrentColumn).Value = Day2_Sheet.Cells(Day2CodeRow, CurrentColumn).Value Then
Else
If BolUpdated Then
Else
Day2_Sheet.Rows(Day2CodeRow).EntireRow.Copy VBA_Export.Range("A" & LastEmptyRowResult)
LastEmptyRowResult = LastEmptyRowResult + 1
BolUpdated = True
End If
End If
CurrentColumn = CurrentColumn + 1
Wend
Next c
LastLine:
Set Day1_Sheet = Nothing
Set Day2_Sheet = Nothing
eTime = Now()
MsgBox ("Start Time " & cTime & ".End Time " & eTime)
Debug.Print "Elapsed Time " & eTime - cTime
Application.ScreenUpdating = True
Application.EnableEvents = True
End Sub
See below for suggestions for speeding up your process - basically do everything you can using arrays and avoid cell-by-cell access.
Sub Button1_Click()
Const NUM_COLS As Long = 16 'number of columns in your datasets
Dim Day1_Sheet As Worksheet, Day2_Sheet As Worksheet, VBA_Export As Worksheet
Dim data1, data2, destRow As Long, changed As Boolean, rw1 As Long, rw2 As Variant
Dim col As Long, cTime, eTime
Set Day1_Sheet = ThisWorkbook.Sheets("Day1")
Set Day2_Sheet = ThisWorkbook.Sheets("Day2")
Set VBA_Export = ThisWorkbook.Sheets("VBA_Export")
'load both datasets into arrays for faster access
data1 = Day1_Sheet.Range("A1").Resize(Day1_Sheet.Cells(Rows.Count, "B").End(xlUp).Row, NUM_COLS).Value
data2 = Day2_Sheet.Range("A1").Resize(Day2_Sheet.Cells(Rows.Count, "B").End(xlUp).Row, NUM_COLS).Value
VBA_Export.Range("A2:E10000").Clear
destRow = 2
cTime = Now()
GoFast 'turn on speed enhancements
For rw1 = 2 To UBound(data1, 1) 'loop over Day1 data
'try to match on colB - using Match on the worksheet is quite fast
rw2 = Application.Match(data1(rw1, 2), Day2_Sheet.Columns("B"), 0) 'find matching row...
If Not IsError(rw2) Then 'got a match on Day2 ?
changed = False 'reset flag
For col = 3 To NUM_COLS 'loop over columns
If data1(rw1, col) <> data2(rw2, col) Then
changed = True 'flag row as changed
Exit For 'no need to check further
End If
Next col
If changed Then 'Day2 is different?
Day2_Sheet.Rows(rw2).Copy VBA_Export.Cells(destRow, "A")
destRow = destRow + 1 'next paste row
End If
Else
'no Col B match was found. Do something?
End If
Next rw1
GoFast False 'turn off speed enhancements
eTime = Now()
MsgBox ("Start Time " & cTime & ".End Time " & eTime)
Debug.Print "Elapsed Time " & eTime - cTime
End Sub
'maximize code speed by turning off unneeded stuff
'******** must reset !!!!
Sub GoFast(Optional bYesNo As Boolean = True)
With Application
.ScreenUpdating = Not bYesNo
.Calculation = IIf(bYesNo, xlCalculationManual, xlCalculationAutomatic)
End With
End Sub
Related
I have an excel file with a LOT of worksheets, and i wanted to run a macro that would hide a range of rows based on the value at the top of this range.
My macro works, but since i have a ton or worksheets, it is taking forever to run...
Can somebody help me in optimizing it, because i may have done things unorthox-ly...
Sub MasquerPrix()
Dim RowNum As Long
Dim StartRow As Long
Dim ColNum As Long
Columns("D:H").Select
Range("F1").Activate
Selection.EntireColumn.Hidden = False
Columns("E:F").Select
Selection.EntireColumn.Hidden = True
Range("A1").Select
StartRow = 1
RowNum = 1
ColNum = 2
Do While Cells(RowNum, ColNum).Value <> "Prix Total (Public HT)"
If Cells(RowNum, ColNum).Value <> "Prix Total (Public HT)" Then
Rows(RowNum).Resize(12).EntireRow.Hidden = True
Rows(StartRow & ":" & (RowNum)).EntireRow.Hidden = False
End If
RowNum = RowNum + 1
Loop
End Sub
Thanks a million !
Not having the raw data and not understanding the logic of its processing, sketched a version of the procedure using Application.Match() instead of Do ... Loop
Sub MasquerPrix() 'processes ActiveSheet
Const STR_TO_MATCH = "Prix Total (Public HT)", COL_NUM = 2
Dim RowNum As Long, StartRow As Long
Application.ScreenUpdating = False
Columns("D:H").Hidden = False
Columns("E:F").Hidden = True
StartRow = 1
RowNum = Application.Match(STR_TO_MATCH, Columns(COL_NUM), 0)
If IsNumeric(RowNum) Then
Rows(RowNum).Resize(12).EntireRow.Hidden = True
Rows(StartRow & ":" & RowNum).EntireRow.Hidden = False
End If
Application.ScreenUpdating = True
End Sub
I have a macro that threw the below error, and I have a theory why, but am having trouble finding any literature to back it up. Pages I found are typically people posting silly mistakes with incorrect variable types.
I don't think there's anything wrong with the code, I just think the nature of the task takes too long, therefore overloading the temp folder. Per TechWalla (emphasis mine):
The Runtime Error 6 occurs in the Visual Basic program. It is an overflow issue that can occur when the Visual Basic program attempts to store too much data in the temporary folders area. Runtime files help Windows translate a program's language into Windows language so the program runs faster. You can get the Runtime Error 6 message for several reasons. One reason is that you are using a backslash instead of a forward slash in one of your calculations. Other reasons include an overloaded temporary folder, outdated software or a registry error.
(Caveat: I haven't seen this explanation elsewhere and can't vouch for how reliable Techwalla is. I don't know if I'm not searching with the right keywords, but like I said, I haven't found much of anything other than code-specific forum posts.)
Is there a way to determine if this is the case? I outline below why I think this is what's causing the error, which might help, but doesn't change the question. If this is the case, is there a way to find out? And if so, is there a way to prevent it?
(I'll be running it again tonight now that I've used a registry cleaner that found 1GB, though I don't know how much was from Excel. For reference, my C: drive has 180GB free...)
EDIT: Removing code, because I'm asking not asking about that, but whether or not the temporary folder overloading could actually cause this.
EDIT2: After being swayed by the people, I am re-adding the code. And I know, it's not efficient. Thank you for the suggestions though.
EDIT3 (LAST ONE, I SWEAR): Though I realize the description above specifically mentions Visual Basic, which is not VBA, I'm keeping it in as I know Excel uses/creates temporary files, and has memory limits, which is ultimately what I'm curious about.
Sub getCBU()
Dim rowCount As Long, newRow(1 To 17) As Variant, compareRow(1 To 17) As Variant, nextFile As String, s As Long
Dim location As String, lastRow As Long, match As Boolean, startTime As Double, secondsElapsed As String
location = "C:\Users\swallin\Documents\CBU History\"
nextFile = Dir(location & "CBU*")
rowCount = 2
startTime = Timer
Do While nextFile <> ""
Workbooks.Open (location & nextFile)
lastRow = Workbooks(nextFile).Worksheets(1).Cells(Rows.Count, 1).End(xlUp).Row
For s = 18 To lastRow
match = True
For x = 1 To 17
newRow(x) = Workbooks(nextFile).Worksheets(1).Cells(s, x)
Next x
For y = 2 To rowCount
If Val(newRow(11)) = Val(ThisWorkbook.Worksheets(1).Cells(y, 11)) Then
For j = 1 To 17
compareRow(j) = ThisWorkbook.Worksheets(1).Cells(y, j).Value
Next j
For v = 1 To 17
If Val(compareRow(v)) <> Val(newRow(v)) Then
match = False
Exit For
Else
match = True
End If
Next v
If match = True Then
Exit For
End If
Else
match = False
End If
Next y
y = 2
If match = False Then
rowCount = rowCount + 1
For t = 1 To 17
ThisWorkbook.Worksheets(1).Cells(rowCount, t) = newRow(t)
Next t
End If
Next s
s = 18
Workbooks(nextFile).Close
nextFile = Dir()
Loop
secondsElapsed = Format((Timer - startTime) / 86400, "hh:mm:ss")
ThisWorkbook.Worksheets(2).Cells(1, 1) = secondsElapsed
End Sub
This opens a new instance for each file and closes it afterwards. Give it a try (I could not test it). This includes all the suggestions I made in the chat.
Option Explicit
Sub getCBU()
Dim location As String
location = "C:\Users\swallin\Documents\CBU History\"
Dim nextFile As String
nextFile = Dir(location & "CBU*")
Dim rowCount As Long
rowCount = 2
Dim startTime As Double
startTime = Timer
Dim newRow(1 To 17) As Variant, compareRow(1 To 17) As Variant
Dim lastRow As Long, match As Boolean
Dim s As Long, x As Long, y As Long, j As Long, v As Long, t As Long
Dim objExcel As Object, ActWb As Workbook
Do While nextFile <> ""
Set objExcel = CreateObject("Excel.Application") 'new excel instance
Set ActWb = objExcel.Workbooks.Open(Filename:=location & nextFile, ReadOnly:=True)
lastRow = ActWb.Worksheets(1).Cells(Rows.Count, 1).End(xlUp).Row
For s = 18 To lastRow
match = True
For x = 1 To 17
newRow(x) = ActWb.Worksheets(1).Cells(s, x)
Next x
For y = 2 To rowCount
If Val(newRow(11)) = Val(ThisWorkbook.Worksheets(1).Cells(y, 11)) Then
For j = 1 To 17
compareRow(j) = ThisWorkbook.Worksheets(1).Cells(y, j).Value
Next j
For v = 1 To 17
If Val(compareRow(v)) <> Val(newRow(v)) Then
match = False
Exit For
Else
match = True
End If
Next v
If match = True Then
Exit For
End If
Else
match = False
End If
Next y
y = 2
If match = False Then
rowCount = rowCount + 1
For t = 1 To 17
ThisWorkbook.Worksheets(1).Cells(rowCount, t) = newRow(t)
Next t
End If
Next s
s = 18
ActWb.Close SaveChanges:=False
objExcel.Quit 'close excel instance
Set objExcel = Nothing 'free variable
nextFile = Dir()
Loop
Dim secondsElapsed As String
secondsElapsed = Format$((Timer - startTime) / 86400, "hh:mm:ss")
ThisWorkbook.Worksheets(2).Cells(1, 1) = secondsElapsed
End Sub
Not sure about the writing back to the sheet part (i would still allocate the values to an array and write it back all together, but that depends on what you have in the sheet already, plus whatever newRow() does), but can you give this a try and see if there is any improvement in speed?
Sub getCBU()
Dim rowCount As Long, newRow(1 To 17) As Variant, compareRow(1 To 17) As Variant, nextFile As String
Dim location As String, lastRow As Long, match As Boolean, startTime As Double, secondsElapsed As String
Dim arrData, arrOutput()
Dim arrTemp(): ReDim arrOutput(1 To 17, 1 To 1)
Dim R As Long, C As Long
location = "C:\Users\swallin\Documents\CBU History\"
nextFile = Dir(location & "CBU*")
rowCount = 2
startTime = Timer
Do While nextFile <> ""
Workbooks.Open (location & nextFile)
lastRow = Workbooks(nextFile).Worksheets(1).Cells(Rows.Count, 1).End(xlUp).row
With Workbooks(nextFile).Worksheets(1)
arrData = .Range(.Cells(1, 1), .Cells(lastRow, 17))
End With
For s = 18 To lastRow
match = True
For X = 1 To 17
newRow(X) = arrData(s, X)
Next X
For y = 2 To rowCount
If Val(newRow(11)) = Val(arrData(y, 11)) Then
For j = 1 To 17
compareRow(j) = arrData(y, j).Value
Next j
For v = 1 To 17
If Val(compareRow(v)) <> Val(newRow(v)) Then
match = False
Exit For
Else
match = True
End If
Next v
If match = True Then
Exit For
End If
Else
match = False
End If
Next y
y = 2
If match = False Then
rowCount = rowCount + 1
ReDim Preserve arrTemp(1 To 17, 1 To rowCount)
For t = 1 To 17
arrTemp(t, rowCount) = newRow(t)
Next t
End If
Next s
s = 18
Workbooks(nextFile).Close
nextFile = Dir()
Loop
'Transpose the array
ReDim arrOutput(1 To UBound(arrTemp, 2), 1 To UBound(arrTemp))
For C = LBound(arrTemp) To UBound(arrTemp)
For R = LBound(arrTemp, 2) To UBound(arrTemp, 2)
arrOutput(R, C) = arrTemp(C, R)
Next R
Next C
'Allocate back to the spreadsheet
With ThisWorkbook.Worksheets(1)
.Range(.Cells(2, 1), .Cells(UBound(arrOutput) + 1, 17)) = arrOutput
End With
secondsElapsed = Format((Timer - startTime) / 86400, "hh:mm:ss")
ThisWorkbook.Worksheets(2).Cells(1, 1) = secondsElapsed
End Sub
PS: As others suggested, is a good idea to use Option Explicit, and eventually to step through to code and see if everything is working as intended.
As for the Overflow issue... stepping through code would/should resolve that as well eventually. See Overflow (Error 6) for more info.
EDIT: I've added further management to holding the values in an array, and writing back to the spreadsheet.
Here's a revamp of your code that should be quicker and more memory friendly. (updated to be able to handle any number of results).
Sub getCBU()
Dim wb As Workbook
Dim wsDest As Worksheet
Dim wsTime As Worksheet
Dim hUnqVals As Object
Dim hUnqRows As Object
Dim aHeaders() As Variant
Dim aCompare() As Variant
Dim aResults() As Variant
Dim aStartingData() As Variant
Dim sFolder As String
Dim sFile As String
Dim sDelim As String
Dim sTemp As String
Dim lMaxResults As Long
Dim lCompareStartRow As Long
Dim lValCompareCol As Long
Dim ixCompare As Long
Dim ixResult As Long
Dim ixCol As Long
Dim dTimer As Double
dTimer = Timer
Set wb = ThisWorkbook
Set wsDest = wb.Worksheets(1)
Set wsTime = wb.Worksheets(2)
Set hUnqRows = CreateObject("Scripting.Dictionary")
Set hUnqVals = CreateObject("Scripting.Dictionary")
sDelim = "|"
lMaxResults = 100000
lCompareStartRow = 18
lValCompareCol = 11
sFolder = Environ("UserProfile") & "\Documents\CBU History\" 'Be sure to including ending \
sFile = Dir(sFolder & "CBU*.xlsx")
With wsDest.Range("A2:Q" & wsDest.Cells(wsDest.Rows.Count, lValCompareCol).End(xlUp).Row)
If .Row > 1 Then
aHeaders = .Offset(-1).Resize(1).Value
aStartingData = .Value
ReDim aResults(1 To lMaxResults, 1 To .Columns.Count)
For ixResult = 1 To UBound(aStartingData, 1)
For ixCol = 1 To UBound(aStartingData, 2)
sTemp = sTemp & sDelim & aStartingData(ixResult, ixCol)
Next ixCol
If Not hUnqRows.Exists(sTemp) Then hUnqRows.Add sTemp, sTemp
If Not hUnqVals.Exists(aStartingData(ixResult, lValCompareCol)) Then hUnqVals.Add aStartingData(ixResult, lValCompareCol), aStartingData(ixResult, lValCompareCol)
sTemp = vbNullString
Next ixResult
Erase aStartingData
Else
'No data to compare against, so no data can be added, exit macro
MsgBox "No data found in [" & wsDest.Name & "]" & Chr(10) & "Exiting Macro.", , "Error"
Exit Sub
End If
End With
With Application
.Calculation = xlCalculationManual
.EnableEvents = False
.ScreenUpdating = False
End With
ixResult = 0
Do While Len(sFile) > 0
Application.StatusBar = "Processing " & sFile & "..."
With Workbooks.Open(sFolder & sFile, , True).Worksheets(1)
With .Range("A" & lCompareStartRow & ":Q" & .Cells(.Rows.Count, lValCompareCol).End(xlUp).Row)
If .Row >= lCompareStartRow Then
aCompare = .Value
For ixCompare = 1 To UBound(aCompare, 1)
If hUnqVals.Exists(aCompare(ixCompare, lValCompareCol)) Then
For ixCol = 1 To UBound(aCompare, 2)
sTemp = sTemp & sDelim & aCompare(ixCompare, ixCol)
Next ixCol
If Not hUnqRows.Exists(sTemp) Then
hUnqRows.Add sTemp, sTemp
ixResult = ixResult + 1
For ixCol = 1 To UBound(aCompare, 2)
aResults(ixResult, ixCol) = aCompare(ixCompare, ixCol)
Next ixCol
If ixResult = lMaxResults Then OutputResults wsDest, aResults, ixResult, aHeaders
End If
sTemp = vbNullString
End If
Next ixCompare
Erase aCompare
End If
End With
.Parent.Close False
End With
sFile = Dir()
Loop
Application.StatusBar = vbNullString
If ixResult > 0 Then OutputResults wsDest, aResults, ixResult, aHeaders
wsTime.Range("A1").Value = Format((Timer - dTimer) / 86400, "hh:mm:ss")
With Application
.Calculation = xlCalculationAutomatic
.EnableEvents = True
.ScreenUpdating = True
End With
End Sub
Sub OutputResults(ByRef arg_ws As Worksheet, ByRef arg_aResults As Variant, ByRef arg_ixResult As Long, ByVal arg_aHeaders As Variant)
Static wsDest As Worksheet
Dim rDest As Range
Dim lMaxRows As Long
Dim lMaxCols As Long
If wsDest Is Nothing Then Set wsDest = arg_ws
lMaxRows = UBound(arg_aResults, 1)
lMaxCols = UBound(arg_aResults, 2)
Set rDest = wsDest.Range("A1").Resize(, lMaxCols).EntireColumn.Find("*", wsDest.Range("A1"), xlValues, xlWhole, , xlPrevious)
If rDest Is Nothing Then Set rDest = wsDest.Range("A2") Else Set rDest = wsDest.Cells(rDest.Row, "A")
If rDest.Row + 1 + arg_ixResult > wsDest.Rows.Count Then
Set wsDest = wsDest.Parent.Worksheets.Add(After:=wsDest)
With wsDest.Range("A1").Resize(, lMaxCols)
.Value = arg_aHeaders
.Font.Bold = True
End With
Set rDest = wsDest.Range("A2")
End If
rDest.Resize(arg_ixResult, lMaxCols).Value = arg_aResults
Erase arg_aResults
ReDim arg_aResults(1 To lMaxRows, 1 To lMaxCols)
End Sub
I've one workbook with 170K rows, I will delete all rows when the result between cells is 0,
For those operation, normally I use the code below, but with 170K (the rows will be deleted are 90K) the code run very slowly.
Someone know another way more performance.
Thank
Last = Cells(Rows.Count, "K").End(xlUp).Row
For i = Last To 2 Step -1
If (Cells(i, "K").Value + Cells(i, "L").Value) < 1 Then
Cells(i, "A").EntireRow.Delete
End If
Next i
As long as your fine putting the data on a new tab, the code below will do everything you need in 1.5 seconds.
Sub ExtractRows()
Dim vDataTable As Variant
Dim vNewDataTable As Variant
Dim vHeaders As Variant
Dim lastRow As Long
Dim i As Long, j As Long
Dim Counter1 As Long, Counter2 As Long
With Worksheets(1)
lastRow = .Cells(Rows.Count, "K").End(xlUp).row
vHeaders = .Range("A1:L1").Value2
vDataTable = .Range("A2:L" & lastRow).Value2
End With
For i = 1 To UBound(vDataTable)
If vDataTable(i, 11) + vDataTable(i, 12) > 0 Then
Counter1 = Counter1 + 1
End If
Next
ReDim vNewDataTable(1 To Counter1, 1 To 12)
For i = 1 To UBound(vDataTable)
If vDataTable(i, 11) + vDataTable(i, 12) > 0 Then
Counter2 = Counter2 + 1
For j = 1 To 12
vNewDataTable(Counter2, j) = vDataTable(i, j)
Next j
End If
Next
Worksheets.Add After:=Worksheets(1)
With Worksheets(2)
.Range("A1:L1") = vHeaders
.Range("A2:L" & Counter1 + 1) = vNewDataTable
End With
End Sub
Here, my approach for your problem according to rwilson's idea.
I already tested it. It very very reduce executing time. Try it.
Sub deleteRow()
Dim newSheet As Worksheet
Dim lastRow, newRow As Long
Dim sheetname As String
Dim startTime As Double
sheetname = "sheetname"
With Sheets(sheetname)
Set newSheet = ThisWorkbook.Worksheets.Add(After:=Sheets(.Name))
'Firstly copy header
newSheet.Rows(1).EntireRow.Value = .Rows(1).EntireRow.Value
lastRow = .Cells(.Rows.Count, "K").End(xlUp).row
newRow = 2
For row = 2 To lastRow Step 1
If (.Cells(row, "K").Value + .Cells(row, "L").Value) >= 1 Then
newSheet.Rows(newRow).EntireRow.Value = .Rows(row).EntireRow.Value
newRow = newRow + 1
End If
Next row
End With
Application.DisplayAlerts = False
Sheets(sheetname).Delete
Application.DisplayAlerts = True
newSheet.Name = sheetname
End Sub
Here is a non-VBA option you can try:
In column M compute the sum of columns K and L
Highlight column M and the click Find and select > Find
Type in 0 in the Find what box and also select values in the Look in box
Select Find all and in the box that shows the found items select all entires (click in the box and press CTRL + A)
On the ribbon select Delete and then Delete sheet rows
Now manually delete column M
I haven't tried this with 170k+ rows but maybe worth assessing performance versus the VBA loop.
thank at all for your ideas but the really fast code is: use an array tu populate whit the correct date and replare all table of the end sort the table:
Sub Macro13(control As IRibbonControl)
Dim avvio As Date
Dim arresto As Date
Dim tempo As Date
Application.ScreenUpdating = False
Application.Calculation = xlManual
avvio = Now()
Dim sh As Worksheet
Dim arng As Variant
Dim arrdb As Variant
Dim UR As Long, x As Long, y As Long
Dim MyCol As Integer
Set sh = Sheets("Rol_db")
MyCol = 1
sh.Select
UR = sh.Cells(Rows.Count, MyCol).End(xlUp).Row
ReDim arrdb(2 To UR, 1 To 12) As Variant
For x = 2 To UR
If Cells(x, 11) + Cells(x, 12) > 0 Then
For y = 1 To 12
arrdb(x, y) = Cells(x, y)
Next y
Else
For y = 1 To 12
arrdb(x, y) = ""
Next y
End If
Next x
sh.Range("A2:L" & UR) = arrdb
arresto = Now()
tempo = arresto - avvio
Debug.Print "Delete empty rows " & tempo
Range("A2:L" & UR).Sort key1:=Range("A2:L" & UR), _
order1:=xlAscending, Header:=xlNo
Range("A4").Select
ActiveWindow.FreezePanes = True
conclusioni:
Application.ScreenUpdating = True
Application.Calculation = xlCalculationAutomatic
End Sub
time for my sheet 170K 00:00:07.
as soon as I have a minute I feel a loop of the columns
I have 100 names in one column. And next to each name in the next cell is a numerical value that the name is worth.There are 6 positions in a company that each name could potentially hold. And that is also in a cell next to each name.
So the spreadsheet looks something like this.
John Smith Lawyer $445352
Joe Doe Doctor $525222
John Doe Accountant $123192
etc....
I want excel to give me 10 people who make a combined amount between 2 and 3 million dollars. But I require that 2 of the people be doctors 2 be lawyers and 2 be accountants etc. How would I create this?
I set up sheet 1 with the following data:
Goal:
Return 10 people
Salary between 1000000 and 6000000 range
Min 2 each doc, lawyer, accountant
Run this Macro:
Sub macro()
Dim rCell As Range
Dim rRng As Range
Dim rangelist As String
Dim entryCount As Long
Dim totalnum As Long
Set rRng = Sheet1.Range("A1:A12")
Dim OccA As String
Dim OccCntA As Long
Dim OccASalmin As Long
Dim OccASalmax As Long
Dim OccB As String
Dim OccCntB As Long
Dim OccBSalmin As Long
Dim OccBSalmax As Long
Dim OccC As String
Dim OccCntC As Long
Dim OccCSalmin As Long
Dim OccCSalmax As Long
'Set total number of results to return
totalnum = 10
'Set which occupations that must be included in results
OccA = "Accountant"
OccB = "Doctor"
OccC = "Lawyer"
'Set minimum quantity of each occupation to me returned in results
OccCntA = 2
OccCntB = 2
OccCntC = 2
'Set min and max salary ranges to return for each occupation
OccASalmin = 1000000
OccASalmax = 6000000
OccBSalmin = 1000000
OccBSalmax = 6000000
OccCSalmin = 1000000
OccCSalmax = 6000000
'Get total number of entries
entryCount = rRng.Count
'Randomly get first required occupation entries
'Return list of rows for each Occupation
OccAList = PickRandomItemsFromList(OccCntA, entryCount, OccA, OccASalmin, OccASalmax)
OccBList = PickRandomItemsFromList(OccCntB, entryCount, OccB, OccBSalmin, OccBSalmax)
OccCList = PickRandomItemsFromList(OccCntC, entryCount, OccC, OccCSalmin, OccCSalmax)
For Each i In OccAList
If rangelist = "" Then
rangelist = "A" & i
Else
rangelist = rangelist & "," & "A" & i
End If
Next i
For Each i In OccBList
If rangelist = "" Then
rangelist = "A" & i
Else
rangelist = rangelist & "," & "A" & i
End If
Next i
For Each i In OccCList
If rangelist = "" Then
rangelist = "A" & i
Else
rangelist = rangelist & "," & "A" & i
End If
Next i
'Print the rows that match criteria
Dim rCntr As Long
rCntr = 1
Dim nRng As Range
Set nRng = Range(rangelist)
For Each j In nRng
Range(j, j.Offset(0, 2)).Select
Selection.Copy
Range("E" & rCntr).Select
ActiveSheet.Paste
rCntr = rCntr + 1
Next j
'Get rest of rows randomly and print
OccList = PickRandomItemsFromListB(totalnum - rCntr + 1, entryCount, rangelist)
For Each k In OccList
Set Rng = Range("A" & k)
Range(Rng, Rng.Offset(0, 2)).Select
Selection.Copy
Range("E" & rCntr).Select
ActiveSheet.Paste
rCntr = rCntr + 1
Next k
End Sub
Function PickRandomItemsFromListB(nItemsToPick As Long, nItemsTotal As Long, avoidRng As String)
Dim rngList As Range
Dim idx() As Long
Dim varRandomItems() As Variant
Dim i As Long
Dim j As Long
Dim booIndexIsUnique As Boolean
Set rngList = Range("B1").Resize(nItemsTotal, 1)
ReDim idx(1 To nItemsToPick)
ReDim varRandomItems(1 To nItemsToPick)
For i = 1 To nItemsToPick
Do
booIndexIsUnique = True ' Innoncent until proven guilty
idx(i) = Int(nItemsTotal * Rnd + 1)
For j = 1 To i - 1
If idx(i) = idx(j) Then
' It's already there.
booIndexIsUnique = False
Exit For
End If
Next j
Set isect = Application.Intersect(Range("A" & idx(i)), Range(avoidRng))
If booIndexIsUnique = True And isect Is Nothing Then
Exit Do
End If
Loop
varRandomItems(i) = idx(i)
Next i
PickRandomItemsFromListB = varRandomItems
' varRandomItems now contains nItemsToPick unique random
' items from range rngList.
End Function
Function PickRandomItemsFromList(nItemsToPick As Long, nItemsTotal As Long, Occ As String, Salmin As Long, Salmax As Long)
Dim rngList As Range
Dim idx() As Long
Dim varRandomItems() As Variant
Dim i As Long
Dim j As Long
Dim booIndexIsUnique As Boolean
Set rngList = Range("B1").Resize(nItemsTotal, 1)
ReDim idx(1 To nItemsToPick)
ReDim varRandomItems(1 To nItemsToPick)
For i = 1 To nItemsToPick
Do
booIndexIsUnique = True ' Innoncent until proven guilty
idx(i) = Int(nItemsTotal * Rnd + 1)
For j = 1 To i - 1
If idx(i) = idx(j) Then
' It's already there.
booIndexIsUnique = False
Exit For
End If
Next j
If booIndexIsUnique = True And Range("B" & idx(i)).Value = Occ And Range("B" & idx(i)).Offset(0, 1).Value >= Salmin And Range("B" & idx(i)).Offset(0, 1).Value <= Salmax Then
Exit Do
End If
Loop
varRandomItems(i) = idx(i)
Next i
PickRandomItemsFromList = varRandomItems
End Function
Results are printed in column E with the first results meeting the criteria. After those, the rest are random but don't repeat the previous ones:
I'm not doing very much error checking such as what happens if there are not 2 doctors or not enough entries left to meet the required number of results. You'll have to fine tune it for your purposes. You'll probably also want to set up the inputs as a form so you don't have to mess with code every time you change your criteria.
I have the following VBA code within excel. It's goal is to remove a row if the given text is found, as well as remove the row directly below it. It needs to scan roughly 700k rows and is taking roughly an hour to do 100k rows. Does anyone see any optimization?
Sub RemovePageHeaders()
Application.ScreenUpdating = False
Dim objRange As Range
Set objRange = Cells.Find("HeaderText")
While objRange <> ""
objRange.Offset(1, 0).Rows(1).EntireRow.Delete
objRange.Rows(1).EntireRow.Delete
Set objRange = Cells.Find("HeaderText")
Wend
MsgBox ("I'm done removing page headers!")
End Sub
Thanks in advance!
Try the following sub. It loops from the bottomm-most row to the top, checking column 3 for "HeaderText". If that's found, it delete the row and the one below it. On a C2D E8500 with 2 gigs of RAM it takes just over a minute per 100,000 rows on a sheet with 1 million rows.
Sub RemoveHeaders()
Dim i As Long
Application.ScreenUpdating = False
Debug.Print "Started: " & Now
For i = ActiveSheet.UsedRange.Rows.Count To 1 Step -1
If ActiveSheet.Cells(i, 3) = "HeaderText" Then
ActiveSheet.Range(i & ":" & i + 1).EntireRow.Delete
End If
Next i
Application.ScreenUpdating = True
Debug.Print "Finished: " & Now
End Sub
EDIT
For a slightly ghetto but possibly much faster solution try this:
Change the constant in the below code to the number of the first column that's blank in every row. For example if your data takes up columns A-F, you want the constant to be 7 (column G).
Run the code, it will put the row number next to every entry. Should take around 30 seconds.
Sort the ENTIRE data by column C; this should take less than a minute.
Find "HeaderText" visually, select and delete all the rows.
Sort by your row-numbered column ("G" in my example).
Delete the row-numbered column (again, "G" in my example).
Sub NumberColumns()
Const BLANK_COLUMN = 7
Dim i As Long
For i = ActiveSheet.UsedRange.Rows.Count To 1 Step -1
ActiveSheet.Cells(i, BLANK_COLUMN) = i
Next i
Debug.Print "done"
End Sub
Even if it doesn't fully answer the question, it may help any reader so...
There are several tips on the web about optimizing vba. In particular, you can do:
'turn off some Excel functionality so your code runs faster
'these two are especially very efficient
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual
'use these if you really need to
Application.DisplayStatusBar = False
Application.EnableEvents = False 'is very efficient if you have ANY event associated with what your macro is going to do
'code goes here
'at the end, don't forget to restore the default behavior
'calculate the formulas
Application.Calculate
Application.ScreenUpdating = True
Application.Calculation = xlCalculationAutomatic
Application.DisplayStatusBar = True
Application.EnableEvents = True
See here for more information
Putting this entry in a little late. It should be about 2X faster than the accepted solution. I used my XP Excel 2003 computer with 1 gig to figure it out.
Sub DeleteHeaderText()
Dim bUnion As Boolean
Dim d1 As Double
Dim l As Long
Dim rDelete As Range
Dim wks As Worksheet
Dim vData As Variant
d1 = Timer
Application.EnableEvents = False
Application.ScreenUpdating = False
bUnion = False
Set wks = ActiveSheet
lEnd = ActiveSheet.UsedRange.Rows.Count
vData = wks.Range("C1:C" & lEnd).Value2
For l = 1 To lEnd
If vData(l, 1) = "HeaderText" Then
If bUnion Then
Set rDelete = Union(rDelete, wks.Range("A" & l, "A" & l + 1))
Else
Set rDelete = wks.Range("A" & l, "A" & l + 1)
bUnion = True
End If
l = l + 1
End If
Next l
Debug.Print Timer() - d1
rDelete.EntireRow.Delete
Debug.Print Timer() - d1
End Sub
I know this is late, but if I understand your problem, then you are deleting rows based on a "HeaderText" in column C. So, since i didn't look at your data, i created my own. I created 700,000 rows and every 9th row contained the "HeaderText" string. It deleted ~233k rows ("HeaderText" row + row before + row after) and ran in 2.2 seconds on my computer. Give it a try!!
Private Declare Function timeGetTime Lib "winmm.dll" () As Long
Sub DeleteHeaders()
Dim LastRow As Long
Dim I As Long
Dim WkSheet As Excel.Worksheet
Dim VArray As Variant
Dim NewArray() As String
Dim BooleanArray() As Boolean
Dim NewArrayCount As Long
Dim J As Long
Dim T As Double
Dim DeleteRowCount As Long
T = timeGetTime
With Application
.ScreenUpdating = False
.EnableEvents = False
.Calculation = xlCalculationManual
End With
Set WkSheet = ThisWorkbook.Sheets("Sheet1")
With WkSheet.UsedRange
LastRow = .Rows.Count
VArray = .Value
End With
ReDim BooleanArray(0 To UBound(VArray, 1) - 1), NewArray(UBound(VArray, 1) - 1, 0 To UBound(VArray, 2))
For I = 1 To UBound(VArray, 1)
If InStrB(1, VArray(I, 3), "HeaderText", vbBinaryCompare) <> 0 Then
BooleanArray(I - 1) = Not BooleanArray(I - 1)
BooleanArray(I) = Not BooleanArray(I)
BooleanArray(I + 1) = Not BooleanArray(I + 1)
End If
Next I
For I = LBound(BooleanArray, 1) To UBound(BooleanArray, 1)
If BooleanArray(I) = False Then
For J = LBound(VArray, 2) To UBound(VArray, 2)
NewArray(NewArrayCount, J - 1) = VArray(I + 1, J)
Next J
NewArrayCount = NewArrayCount + 1
Else
DeleteRowCount = DeleteRowCount + 1
End If
Next I
With WkSheet
.Cells.Delete
.Range("a1:c" & NewArrayCount).Value = NewArray
End With
With Application
.ScreenUpdating = True
.EnableEvents = True
.Calculation = xlCalculationAutomatic
End With
Erase NewArray, BooleanArray, VArray
MsgBox "Deleted " & DeleteRowCount & " rows." & vbNewLine & vbNewLine & _
"Run time: " & Round((timeGetTime - T) / 1000, 3) & " seconds.", vbOKOnly, "RunTime"
End Sub
Here's a solution that will run on 100k rows in about 5-20 seconds depending on how many occurances of 'HeaderText' you have. As you requested, it will delete both the row with HeaderText in the C column as well as the row directly above it.
Update:
As it's been pointed out, this works on smaller data sets up to about 100k, but on larger sets it's really doesn't. Back to the drawing board :)
Sub DeleteHeaders()
Application.ScreenUpdating = False
Dim lastRow As Long
Dim varray As Variant
lastRow = Range("C" & Rows.Count).End(xlUp).Row
On Error Resume Next
varray = Range("C1:C" & lastRow).Value
For i = UBound(varray, 1) To 1 Step -1
If varray(i, 1) = "HeaderText" Then
Range("C" & i - 1, Range("C" & i)).EntireRow.Delete
i = i - 1
End If
Next
Application.ScreenUpdating = True
End Sub
How it works:
By dumping the entire C column into a variant array and working from it within excel, you get major speed increase. The varray is laid out like (1, 1), (2, 1), (3, 1) with the first number being the row number, so all you have to do is loop through it backwards. The key is making sure to delete both rows at the same time and decrementing i by one more.
The following is code lifted from a Bill Jelen book that is fantastic for this purpose.
Use a column (column A for my code) with some logic to determine if a row should be hidden on not.
Use the following formula in all applicable cells in that column
=IF(test TRUE to hide, 1, "keep")
Now use the VBA below
Range("A1:A10000").SpecialCells(xlCellTypeFormulas, xlNumbers).EntireRow.Delete
This selects all rows with a number returned by the formula at once, which is exactly the rows you want to delete. No looping required!
Here on my blog have a scripts for this:
Sample One:
Sub DelBlankRows()
Range("D1:D" & Cells _
(Rows.Count,2).End(xlUp).Row).SpecialCells(xlCellTypeBlanks).EntireRow.Delete
End Sub
Sample two:
Sub DeleteRowsWithSpecifiedData()
'Looks in Column D and requires Column IV to be clean
Columns(4).EntireColumn.Insert
With Range("D1:D" & ActiveSheet.UsedRange.Rows.Count)
.FormulaR1C1 = "=IF(RC[1]="""","""",IF(RC[1]=""Not Needed"",NA()))"
.Value = .Value
On Error Resume Next
.SpecialCells(xlCellTypeConstants, xlErrors).EntireRow.Delete
End With
On Error GoTo 0
Columns(4).EntireColumn.Delete
End Sub